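FileNotFoundException when trying to store a file in the Hadoop distributed cache
I am trying to store a local file in the distributed cache. The file exists, but I get a FileNotFoundException.
Code snippet: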
DistributedCache.addCacheFile(new URI("file://"+fileName), conf);
RunningJob job = JobClient.runJob(conf);
Exception:
Error initializing attempt_201310150245_0066_m_000021_0:
java.io.FileNotFoundException: File /Workflow/data does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:468)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:380)
at org.apache.hadoop.filecache.TaskDistributedCacheManager.setupCache(TaskDistributedCacheManager.java:180)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1454)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1445)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1360)
at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2786)
Any ideas?
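Is the file actually at '/Workflow/data', or is it at '/somepath/Workflow/data'? – cabad
Could it be that the URI needs to be an hdfs one? –
@Ophir, I am facing the same problem. I have confirmed that the file exists in HDFS, but I still get this error. How did you solve your problem? – Shekhar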
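Following up on the hdfs suggestion: the stack trace shows RawLocalFileSystem.getFileStatus being called from the TaskTracker, which suggests the path is being looked up on the task node's own local disk rather than on the machine that submitted the job. The sketch below is only a diagnostic under that assumption. It uses plain java.net.URI (no Hadoop classes) to show what scheme and path the URI from the question carries. CacheUriCheck and the "namenode:8020" address are hypothetical names made up for illustration.

```java
import java.net.URI;

// Hypothetical helper, not part of Hadoop: inspects a URI of the kind
// passed to DistributedCache.addCacheFile in the question.
class CacheUriCheck {

    // Returns the URI scheme, e.g. "file" or "hdfs" (null if none is given).
    static String scheme(String uri) {
        return URI.create(uri).getScheme();
    }

    // Returns the path component of the URI.
    static String path(String uri) {
        return URI.create(uri).getPath();
    }

    // True when the URI explicitly names the local filesystem.
    static boolean isLocalFileUri(String uri) {
        return "file".equals(scheme(uri));
    }

    public static void main(String[] args) {
        String fileName = "/Workflow/data"; // path taken from the stack trace

        // URI built the same way as in the question.
        String local = "file://" + fileName;

        // Assumed alternative: the same path on a shared HDFS instance.
        // "namenode:8020" is a placeholder, not a value from the question.
        String shared = "hdfs://namenode:8020" + fileName;

        for (String uri : new String[] {local, shared}) {
            System.out.println(uri
                    + " scheme=" + scheme(uri)
                    + " path=" + path(uri)
                    + " localFile=" + isLocalFileUri(uri));
        }
    }
}
```

If the URI does turn out to have the file scheme, one commonly used approach is to copy the file into HDFS first (for example with FileSystem.copyFromLocalFile) and pass the resulting hdfs URI to DistributedCache.addCacheFile. That would not by itself explain the case where the file is already confirmed to be in HDFS.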