2015-03-25 36 views

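Error when running a map-reduce job in R

I have just started integrating RHadoop. RStudio Server is now integrated with Hadoop, but I get an error when I run a map-reduce job with the following lines of code: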

library(rmr2) 
a <- to.dfs(seq(from=1, to=500, by=3), output="/user/hduser/num") 
b <- mapreduce(input=a, map=function(k,v){keyval(v,v*v)})

Stack trace:

15/03/24 21:13:47 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 
packageJobJar: [] [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.5.0-cdh5.2.0.jar] /tmp/streamjob4788227373090541042.jar tmpDir=null 
15/03/24 21:13:48 INFO client.RMProxy: Connecting to ResourceManager at tungsten10/192.168.0.123:8032 
15/03/24 21:13:48 INFO client.RMProxy: Connecting to ResourceManager at tungsten10/192.168.0.123:8032 
15/03/24 21:13:49 INFO mapred.FileInputFormat: Total input paths to process : 1 
15/03/24 21:13:50 INFO mapreduce.JobSubmitter: number of splits:2 
15/03/24 21:13:50 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1427104115974_0009 
15/03/24 21:13:50 INFO impl.YarnClientImpl: Submitted application application_1427104115974_0009 
15/03/24 21:13:50 INFO mapreduce.Job: The url to track the job: http://XXX.XXX.XXX.XXX:8088/proxy/application_1427104115974_0009/ 
15/03/24 21:13:50 INFO mapreduce.Job: Running job: job_1427104115974_0009 
15/03/24 21:14:02 INFO mapreduce.Job: Job job_1427104115974_0009 running in uber mode : false 
15/03/24 21:14:03 INFO mapreduce.Job: map 0% reduce 0% 
15/03/24 21:14:07 INFO mapreduce.Job: Task Id : attempt_1427104115974_0009_m_000000_0, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) 

15/03/24 21:14:08 INFO mapreduce.Job: Task Id : attempt_1427104115974_0009_m_000001_0, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) 

15/03/24 21:14:15 INFO mapreduce.Job: Task Id : attempt_1427104115974_0009_m_000001_1, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) 

15/03/24 21:14:16 INFO mapreduce.Job: Task Id : attempt_1427104115974_0009_m_000000_1, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) 

15/03/24 21:14:20 INFO mapreduce.Job: Task Id : attempt_1427104115974_0009_m_000001_2, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) 

15/03/24 21:14:21 INFO mapreduce.Job: Task Id : attempt_1427104115974_0009_m_000000_2, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) 

15/03/24 21:14:25 INFO mapreduce.Job: map 100% reduce 0% 
15/03/24 21:14:26 INFO mapreduce.Job: Job job_1427104115974_0009 failed with state FAILED due to: Task failed task_1427104115974_0009_m_000001 
Job failed as tasks failed. failedMaps:1 failedReduces:0 

15/03/24 21:14:26 INFO mapreduce.Job: Counters: 13 
    Job Counters 
     Failed map tasks=7 
     Killed map tasks=1 
     Launched map tasks=8 
     Other local map tasks=6 
     Data-local map tasks=2 
     Total time spent by all maps in occupied slots (ms)=27095 
     Total time spent by all reduces in occupied slots (ms)=0 
     Total time spent by all map tasks (ms)=27095 
     Total vcore-seconds taken by all map tasks=27095 
     Total megabyte-seconds taken by all map tasks=27745280 
    Map-Reduce Framework 
     CPU time spent (ms)=0 
     Physical memory (bytes) snapshot=0 
     Virtual memory (bytes) snapshot=0 
15/03/24 21:14:26 ERROR streaming.StreamJob: Job not Successful! 
Streaming Command Failed! 
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
    hadoop streaming failed with error code 1
15/03/24 21:14:30 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://XXX.XXX.XXX.XXX:8020/tmp/file10076f272b9a' to trash at: hdfs://XXX.XXX.XXX.XXX:8020/user/hduser/.Trash/Current

I have searched extensively for a fix but have not found one. I am new to RHadoop and stuck on this problem. Could anyone help me solve it? I would be very grateful.

Answers


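The error occurs because the HADOOP_STREAMING environment variable is not set in your code. You should set it to the full path of the streaming jar, including the jar file name. The R code below works fine for me.

R code (I am using Hadoop 2.4.0 on Ubuntu):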

# Point rhdfs and rmr2 at the hadoop binary and the streaming jar
Sys.setenv("HADOOP_CMD"="/usr/local/hadoop/bin/hadoop")
Sys.setenv("HADOOP_STREAMING"="/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.4.0.jar")

library(rJava)
library(rhdfs)
# Initialise the HDFS connection
hdfs.init()
library(rmr2)

# Write the sequence to HDFS, then emit (v, v*v) pairs in the map step
a <- to.dfs(seq(from=1, to=500, by=3), output="/user/hduser/num")
b <- mapreduce(input=a, map=function(k,v){keyval(v,v*v)})
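
If you want to confirm that both variables are set correctly before submitting a job, a small check along these lines may help. This is only a sketch in base R: check_hadoop_env is a hypothetical helper of my own, not part of rmr2 or rhdfs, and it only verifies that each variable is set and points to an existing file on the machine where R runs.

```r
# Hypothetical helper (not part of rmr2 or rhdfs): report whether the
# environment variables named above are set and point to existing files.
check_hadoop_env <- function(vars = c("HADOOP_CMD", "HADOOP_STREAMING")) {
  values <- Sys.getenv(vars, unset = NA)        # NA marks an unset variable
  is_set <- !is.na(values) & nzchar(values)     # set and non-empty
  # Only probe the file system for variables that are actually set
  on_disk <- is_set & file.exists(ifelse(is_set, values, ""))
  data.frame(variable    = vars,
             value       = unname(values),
             set         = unname(is_set),
             file_exists = unname(on_disk),
             stringsAsFactors = FALSE)
}

status <- check_hadoop_env()
print(status)
if (!all(status$file_exists)) {
  message("Fix the variables listed above before calling mapreduce().")
}
```

A row with set = FALSE means the variable is missing from the R session, and file_exists = FALSE means the path is wrong for your installation (the jar name changes with the Hadoop version).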

Hope this helps.


Thanks for your help! – 2015-03-25 10:39:34
