2014-01-30 8 views

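Can't use CompositeInputFormat in Hadoop: throws exception "Expression is null"

I am using MRv1 from CDH4 (4.5) and have run into a problem with CompositeInputFormat. It fails no matter how many inputs I try to join. For simplicity, the example here has only one input: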

Configuration conf = new Configuration(); 

    Job job = new Job(conf, "Blah"); 
    job.setJarByClass(Blah.class); 

    job.setMapperClass(Blah.BlahMapper.class); 
    job.setReducerClass(Blah.BlahReducer.class); 

    job.setMapOutputKeyClass(LongWritable.class); 
    job.setMapOutputValueClass(BlahElement.class); 
    job.setOutputKeyClass(LongWritable.class); 
    job.setOutputValueClass(BlahElement.class); 

    job.setInputFormatClass(CompositeInputFormat.class); 
    String joinStatement = CompositeInputFormat.compose("inner", SequenceFileInputFormat.class, "/someinput"); 
    System.out.println(joinStatement); 
    conf.set("mapred.join.expr", joinStatement); 
    job.setOutputFormatClass(SequenceFileOutputFormat.class); 

    FileOutputFormat.setOutputPath(job, new Path(newoutput)); 

    return job.waitForCompletion(true) ? 0 : 1; 

Here is the output and the stack trace:

SLF4J: Class path contains multiple SLF4J bindings. 
SLF4J: Found binding in [jar:file:/hadoop2/share/hadoop/mapreduce1/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
SLF4J: Found binding in [jar:file:/hadoop2/share/hadoop/common/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
14/01/31 03:27:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
inner(tbl(org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat,"/someinput")) 
14/01/31 03:27:48 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 
14/01/31 03:27:51 INFO mapred.JobClient: Cleaning up the staging area hdfs://archangel-desktop:54310/tmp/hadoop/mapred/staging/hadoop/.staging/job_201401302213_0013 
14/01/31 03:27:51 ERROR security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException: Expression is null 
Exception in thread "main" java.io.IOException: Expression is null 
    at org.apache.hadoop.mapreduce.lib.join.Parser.parse(Parser.java:542) 
    at org.apache.hadoop.mapreduce.lib.join.CompositeInputFormat.setFormat(CompositeInputFormat.java:85) 
    at org.apache.hadoop.mapreduce.lib.join.CompositeInputFormat.getSplits(CompositeInputFormat.java:127) 
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1079) 
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1096) 
    at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:177) 
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:995) 
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:948) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) 
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:948) 
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) 
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) 
    at com.nileshc.graphfu.pagerank.BlockMatVec.run(BlockMatVec.java:79) 
    at com.nileshc.graphfu.Main.main(Main.java:21) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208) 

Has anyone run into this before? Any ideas on how to fix it?

Answers

0

My bad.

conf.set("mapred.join.expr", joinStatement); 

The line above should be:

job.getConfiguration().set("mapreduce.join.expr", joinStatement); 

And:

String joinStatement = CompositeInputFormat.compose("inner", SequenceFileInputFormat.class, "/someinput"); 

^^ This should be:

String joinStatement = CompositeInputFormat.compose("inner", SequenceFileInputFormat.class, new Path("/someinput")); 

The first change is what made all the difference.
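As a rough illustration of why the property name can matter, here is a toy model in plain Java. It does not use Hadoop at all: `readJoinExpr` is a hypothetical stand-in for the lookup that happens when the splits are computed, and it assumes the new-API CompositeInputFormat reads the expression from mapreduce.join.expr, as the corrected line suggests. It also ignores any deprecated-key translation a particular Hadoop build may or may not perform, so treat it as a sketch of the failure mode rather than a description of Hadoop's internals.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class JoinExprKeyDemo {
    // Key used in the corrected line; assumed to be the one the input format reads.
    public static final String NEW_KEY = "mapreduce.join.expr";
    // Key used in the original, failing line.
    public static final String OLD_KEY = "mapred.join.expr";

    // Hypothetical stand-in for the split-time lookup: a missing value
    // surfaces as the same message seen in the stack trace.
    public static String readJoinExpr(Map<String, String> conf) throws IOException {
        String expr = conf.get(NEW_KEY);
        if (expr == null) {
            throw new IOException("Expression is null");
        }
        return expr;
    }

    public static void main(String[] args) throws IOException {
        String joinStatement =
            "inner(tbl(org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat,\"/someinput\"))";
        Map<String, String> conf = new HashMap<>();

        // Value stored under the old name only: the lookup finds nothing.
        conf.put(OLD_KEY, joinStatement);
        try {
            readJoinExpr(conf);
        } catch (IOException e) {
            System.out.println("old key only: " + e.getMessage());
        }

        // Value stored under the name that is actually read: the lookup succeeds.
        conf.put(NEW_KEY, joinStatement);
        System.out.println("new key: " + readJoinExpr(conf));
    }
}
```

In this model the expression is only found once it is stored under the same name the reader uses, which matches the symptom in the question.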

0

In the code above,

conf.set("mapred.join.expr", joinStatement); 

this line is written after the Job object has been created, so the Job object clearly never sees this setting!

See the modified code below:

Configuration conf = new Configuration(); 
conf.set("mapred.join.expr", joinStatement); 
Job job = new Job(conf, "Blah"); 
job.setJarByClass(Blah.class); 
. 
. 
. 
. 
. 

Here is another way to do it:

job.getConfiguration().set("mapreduce.join.expr", joinStatement); 

Use the code above instead of:

conf.set("mapred.join.expr", joinStatement); 
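The ordering argument can be sketched with a small plain-Java model. `ToyJob` is a hypothetical stand-in for `Job`, and `java.util.Properties` stands in for `Configuration`; the one assumption, taken from the explanation above, is that the job keeps its own copy of the settings made at construction time. The key and the expression value are placeholders.

```java
import java.util.Properties;

public class ConfCopyDemo {
    public static final String KEY = "mapreduce.join.expr";
    public static final String EXPR = "join-expression-placeholder";

    /** Hypothetical stand-in for Job: copies the settings when constructed. */
    static final class ToyJob {
        private final Properties conf = new Properties();

        ToyJob(Properties original) {
            conf.putAll(original); // snapshot; later changes to 'original' are not seen
        }

        Properties getConfiguration() {
            return conf;
        }
    }

    // Set on the original object after the job exists: the job's copy stays empty.
    public static String lateSetOnOriginal() {
        Properties conf = new Properties();
        ToyJob job = new ToyJob(conf);
        conf.setProperty(KEY, EXPR);
        return job.getConfiguration().getProperty(KEY);
    }

    // Set before the job is created: the value is included in the copy.
    public static String setBeforeConstruct() {
        Properties conf = new Properties();
        conf.setProperty(KEY, EXPR);
        ToyJob job = new ToyJob(conf);
        return job.getConfiguration().getProperty(KEY);
    }

    // Set directly on the job's own configuration: order no longer matters.
    public static String setOnJobConfiguration() {
        ToyJob job = new ToyJob(new Properties());
        job.getConfiguration().setProperty(KEY, EXPR);
        return job.getConfiguration().getProperty(KEY);
    }

    public static void main(String[] args) {
        System.out.println("late set on original:     " + lateSetOnOriginal());
        System.out.println("set before construction:  " + setBeforeConstruct());
        System.out.println("set on job configuration: " + setOnJobConfiguration());
    }
}
```

Under that assumption, only the first variant leaves the job without the expression, which is the situation in the question's code.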
Exactly. See the answer I posted above. – Nilesh