2014-04-07

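Mapper task error in Hadoop 2.2 using the multiline JSON format

When I run a Map task that parses multiline JSON data, I get the error shown below. I have all the required JARs, and the program compiles without any errors. The input is as follows: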

[ 
     { 
      "SeasonTicket": false, 
      "name": "Vinson Foreman", 
      "gender": "male", 
      "age": 50, 
      "email": "[email protected]", 
      "annualSalary": "$98,501.00", 
      "id": 0 
     }, 
     { 
      "SeasonTicket": true, 
      "name": "Genevieve Compton", 
      "gender": "female", 
      "age": 28, 
      "email": "[email protected]", 
      "annualSalary": "$46,881.00", 
      "id": 1 
     } 
] 

I am trying to count the values of the gender attribute: male or female. Please see the code below.

Mapper class:

public class DemoMapper extends Mapper<LongWritable, Text, Text, Text> {
    private Text k = new Text();
    private Text v;

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            //String token = itr.nextToken();
            k.set((itr.nextToken()));
            context.write(k, v);
        }
    }
}

Reducer class:

public class DemoReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    //@Override
    public void reduce(Text key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {

        int sum = 0;
        while ((Iterable) values.iterator() != null) {

            IntWritable value = values.iterator().next();
            sum += value.get(); // process value*/
        }

        context.write(key, new IntWritable(sum));
    }
}

Main class:

public final class ExampleJob extends Configured implements Tool { 

    public static void main(final String[] args) throws Exception { 
     int res = ToolRunner.run(new Configuration(), new ExampleJob(), args); 
     System.exit(res); 
    } 

    /** 
    * The MapReduce driver - setup and launch the job. 
    * 
    * @param args the command-line arguments 
    * @return the process exit code 
    * @throws Exception if something goes wrong 
    */ 
    public int run(final String[] args) throws Exception { 

     Configuration conf = super.getConf(); 

     // writeInput(conf, new Path(input)); 

     Job job = new Job(conf); 
     job.setJarByClass(ExampleJob.class); 
     job.setOutputKeyClass(LongWritable.class); 
     job.setOutputValueClass(Text.class); 
     job.setMapOutputKeyClass(Text.class); 
     job.setMapOutputValueClass(LongWritable.class); 

     job.setMapperClass(DemoMapper.class); 
     job.setReducerClass(DemoReducer.class); 
     job.setCombinerClass(DemoReducer.class); 
     // job.setNumReduceTasks(1); 


     Path path = new Path("result15"); 

     FileInputFormat.addInputPaths(job, "testfolder"); 
     FileOutputFormat.setOutputPath(job, path); 

     // use the JSON input format 
     job.setInputFormatClass(MultiLineJsonInputFormat.class); 

     // specify the JSON attribute name which is used to determine which 
     // JSON elements are supplied to the mapper 
     MultiLineJsonInputFormat.setInputJsonMember(job,"gender"); 

     if (job.waitForCompletion(true)) { 
      return 0; 
     } 
     return 1; 
     } 
} 

Stack trace:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". 
SLF4J: Defaulting to no-operation (NOP) logger implementation 
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. 
2014-04-06 18:30:33,662 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
2014-04-06 18:30:33,878 INFO [main] jvm.JvmMetrics (JvmMetrics.java:init(76)) - Initializing JVM Metrics with processName=JobTracker, sessionId= 
2014-04-06 18:30:34,352 WARN [main] mapreduce.JobSubmitter (JobSubmitter.java:copyAndConfigureFiles(258)) - No job jar file set. User classes may not be found. See Job or Job#setJar(String). 
2014-04-06 18:30:34,379 INFO [main] input.FileInputFormat (FileInputFormat.java:listStatus(287)) - Total input paths to process : 1 
2014-04-06 18:30:34,459 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(394)) - number of splits:1 
2014-04-06 18:30:34,482 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - user.name is deprecated. Instead, use mapreduce.job.user.name 
2014-04-06 18:30:34,484 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class 
2014-04-06 18:30:34,485 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class 
2014-04-06 18:30:34,486 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class 
2014-04-06 18:30:34,487 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class 
2014-04-06 18:30:34,487 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class 
2014-04-06 18:30:34,488 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class 
2014-04-06 18:30:34,488 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 
2014-04-06 18:30:34,489 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir 
2014-04-06 18:30:34,489 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 
2014-04-06 18:30:34,490 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class 
2014-04-06 18:30:34,495 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class 
2014-04-06 18:30:34,496 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir 
2014-04-06 18:30:34,881 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(477)) - Submitting tokens for job: job_local1580542852_0001 
2014-04-06 18:30:35,005 WARN [main] conf.Configuration (Configuration.java:loadProperty(2172)) - file:/tmp/hadoop-riak/mapred/staging/riak1580542852/.staging/job_local1580542852_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 
2014-04-06 18:30:35,006 WARN [main] conf.Configuration (Configuration.java:loadProperty(2172)) - file:/tmp/hadoop-riak/mapred/staging/riak1580542852/.staging/job_local1580542852_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 
2014-04-06 18:30:35,412 WARN [main] conf.Configuration (Configuration.java:loadProperty(2172)) - file:/tmp/hadoop-riak/mapred/local/localRunner/riak/job_local1580542852_0001/job_local1580542852_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 
2014-04-06 18:30:35,413 WARN [main] conf.Configuration (Configuration.java:loadProperty(2172)) - file:/tmp/hadoop-riak/mapred/local/localRunner/riak/job_local1580542852_0001/job_local1580542852_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 
2014-04-06 18:30:35,437 INFO [main] mapreduce.Job (Job.java:submit(1272)) - The url to track the job: http://localhost:8080/ 
2014-04-06 18:30:35,439 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1317)) - Running job: job_local1580542852_0001 
2014-04-06 18:30:35,441 INFO [Thread-12] mapred.LocalJobRunner (LocalJobRunner.java:createOutputCommitter(323)) - OutputCommitter set in config null 
2014-04-06 18:30:35,453 INFO [Thread-12] mapred.LocalJobRunner (LocalJobRunner.java:createOutputCommitter(341)) - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter 
2014-04-06 18:30:35,543 INFO [Thread-12] mapred.LocalJobRunner (LocalJobRunner.java:run(389)) - Waiting for map tasks 
2014-04-06 18:30:35,545 INFO [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner (LocalJobRunner.java:run(216)) - Starting task: attempt_local1580542852_0001_m_000000_0 
2014-04-06 18:30:35,689 INFO [LocalJobRunner Map Task Executor #0] mapred.Task (Task.java:initialize(581)) - Using ResourceCalculatorProcessTree : [ ] 
2014-04-06 18:30:35,700 INFO [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:runNewMapper(732)) - Processing split: file:/home/riak/workspace/Hadooprun/testfolder/file1.json:0+7703579 
2014-04-06 18:30:35,733 INFO [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:createSortingCollector(387)) - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 
2014-04-06 18:30:36,585 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1338)) - Job job_local1580542852_0001 running in uber mode : false 
2014-04-06 18:30:36,588 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1345)) - map 0% reduce 0% 
2014-04-06 18:30:36,593 INFO [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:setEquator(1183)) - (EQUATOR) 0 kvi 26214396(104857584) 
2014-04-06 18:30:36,593 INFO [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:init(975)) - mapreduce.task.io.sort.mb: 100 
2014-04-06 18:30:36,594 INFO [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:init(976)) - soft limit at 83886080 
2014-04-06 18:30:36,594 INFO [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:init(977)) - bufstart = 0; bufvoid = 104857600 
2014-04-06 18:30:36,594 INFO [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:init(978)) - kvstart = 26214396; length = 6553600 
2014-04-06 18:30:36,622 INFO [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:flush(1440)) - Starting flush of map output 
2014-04-06 18:30:36,649 INFO [Thread-12] mapred.LocalJobRunner (LocalJobRunner.java:run(397)) - Map task executor complete. 
2014-04-06 18:30:36,652 WARN [Thread-12] mapred.LocalJobRunner (LocalJobRunner.java:run(482)) - job_local1580542852_0001 
java.lang.Exception: java.lang.NullPointerException 
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403) 
Caused by: java.lang.NullPointerException 
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1054) 
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691) 
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89) 
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112) 
    at main.java.com.alexholmes.json.mapreduce.DemoMapper.map(DemoMapper.java:25) 
    at main.java.com.alexholmes.json.mapreduce.DemoMapper.map(DemoMapper.java:1) 
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) 
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339) 
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235) 
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
    at java.lang.Thread.run(Thread.java:744) 
2014-04-06 18:30:37,594 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1358)) - Job job_local1580542852_0001 failed with state FAILED due to: NA 
2014-04-06 18:30:37,605 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1363)) - Counters: 0 

Answers

Look at the stack trace:

Caused by: java.lang.NullPointerException 
... 
at main.java.com.alexholmes.json.mapreduce.DemoMapper.map(DemoMapper.java:25) 

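In the mapper, the member "Text v" is never initialized, yet it is written to the context, which is why the NullPointerException surfaces inside MapOutputBuffer.collect: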

private Text v ; 
... 
context.write(k, v); 

You need to initialize "v" with new Text().
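
The failure mode can be reproduced without Hadoop on the classpath. The following is a minimal plain-Java sketch, not Hadoop code: MutableText and Collector are hypothetical stand-ins for Text and the map output collector, and the assumption that the collector rejects a null value is inferred from the stack trace rather than taken from the Hadoop source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class NullValueSketch {

    /** Hypothetical stand-in for a reusable Hadoop Text object. */
    static final class MutableText {
        private String s = "";
        void set(String s) { this.s = s; }
        @Override public String toString() { return s; }
    }

    /** Hypothetical stand-in for the map output collector; assumed to reject null. */
    static final class Collector {
        final List<String> out = new ArrayList<>();
        void write(MutableText key, MutableText value) {
            Objects.requireNonNull(key, "key");
            Objects.requireNonNull(value, "value");
            out.add(key + "=" + value);
        }
    }

    // Broken: declared but never assigned, so it stays null (as in the question).
    static MutableText unassigned;

    // Fixed: allocated once at declaration and reused for every record.
    static final MutableText assigned = new MutableText();

    /** Returns true if writing the given value object throws NullPointerException. */
    static boolean writeFails(MutableText value) {
        MutableText key = new MutableText();
        key.set("male");
        try {
            new Collector().write(key, value);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("unassigned field fails: " + writeFails(unassigned));
        System.out.println("initialized field fails: " + writeFails(assigned));
    }
}
```

In the real mapper the equivalent change would be declaring the field as private Text v = new Text(); and calling v.set(...) before each context.write(k, v).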

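In addition to aasoj's answer, I would like to make one more point: the output of the mapper is supplied as the input to the reducer. In your reducer class the input key and value types are 'Text' and 'IntWritable', whereas the output key and value types of the mapper class are 'Text' and 'Text'.

Try changing the reducer's input key and value types to match the mapper's output types, so that the declarations look like the following:

public class DemoMapper extends Mapper

public class DemoReducer extends Reducer

Apart from the above, everything looks fine to me.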
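
The type contract between the two stages can be sketched in plain Java, again without Hadoop. This is an illustration under assumptions, not the asker's job: String and Integer stand in for Text and IntWritable, each record is assumed to arrive as the text of one JSON object, and the regular expression is a stand-in for a real JSON parser. All class and method names are hypothetical.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GenderCountSketch {

    // Regex standing in for a JSON parser; adequate only for flat records like the sample.
    static final Pattern GENDER = Pattern.compile("\"gender\"\\s*:\\s*\"([^\"]*)\"");

    /** Map step: emits (gender, 1) for one record, or nothing if the field is absent. */
    static List<Map.Entry<String, Integer>> map(String record) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        Matcher m = GENDER.matcher(record);
        if (m.find()) {
            out.add(new SimpleImmutableEntry<>(m.group(1), 1));
        }
        return out;
    }

    /** Reduce step: its input value type (Integer) equals the map step's output value type. */
    static int reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) { // for-each stops when the iterator is exhausted
            sum += v;
        }
        return sum;
    }

    /** Runs map, groups the pairs by key (the shuffle), then reduces each group. */
    static Map<String, Integer> run(List<String> records) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String record : records) {
            for (Map.Entry<String, Integer> kv : map(record)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            counts.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList(
                "{\"name\": \"Vinson Foreman\", \"gender\": \"male\", \"age\": 50}",
                "{\"name\": \"Genevieve Compton\", \"gender\": \"female\", \"age\": 28}");
        System.out.println(run(records)); // prints {female=1, male=1}
    }
}
```

Carried over to the job in the question, one consistent choice would be Text keys and IntWritable values on both sides, with job.setMapOutputValueClass and job.setOutputKeyClass set to the same classes; the driver shown currently declares LongWritable for both, which matches neither the mapper nor the reducer. The for-each loop also differs from the question's reducer, whose while condition compares values.iterator() with null and therefore never becomes false.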

The type parameters of the mapper and the reducer are missing from my answer above. – M06494h

public class DemoMapper extends Mapper { public class DemoReducer extends Reducer – M06494h