Hadoop reduce function does not work

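I am learning Hadoop and wrote a simple program in Java. The program should count words and create a file listing each word with its number of occurrences. Instead, it creates a file that contains every word occurrence, each with a "1" next to it. The output looks like this: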

RMD 1
RMD 1
RMD 1
RMD 1
rmdaxsxgb 1

But I want:

RMD 4
rmdaxsxgb 1

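As far as I can tell, only the map function is working. (I tried commenting out the reduce function and got the same result.)

My code (it is a typical MapReduce example that is easy to find on the internet or in books about Hadoop):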

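import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount { 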

public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> { 
    private final static IntWritable one = new IntWritable(1); 
    private Text word = new Text(); 

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException { 
     String line = value.toString(); 
     StringTokenizer tokenizer = new StringTokenizer(line); 
     while (tokenizer.hasMoreTokens()) { 
      word.set(tokenizer.nextToken()); 
      context.write(word, one); 
     } 
    } 
} 

public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> { 

    public void reduce(Text key, Iterator<IntWritable> values, Context context) 
     throws IOException, InterruptedException { 
     int sum = 0; 
     while (values.hasNext()) { 
      sum += values.next().get(); 
     } 
     context.write(key, new IntWritable(sum)); 
    } 
} 


public static void main(String[] args) throws Exception { 
     Configuration conf = new Configuration(); 

     Job job = new Job(conf, "wordcount"); 
     job.setJarByClass(WordCount.class); 

     job.setOutputKeyClass(Text.class); 
     job.setOutputValueClass(IntWritable.class); 

     job.setMapperClass(Map.class); 
     job.setReducerClass(Reduce.class); 

     job.setInputFormatClass(TextInputFormat.class); 
     job.setOutputFormatClass(TextOutputFormat.class); 

     FileInputFormat.addInputPath(job, new Path(args[0])); 
     FileOutputFormat.setOutputPath(job, new Path(args[1])); 

     job.waitForCompletion(true); 
    }
}

I am running Hadoop on Amazon Web Services and don't understand why it isn't working correctly.

2015-05-01 Ales

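This is probably caused by mixing the two APIs. Hadoop has two APIs: the old one is mapred and the newer one is mapreduce.

In the new API, the reducer receives its values as an Iterable, whereas your code uses an Iterator (as in the old API). Because the parameter types differ, your reduce method does not override Reducer.reduce, so Hadoop runs the default reduce, which writes every (key, value) pair through unchanged.

Try changing the signature of reduce to take Iterable<IntWritable> values and summing with a for-each loop instead of hasNext()/next().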
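To illustrate why the Iterator version is silently ignored, here is a minimal sketch that runs without Hadoop. `BaseReducer`, `IteratorReduce`, `IterableReduce`, and `run` are hypothetical stand-ins invented for this example, not Hadoop classes; `BaseReducer` only imitates the fact that the default `Reducer.reduce` passes every pair through unchanged. In the real job the types would be `Text`, `IntWritable`, and `Context`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins, not Hadoop classes: they only imitate how the
// framework dispatches to reduce().
class ReduceSignatureDemo {

    public static class BaseReducer {
        // Default behaviour, like Hadoop's Reducer: identity, one line per value.
        protected void reduce(String key, Iterable<Integer> values, List<String> out) {
            for (int v : values) {
                out.add(key + " " + v);
            }
        }

        // The "framework" side: it always calls the Iterable version.
        public final List<String> run(String key, List<Integer> values) {
            List<String> out = new ArrayList<>();
            reduce(key, values, out);
            return out;
        }
    }

    // Mirrors the question: an Iterator parameter creates a new overload,
    // not an override, so run() never calls it.
    public static class IteratorReduce extends BaseReducer {
        public void reduce(String key, Iterator<Integer> values, List<String> out) {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next();
            }
            out.add(key + " " + sum);
        }
    }

    // Mirrors the fix: an Iterable parameter overrides the default.
    public static class IterableReduce extends BaseReducer {
        @Override
        public void reduce(String key, Iterable<Integer> values, List<String> out) {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            out.add(key + " " + sum);
        }
    }

    public static void main(String[] args) {
        List<Integer> ones = Arrays.asList(1, 1, 1, 1);
        System.out.println(new IteratorReduce().run("RMD", ones)); // [RMD 1, RMD 1, RMD 1, RMD 1]
        System.out.println(new IterableReduce().run("RMD", ones)); // [RMD 4]
    }
}
```

Adding `@Override` to the Iterator version would turn this mistake into a compile error, which is a cheap way to catch it in the real reducer as well.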

2015-05-01 13:56:26

Thanks, I tried it and it helped, but it should be 'Iterable<IntWritable> values'; you had a typo. – Ales

@Ales: Thanks, edited. –

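It looks like no reducer is running on your Hadoop cluster. You can set the number of reducers in three ways. You can set it in your mapred-site.xml by adding a property like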

<property> 
<name>mapred.reduce.tasks</name> 
<value>1</value> 
</property>

or on the command line, with an option like

-D mapred.reduce.tasks=1

or by setting it in your main class:

job.setNumReduceTasks(1);

To set it permanently for all jobs, set the property in your mapred-site.xml.

2015-05-01 13:11:26 salmanbw
