2014-04-26 21 views

"Pivoting" a table using Hadoop

(Disclaimer: I am very new to Hadoop and Java.)

As input, I have a table with a simple key-value structure:

key1 value1 
key2 value2 
key3 value3 
key2 value4 
key1 value5 
key1 value6 

As output, I would like to collect, for each key, all the values that belong to it, like this:

key1, value1 value5 value6 
key2, value2 value4 
key3, value3 
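
To make the target transformation concrete before Hadoop is involved, here is a minimal plain-Java sketch of the same grouping done in memory. The names `PivotSketch` and `pivot` are hypothetical, introduced only for illustration, and the sketch assumes tab-separated `key<TAB>value` lines. It is not how Hadoop performs the shuffle, only a description of the result the job should produce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the original post: groups values by key in memory.
final class PivotSketch {

    // Assumes each line is "key<TAB>value"; lines missing either field are skipped.
    static Map<String, List<String>> pivot(List<String> lines) {
        // TreeMap keeps keys sorted, similar to how reducer input arrives sorted by key.
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] fields = line.split("\t", -1);
            if (fields.length < 2 || fields[0].isEmpty() || fields[1].isEmpty()) {
                continue;
            }
            grouped.computeIfAbsent(fields[0], k -> new ArrayList<>()).add(fields[1]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "key1\tvalue1", "key2\tvalue2", "key3\tvalue3",
                "key2\tvalue4", "key1\tvalue5", "key1\tvalue6");
        // Print one line per key in the "key, v1 v2 ..." layout shown above.
        for (Map.Entry<String, List<String>> e : pivot(input).entrySet()) {
            System.out.println(e.getKey() + ", " + String.join(" ", e.getValue()));
        }
    }
}
```

Note that in an actual MapReduce job the order of values within a key is not guaranteed, whereas this in-memory sketch preserves input order.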

Here is my mapper:

public class WordMapper extends Mapper<Object, Text, Text, Text> {

    @Override
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {

        String[] fields = value.toString().split("\\t", -1);
        for (int i = 0; i < fields.length; ++i) {
            if ("".equals(fields[i])) fields[i] = null;
        }
        List<String> fields_list = Arrays.asList(fields);
        Text textKey = new Text(fields_list.get(0));
        Text textValue = new Text(fields_list.get(1));
        context.write(textKey, textValue);
    }
}

Here is the reducer:

public class SumReducer extends Reducer<Text, TextArrayWritable, Text, TextArrayWritable> {
    private TextArrayWritable valuesTotal = new TextArrayWritable();

    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        ArrayList<Text> values_list = new ArrayList<Text>();

        for (Text value : values) {
            values_list.add(value);
        }
        Text[] values_arr = new Text[values_list.size()];
        values_arr = values_list.toArray(values_arr);

        valuesTotal.setFields(values_arr);
        context.write(key, valuesTotal);
    }
}

For some reason, I cannot get any output from my program. It simply terminates, and nothing appears in the output folder. What am I doing wrong?

(I am using Hadoop 2.2.0 and Eclipse with the Hadoop plugin. The word count example, for instance, runs without problems.)

What does your TextArrayWritable class look like? – Willmore

I solved the problem and got rid of the class, since it is not really needed here. – Timofey

Answer

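The problem is solved. After I enabled logging, it became clear that my data contained rows with a missing value in column 4 (fields[4]), so I added a null check, if (fields[4] != null), and it now works fine. I also got rid of the array-to-list conversion and the custom TextArrayWritable class.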

Mapper:

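import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Class declaration taken from the question; the original snippet omitted it.
public class WordMapper extends Mapper<Object, Text, Text, Text> {

    @Override
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {

        // The -1 limit keeps trailing empty columns, so column indices stay stable.
        String[] fields = value.toString().split("\\t", -1);
        for (int i = 0; i < fields.length; ++i) {
            if ("".equals(fields[i])) fields[i] = null;
        }
        // Skip rows where fields[4] is missing. The length check is an added
        // guard for rows that have fewer than five columns.
        if (fields.length > 4 && fields[4] != null) {
            System.out.println(fields[0]); // debug output
            System.out.println(fields[4]); // debug output
            context.write(new Text(fields[0]), new Text(fields[4]));
        }
    }
}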
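
Reading fields[4] safely involves two cases: the column being empty, and the row having fewer than five columns, where indexing would throw an ArrayIndexOutOfBoundsException. A minimal plain-Java sketch that handles both is shown here, assuming tab-separated input; `FieldSketch` and `fieldOrNull` are hypothetical names that do not appear in the original code.

```java
// Hypothetical helper, not from the original post.
final class FieldSketch {

    // Returns the column at `index`, or null if the row is too short or the column is empty.
    static String fieldOrNull(String line, int index) {
        // The -1 limit keeps trailing empty columns, so "a\t\t" yields three fields.
        String[] fields = line.split("\t", -1);
        if (index < 0 || index >= fields.length || fields[index].isEmpty()) {
            return null;
        }
        return fields[index];
    }

    public static void main(String[] args) {
        System.out.println(fieldOrNull("a\tb\tc\td\te", 4)); // full row
        System.out.println(fieldOrNull("a\tb\tc\td\t", 4));  // empty fifth column
        System.out.println(fieldOrNull("a\tb", 4));          // short row
    }
}
```

A mapper could call such a helper for both the key and the value column and emit a pair only when neither result is null.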

Reducer:

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SongsReducer extends Reducer<Text, Text, Text, Text> {

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        boolean first = true;
        StringBuilder songs = new StringBuilder();
        // Join all values for this key into one comma-separated string.
        for (Text val : values) {
            if (!first) {
                songs.append(",");
            }
            first = false;
            songs.append(val.toString());
        }

        context.write(key, new Text(songs.toString()));
    }
}
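
The `first` flag in the reducer is a hand-rolled separator join. As a sketch of the same idea with the standard library, `java.util.StringJoiner` (Java 8 or later, which is an assumption about the runtime) takes care of separator placement. `JoinSketch` and `joinValues` are hypothetical names, and plain strings stand in for Hadoop `Text` values so the example runs without Hadoop on the classpath.

```java
import java.util.Arrays;
import java.util.StringJoiner;

// Hypothetical helper, not from the original post.
final class JoinSketch {

    // Joins the string form of each value with commas; an empty input gives "".
    static String joinValues(Iterable<?> values) {
        StringJoiner joiner = new StringJoiner(",");
        for (Object value : values) {
            // toString() mirrors val.toString() on a Text value in the reducer.
            joiner.add(value.toString());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinValues(Arrays.asList("value1", "value5", "value6")));
    }
}
```

Inside a reducer, the loop body would stay the same and the result would be wrapped in a new `Text` before being written to the context.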