
Hadoop Map Reduce - reading an HDFS file - FileAlreadyExists error

I am new to Hadoop. I am trying to read an existing file on HDFS using the code below. The configuration seems fine and the file path is also correct.

public static class Map extends Mapper<LongWritable, Text, Text, Text> {

    private static Text f1, f2, hdfsfilepath;
    private static HashMap<String, ArrayList<String>> friendsData = new HashMap<>();

    public void setup(Context context) throws IOException {
     Configuration conf = context.getConfiguration();
     Path path = new Path("hdfs://cshadoop1" + conf.get("hdfsfilepath"));
     FileSystem fs = FileSystem.get(path.toUri(), conf);
     if (fs.exists(path)) {
      BufferedReader br = new BufferedReader(
       new InputStreamReader(fs.open(path)));
      // each line looks like "friend,detail1,detail2,..."
      String line = br.readLine();
      while (line != null) {
       StringTokenizer str = new StringTokenizer(line, ",");
       String friend = str.nextToken();
       ArrayList<String> friendDetails = new ArrayList<>();
       while (str.hasMoreTokens()) {
        friendDetails.add(str.nextToken());
       }
       friendsData.put(friend, friendDetails);
       line = br.readLine(); // advance to the next line so the loop terminates
      }
      br.close();
     }
    }

    public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
     for (String k : friendsData.keySet()) {
      context.write(new Text(k), new Text(friendsData.get(k).toString()));
     }
    }
}

I get the exception below when I run the code:

Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://cshadoop1/socNetData/userdata/userdata.txt already exists 
     at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146) 
     at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458) 
     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343) 

I am just trying to read an existing file. Any idea what I am missing here? Any help is appreciated.

Answer


The exception is telling you that your output directory already exists, and it must not. Either delete it or change its name.

Also, the name of your output directory, 'userdata.txt', looks like a file name, so check whether you have mixed up your input and output paths.
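For illustration, here is a minimal driver sketch; the class name FriendsDriver, the command-line argument layout, and the assumption that the Map class from the question is nested inside FriendsDriver are mine, not taken from the question. It points FileInputFormat at the existing input, FileOutputFormat at a separate output directory, and deletes that output directory first if it is left over from an earlier run, so checkOutputSpecs() does not throw FileAlreadyExistsException:

// Driver sketch -- class name, argument layout and paths are illustrative.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class FriendsDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Side file that the Mapper reads in setup(), as in the question
        conf.set("hdfsfilepath", "/socNetData/userdata/userdata.txt");

        Job job = Job.getInstance(conf, "read friends data");
        job.setJarByClass(FriendsDriver.class);
        job.setMapperClass(Map.class); // the Mapper from the question, assumed nested in this class
        job.setNumReduceTasks(0);      // map-only job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        Path input = new Path(args[0]);  // an existing input file or directory
        Path output = new Path(args[1]); // a directory that must NOT exist yet
        FileInputFormat.addInputPath(job, input);

        // Remove a stale output directory so checkOutputSpecs() does not fail
        FileSystem fs = FileSystem.get(conf);
        if (fs.exists(output)) {
            fs.delete(output, true); // recursive delete
        }
        FileOutputFormat.setOutputPath(job, output);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Deleting the output directory programmatically is convenient while developing; in production it is often safer to let the job fail fast and pick a fresh output path instead.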