Error when trying to start a job in Hadoop

I have been trying to run the PageRank algorithm on Hadoop, and I am having some trouble initializing the job.

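When I try to initialize the job using the Job class, I get the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
    at org.apache.hadoop.mapreduce.Job(Job.java:89)
    at Pagerank.main(Pagerank.java:244)

Here is the code: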

Job job; 
job = new Job(); 
job.setJarByClass(Pagerank.class);  // In what class are our map/reduce functions for this job found? 
job.setMapperClass(PRMap.class);  // What is our map function for this job? 
job.setReducerClass(PRReduce.class); // What is our reduce function for this job? 

job.setOutputKeyClass(Text.class);    // What are the (hadoop.io compliant) datatype for our 
job.setOutputValueClass(Text.class);   // reducer output's key-value pairs? 
job.setInputFormatClass(TextInputFormat.class);  // How will the mapper distinguish (key value) record inputs? 
FileInputFormat.addInputPath(job, new Path(args[0])); // First command line argument 
FileOutputFormat.setOutputPath(job, new Path("temp0")); 
job.waitForCompletion(true);

When I try to initialize it with the JobConf class instead, I get an error on some of the methods I call.

Here is the code:

    JobConf conf = new JobConf(Pagerank.class); 
    conf.setJobName("pagerank"); 

    conf.setOutputKeyClass(Text.class); 
    conf.setOutputValueClass(Text.class); 

    conf.setMapperClass(PRMap.class); 
    conf.setReducerClass(PRReduce.class); 

    conf.setInputFormat(TextInputFormat.class); 
    conf.setOutputFormat(TextOutputFormat.class); 

    FileInputFormat.setInputPaths(conf, new Path(args[0])); 
    FileOutputFormat.setOutputPath(conf, new Path(args[1])); 

    JobClient.runJob(conf);

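The error reads:

method setMapperClass in class JobConf cannot be applied to given types;
  required: Class<? extends Mapper>
  found: Class<PRMap>
  reason: actual argument Class<PRMap> cannot be converted to Class<? extends Mapper> by method invocation conversion

It seems I cannot pass PRMap.class as the argument to setMapperClass, even though the PRMap class I wrote follows Hadoop's standard for a map function: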

public static class PRMap extends Mapper<LongWritable, Text, Text, Text> 
{ ... }

Any suggestions for either approach?

2013-02-08 user2052763

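Try placing the jar that contains org.apache.commons.logging.LogFactory in the lib directory of the Hadoop home on every machine, then restart the cluster.

Alternatively, you can add the jar from the command line with the -libjars option. This is a generic option, so it takes effect only if the driver parses generic options (for example by running through ToolRunner). For example: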

hadoop jar myjar.jar package.classname -libjars mypath/commons-logging.jar

2013-02-08 04:22:02

Thanks! That solved this particular problem. – user2052763

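It looks like the PRMap class extends org.apache.hadoop.mapreduce.Mapper (http://hadoop.apache.org/docs/mapreduce/current/api/org/apache/hadoop/mapreduce/Mapper.html), whereas the class passed to JobConf must implement org.apache.hadoop.mapred.Mapper, the interface from the older API. The two types share a simple name but are unrelated, which is why the compiler rejects PRMap.class.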
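The mismatch can be reproduced without Hadoop. The following is only a sketch: OldApi.Mapper and NewApi.Mapper are hypothetical stand-ins for org.apache.hadoop.mapred.Mapper and org.apache.hadoop.mapreduce.Mapper, and OldApi.setMapperClass merely imitates the shape of JobConf.setMapperClass. It shows that a class extending the "new" Mapper is not a subtype of the "old" one, so it does not fit a Class<? extends Mapper> parameter declared against the old type.

```java
// Stand-in types only: OldApi.Mapper and NewApi.Mapper are hypothetical
// placeholders for org.apache.hadoop.mapred.Mapper and
// org.apache.hadoop.mapreduce.Mapper. No Hadoop classes are used here.
public class MapperApiMismatch {

    public static class OldApi {
        public interface Mapper {}

        // Imitates the shape of JobConf.setMapperClass: accepts old-API mappers only.
        public static String setMapperClass(Class<? extends Mapper> cls) {
            return cls.getSimpleName();
        }
    }

    public static class NewApi {
        public static class Mapper {}
    }

    // Like the PRMap in the question: extends the new-API Mapper.
    public static class PRMap extends NewApi.Mapper {}

    // A mapper written against the old API instead.
    public static class OldStylePRMap implements OldApi.Mapper {}

    // True if cls is a subtype of the old-API Mapper, i.e. could be passed
    // to OldApi.setMapperClass.
    public static boolean acceptedByOldApi(Class<?> cls) {
        return OldApi.Mapper.class.isAssignableFrom(cls);
    }

    public static void main(String[] args) {
        // OldApi.setMapperClass(PRMap.class); // does not compile, as in the question
        System.out.println(OldApi.setMapperClass(OldStylePRMap.class));
        System.out.println("PRMap accepted: " + acceptedByOldApi(PRMap.class));
        System.out.println("OldStylePRMap accepted: " + acceptedByOldApi(OldStylePRMap.class));
    }
}
```

In the real code this suggests two ways out: keep PRMap as it is and configure the job through the Job class, or rewrite PRMap against the org.apache.hadoop.mapred API and keep JobConf. Mixing the two APIs in one job is what triggers the error.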

To resolve the java.lang.NoClassDefFoundError, add commons-logging-x.x.x.jar to your classpath.

Run hadoop classpath to confirm that the jar appears in the output.
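The same check can be made from inside the JVM that runs the driver. This is a minimal sketch, not part of Hadoop: ClasspathCheck and isLoadable are hypothetical names, and the code uses only the standard Class.forName call to ask whether a class is visible to the current class loader.

```java
public class ClasspathCheck {

    // Returns true if the named class can be located by this class's class loader.
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.commons.logging.LogFactory";
        System.out.println(name + " loadable: " + isLoadable(name));
        // The effective classpath of this JVM, for comparison with the hadoop classpath output.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If this reports false when launched the same way as the Pagerank driver, the jar is missing from the client-side classpath, which is where the exception in the question was thrown.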

2013-02-08 04:26:35

Add this line to your main method:

DistributedCache.addFileToClassPath(new Path("<Absolute Path>/commons-logging.jar"), conf);

2013-02-08 05:52:15 shazin

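This happens because the mapper cannot find LogFactory, which is part of the commons-logging jar. To fix it, the jar must be accessible to every client and mapper, either by copying it to all machines or, more efficiently, by adding it to the distributed cache.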

$ bin/hadoop fs -copyFromLocal mylib.jar /myapp/mylib.jar
Then access it from your code:
DistributedCache.addFileToClassPath(new Path("/myapp/mylib.jar"), job);

More details can be found here.

2013-02-08 09:17:47 twid
