
Hadoop ClassNotFoundException related to MapClass

I've seen many questions related to ClassNotFoundExceptions, "No job jar file set", and Hadoop. Most of them point to a missing setJarByClass call in the configuration (with either JobConf or Job). I'm a bit confused because I'm hitting this exception anyway. Here's everything I think is relevant (please let me know if I've omitted anything):

echo $CLASS_PATH 
/root/javajars/mysql-connector-java-5.1.22/mysql-connector-java-5.1.22-bin.jar:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u5.jar:. 

Code (mostly elided):

import org.apache.hadoop.mapreduce.Job; 
import org.apache.hadoop.mapreduce.Mapper; 
import org.apache.hadoop.mapreduce.Reducer; 
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat; 
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat; 
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat; 
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat; 
import org.apache.hadoop.fs.Path; 
import org.apache.hadoop.conf.Configuration; 
import org.apache.hadoop.conf.Configured; 
import org.apache.hadoop.util.ToolRunner; 
import org.apache.hadoop.util.Tool; 
import org.apache.hadoop.util.GenericOptionsParser; 
import org.apache.hadoop.io.LongWritable; 
import org.apache.hadoop.io.Text; 
import org.apache.hadoop.io.IntWritable; 

import java.io.IOException; 
import java.util.Iterator; 
import java.lang.System; 
import java.net.URL; 

import java.sql.Connection; 
import java.sql.DriverManager; 
import java.sql.SQLException; 
import java.sql.Statement; 
import java.sql.ResultSet; 

public class QueryTable extends Configured implements Tool { 

    public static class MapClass extends Mapper<Object, Text, Text, IntWritable>{ 

    public void map(Object key, Text value, Context context) 
      throws IOException, InterruptedException { 
      ... 
     } 
    } 

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable>{ 
     private IntWritable result = new IntWritable(); 

     public void reduce (Text key, Iterable<IntWritable> values, 
          Context context) throws IOException, InterruptedException { 
      ... 
     } 
    } 

    public int run(String[] args) throws Exception { 
     //Configuration conf = getConf();                                                          
     Configuration conf = new Configuration(); 

     Job job = new Job(conf, "QueryTable"); 
     job.setJarByClass(QueryTable.class); 

     Path in = new Path(args[0]); 
     Path out = new Path(args[1]); 
     FileInputFormat.setInputPaths(job, in); 
     //FileInputFormat.addInputPath(job, in);                                                         
     FileOutputFormat.setOutputPath(job, out); 

     job.setMapperClass(MapClass.class); 
     job.setCombinerClass(Reduce.class); // new                                                        
     job.setReducerClass(Reduce.class); 

     job.setInputFormatClass(TextInputFormat.class); 
     job.setOutputFormatClass(TextOutputFormat.class); 
     job.setOutputKeyClass(Text.class); 
     job.setOutputValueClass(Text.class); 

     System.exit(job.waitForCompletion(true)?0:1); 
     return 0; 
    } 

    public static void main(String[] args) throws Exception { 
     int res = ToolRunner.run(new Configuration(), new QueryTable(), args); 
     System.exit(res); 
    } 
} 

I then compile, create the jar, and run:

javac QueryTable.java -d QueryTable 
jar -cvf QueryTable.jar -C QueryTable/ . 
hadoop jar QueryTable.jar QueryTable input output 

Here is the exception:

13/01/14 17:09:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 
**13/01/14 17:09:30 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).** 
13/01/14 17:09:30 INFO input.FileInputFormat: Total input paths to process : 1 
13/01/14 17:09:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
13/01/14 17:09:30 WARN snappy.LoadSnappy: Snappy native library not loaded 
13/01/14 17:09:31 INFO mapred.JobClient: Running job: job_201301081120_0045 
13/01/14 17:09:33 INFO mapred.JobClient: map 0% reduce 0% 
13/01/14 17:09:39 INFO mapred.JobClient: Task Id : attempt_201301081120_0045_m_000000_0, Status : FAILED 
java.lang.RuntimeException: java.lang.ClassNotFoundException: QueryTable$MapClass 
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1004) 
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:217) 
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:602) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323) 
    at org.apache.hadoop.mapred.Child$4.run(Child.java:266) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278) 
    at org.apache.hadoop.mapred.Child.main(Child.java:260) 
Caused by: java.lang.ClassNotFoundException: QueryTable$MapClass 
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366) 
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354) 
    at java.lang.ClassLoader.loadCl 

Sorry for the huge wall of text. I don't understand why I'm getting the warning about no job jar file set; I set it in my run method. Also, the warning is issued by JobClient, while in my code I'm using Job rather than JobClient. If you have any ideas or feedback, I'd be interested. Thanks for your time!
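For what it's worth, the first warning ("Use GenericOptionsParser for parsing the arguments...") seems to point at the commented-out getConf() line: a driver run through ToolRunner is apparently expected to reuse the Configuration that ToolRunner has already parsed rather than create a fresh one. Below is a rough sketch of run() written that way, with a getJar() sanity check added (this assumes Job#getJar() is available in this Hadoop version); it is a sketch, not a confirmed fix for the ClassNotFoundException:

    public int run(String[] args) throws Exception { 
        // Reuse the Configuration that ToolRunner/GenericOptionsParser populated 
        // instead of building a new one. 
        Configuration conf = getConf(); 

        Job job = new Job(conf, "QueryTable"); 
        job.setJarByClass(QueryTable.class); 
        // If setJarByClass could not locate the jar, this prints null and the 
        // "No job jar file set" warning is expected to follow. 
        System.out.println("job jar: " + job.getJar()); 

        FileInputFormat.setInputPaths(job, new Path(args[0])); 
        FileOutputFormat.setOutputPath(job, new Path(args[1])); 

        job.setMapperClass(MapClass.class); 
        job.setCombinerClass(Reduce.class); 
        job.setReducerClass(Reduce.class); 
        // ... input/output format and key/value classes as in the original run() ... 

        // Return the status instead of calling System.exit() here, so the caller 
        // of ToolRunner.run() decides the exit code. 
        return job.waitForCompletion(true) ? 0 : 1; 
    } 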

EDIT

Contents of the jar:

jar -tvf QueryTable.jar 
    0 Tue Jan 15 14:40:46 EST 2013 META-INF/ 
    68 Tue Jan 15 14:40:46 EST 2013 META-INF/MANIFEST.MF 
3091 Tue Jan 15 14:40:10 EST 2013 QueryTable.class 
3173 Tue Jan 15 14:40:10 EST 2013 QueryTable$MapClass.class 
1699 Tue Jan 15 14:40:10 EST 2013 QueryTable$Reduce.class 

Can you do a jar -tvf on your jar to show its contents (and paste it back into your question, not as a comment)? –

Answers


I was able to resolve this by declaring a package at the top of my source file.

package com.foo.hadoop; 

Then I compiled, created the jar, and invoked hadoop with the class name explicitly prefixed with the package:

hadoop jar QueryTable.jar com.foo.hadoop.QueryTable input output 

I realize this is what most people would do from the start, though I would have thought it would still work without a package declared. It's definitely better practice, and it got me unblocked.
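For completeness, a minimal sketch of what the reorganized driver skeleton looks like once the package is declared (same names as above, with the job setup elided):

package com.foo.hadoop; 

import org.apache.hadoop.conf.Configuration; 
import org.apache.hadoop.conf.Configured; 
import org.apache.hadoop.util.Tool; 
import org.apache.hadoop.util.ToolRunner; 

// With the package declared, the compiled class has to sit at 
// com/foo/hadoop/QueryTable.class inside the jar (javac -d and jar -C 
// preserve that layout), and the driver is referenced on the command line 
// by its fully qualified name, com.foo.hadoop.QueryTable. 
public class QueryTable extends Configured implements Tool { 

    public int run(String[] args) throws Exception { 
        // ... job setup exactly as in the question ... 
        return 0; 
    } 

    public static void main(String[] args) throws Exception { 
        System.exit(ToolRunner.run(new Configuration(), new QueryTable(), args)); 
    } 
}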


I got the same problem when I compiled the jar as a Runnable JAR file. I changed it to a normal JAR and used your approach, giving the full path including the package, and it works fine. – himanshu


Doesn't work for me, still getting 'ClassNotFoundException: com.foo.hadoop.SomeClass' – CDT


What does your jar creation command look like? And what does running "jar -tvf your_jar" show? – cbrown