2014-04-03

Hadoop: java.lang.Exception: java.lang.NoClassDefFoundError: org/apache/xerces/parsers/AbstractSAXParser

My previous question was posted here:

Hadoop: java.lang.Exception: java.lang.RuntimeException: Error in configuring object

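I then followed the suggestion there and packaged all the jar files into a single jar, which solved the first problem. Please refer to the previous post for the source code. Now a new problem has come up. Thanks in advance.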

14/04/03 13:47:39 INFO util.NativeCodeLoader: Loaded the native-hadoop library 
14/04/03 13:47:40 WARN snappy.LoadSnappy: Snappy native library is available 
14/04/03 13:47:40 INFO snappy.LoadSnappy: Snappy native library loaded 
14/04/03 13:47:40 INFO mapred.FileInputFormat: Total input paths to process : 1 
14/04/03 13:47:40 INFO mapred.JobClient: Running job: job_local1748858601_0001 
14/04/03 13:47:40 INFO mapred.LocalJobRunner: Waiting for map tasks 
14/04/03 13:47:40 INFO mapred.LocalJobRunner: Starting task: attempt_local1748858601_0001_m_000000_0 
14/04/03 13:47:40 INFO util.ProcessTree: setsid exited with exit code 0 
14/04/03 13:47:40 INFO mapred.Task: Using ResourceCalculatorPlugin : [email protected] 
14/04/03 13:47:40 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/project/input1/url.txt:0+68 
14/04/03 13:47:40 INFO mapred.MapTask: numReduceTasks: 1 
14/04/03 13:47:40 INFO mapred.MapTask: io.sort.mb = 100 
14/04/03 13:47:40 INFO mapred.MapTask: data buffer = 79691776/99614720 
14/04/03 13:47:40 INFO mapred.MapTask: record buffer = 262144/327680 
Prepare to get into webpage 
14/04/03 13:47:41 INFO mapred.JobClient: map 0% reduce 0% 
14/04/03 13:47:43 INFO mapred.LocalJobRunner: Map task executor complete. 
14/04/03 13:47:43 WARN mapred.LocalJobRunner: job_local1748858601_0001 
java.lang.Exception: java.lang.NoClassDefFoundError: org/apache/xerces/parsers/AbstractSAXParser 
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354) 
Caused by: java.lang.NoClassDefFoundError: org/apache/xerces/parsers/AbstractSAXParser 
    at java.lang.ClassLoader.defineClass1(Native Method) 
    at java.lang.ClassLoader.defineClass(ClassLoader.java:643) 
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) 
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:277) 
    at java.net.URLClassLoader.access$000(URLClassLoader.java:73) 
    at java.net.URLClassLoader$1.run(URLClassLoader.java:212) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205) 
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323) 
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294) 
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268) 
    at de.l3s.boilerpipe.sax.BoilerpipeSAXInput.getTextDocument(BoilerpipeSAXInput.java:51) 
    at de.l3s.boilerpipe.extractors.ExtractorBase.getText(ExtractorBase.java:69) 
    at de.l3s.boilerpipe.extractors.ExtractorBase.getText(ExtractorBase.java:87) 
    at webPageToTxt.WebPageToTxt.webPageString(WebPageToTxt.java:82) 
    at webPageToTxt.WebPageToTxt.multiWebPageString(WebPageToTxt.java:126) 
    at webPageToTxt.WebPageToTxt.webPageToTxt(WebPageToTxt.java:40) 
    at webPageToTxt.WebPageToTxtMapper.map(WebPageToTxtMapper.java:27) 
    at webPageToTxt.WebPageToTxtMapper.map(WebPageToTxtMapper.java:1) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366) 
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223) 
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:166) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
    at java.lang.Thread.run(Thread.java:701) 
Caused by: java.lang.ClassNotFoundException: org.apache.xerces.parsers.AbstractSAXParser 
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205) 
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323) 
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294) 
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268) 
    ... 29 more 
14/04/03 13:47:44 INFO mapred.JobClient: Job complete: job_local1748858601_0001 
14/04/03 13:47:44 INFO mapred.JobClient: Counters: 0 
14/04/03 13:47:44 INFO mapred.JobClient: Job Failed: NA 
Exception in thread "main" java.io.IOException: Job failed! 
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357) 
    at webPageToTxt.ConfMain.run(ConfMain.java:33) 
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) 
    at webPageToTxt.ConfMain.main(ConfMain.java:40) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:622) 
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160) 

Answer

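Besides the jar that holds your driver and MapReduce code, you need to add every other jar you depend on, so that all of them are available to the mappers at run time.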

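I checked the link you provided. Packaging the other classes as part of the MapReduce jar does work, but it is not always possible. As you can see, you are using xerces here, and for that you need to include xerces-impl.jar.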

A better approach is to add these jars to the DistributedCache.

DistributedCache.addArchiveToClassPath(new Path("HDFS Path"), job);

You can keep the jars in HDFS. So the solution is to add the xerces jar.
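
To confirm whether the missing class is actually visible at run time, a small check like the one below may help. It is only a minimal sketch using the standard Java library. The class name ClasspathCheck and the method isLoadable are hypothetical names chosen for illustration and are not part of Hadoop or of the code in the question.

```java
public class ClasspathCheck {

    /**
     * Returns true if the named class can be located by the class loader
     * that loaded ClasspathCheck, false otherwise.
     */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            // The class is not on the classpath at all
            return false;
        } catch (LinkageError e) {
            // The class was found but one of its dependencies is missing
            // (NoClassDefFoundError is a LinkageError)
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace from the question
        String name = args.length > 0
                ? args[0]
                : "org.apache.xerces.parsers.AbstractSAXParser";
        String verdict = isLoadable(name) ? " is" : " is NOT";
        System.out.println(name + verdict + " on the classpath");
    }
}
```

Run it with the same classpath as the job. If it reports the class as missing, the xerces jar has not reached the classpath the task is using.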