2012-07-06
4

I have installed the Cloudera CDH4 release and am trying to run a MapReduce job on it. I am getting the following error: cdh4 hadoop-hbase PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.FileNotFoundException

2012-07-09 15:41:16 ZooKeeperSaslClient [INFO] Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration. 
2012-07-09 15:41:16 ClientCnxn [INFO] Socket connection established to Cloudera/192.168.0.102:2181, initiating session 
2012-07-09 15:41:16 RecoverableZooKeeper [WARN] Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master 
2012-07-09 15:41:16 RetryCounter [INFO] The 1 times to retry after sleeping 2000 ms 
2012-07-09 15:41:16 ClientCnxn [INFO] Session establishment complete on server Cloudera/192.168.0.102:2181, sessionid = 0x1386b0b44da000b, negotiated timeout = 60000 
2012-07-09 15:41:18 TableOutputFormat [INFO] Created table instance for exact_custodian 
2012-07-09 15:41:18 NativeCodeLoader [WARN] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
2012-07-09 15:41:18 JobSubmitter [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 
2012-07-09 15:41:18 JobSubmitter [INFO] Cleaning up the staging area file:/tmp/hadoop-hdfs/mapred/staging/hdfs48876562/.staging/job_local_0001 
2012-07-09 15:41:18 UserGroupInformation [ERROR] PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar 
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar 
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736) 
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208) 
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71) 
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:246) 
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:284) 
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:355) 
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226) 
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:396) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) 
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223) 
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244) 
    at 

I am able to run the sample programs shipped in hadoop-mapreduce-examples-2.0.0-cdh4.0.0.jar. But I get this error once my job is submitted to the jobtracker. It looks like it is trying to access the local file system again (even though I have put all the required libraries in the distributed cache for job execution, it still tries to access a local directory). Is this problem related to user permissions?
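For reference, one way to make sure dependency jars resolve against HDFS rather than the local file system is to reference them with fully qualified hdfs:// URIs when building the job. This is only a sketch, not the asker's code; the jar location under /tools-lib and the class name are assumptions based on the listings in this question:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class SubmitWithHdfsJars {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the cluster explicitly so paths resolve
        // against HDFS, not file:/// (the local file system).
        conf.set("fs.defaultFS", "hdfs://Cloudera:8020");

        Job job = Job.getInstance(conf, "example-job");
        // The jar must already exist at this HDFS location (hypothetical path);
        // it is then shipped to the tasks via the distributed cache.
        job.addFileToClassPath(new Path("hdfs://Cloudera:8020/tools-lib/hbase.jar"));

        // ... set mapper, reducer, input/output formats as usual, then:
        // System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If the path passed to addFileToClassPath is relative, it is resolved against the default file system, which is why a misconfigured fs.defaultFS makes the client look for the jar on the local disk.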

I) Cloudera:~ # hadoop fs -ls hdfs://&lt;MyClusterIP&gt;:8020/ shows:

Found 8 items 
drwxr-xr-x - hbase hbase    0 2012-07-04 17:58 hdfs://&lt;MyClusterIP&gt;:8020/hbase 
drwxr-xr-x - hdfs supergroup   0 2012-07-05 16:21 hdfs://&lt;MyClusterIP&gt;:8020/input 
drwxr-xr-x - hdfs supergroup   0 2012-07-05 16:21 hdfs://&lt;MyClusterIP&gt;:8020/output 
drwxr-xr-x - hdfs supergroup   0 2012-07-06 16:03 hdfs://&lt;MyClusterIP&gt;:8020/tools-lib 
drwxr-xr-x - hdfs supergroup   0 2012-06-26 14:02 hdfs://&lt;MyClusterIP&gt;:8020/test 
drwxrwxrwt - hdfs supergroup   0 2012-06-12 16:13 hdfs://&lt;MyClusterIP&gt;:8020/tmp 
drwxr-xr-x - hdfs supergroup   0 2012-07-06 15:58 hdfs://&lt;MyClusterIP&gt;:8020/user 

II) No results for the following:

[email protected]:/etc/hadoop/conf> find . -name '**' | xargs grep "default.name" 
[email protected]:/etc/hbase/conf> find . -name '**' | xargs grep "default.name" 

Instead, I think with the new API we use fs.defaultFS -> hdfs://Cloudera:8020, which I have set correctly.

Whereas for "fs.default.name" I do have entries on my Hadoop 0.20.2 cluster (a non-Cloudera cluster):

[email protected]:~/hadoop/conf> find . -name '**' | xargs grep "default.name" 
./core-default.xml: &lt;name&gt;fs.default.name&lt;/name&gt; 
./core-site.xml: &lt;name&gt;fs.default.name&lt;/name&gt; 

I think the CDH4 default configuration should include this entry in the respective directories (if it is missing).

The command I use to run my program:

[email protected]:/home/cloudera/yogesh/lib> java -classpath hbase-tools.jar:hbase.jar:slf4j-log4j12-1.6.1.jar:slf4j-api-1.6.1.jar:protobuf-java-2.4.0a.jar:hadoop-common-2.0.0-cdh4.0.0.jar:hadoop-hdfs-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-common-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-core-2.0.0-cdh4.0.0.jar:log4j-1.2.16.jar:commons-logging-1.0.4.jar:commons-lang-2.5.jar:commons-lang3-3.1.jar:commons-cli-1.2.jar:commons-configuration-1.6.jar:guava-11.0.2.jar:google-collect-1.0-rc2.jar:google-collect-1.0-rc1.jar:hadoop-auth-2.0.0-cdh4.0.0.jar:hadoop-auth.jar:jackson.jar:avro-1.5.4.jar:hadoop-yarn-common-2.0.0-cdh4.0.0.jar:hadoop-yarn-api-2.0.0-cdh4.0.0.jar:hadoop-yarn-server-common-2.0.0-cdh4.0.0.jar:commons-httpclient-3.0.1.jar:commons-io-1.4.jar:zookeeper-3.3.2.jar:jdom.jar:joda-time-1.5.2.jar com.hbase.xyz.MyClassName
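Note that launching with plain `java` bypasses the cluster configuration directory entirely. A common alternative (a sketch only, assuming the driver class uses GenericOptionsParser or implements Tool, and with illustrative jar names) is to submit through the hadoop launcher, which puts /etc/hadoop/conf on the classpath and ships extra jars to the tasks via -libjars:

```shell
# Submit through the hadoop wrapper so core-site.xml / mapred-site.xml are
# picked up automatically; -libjars copies the listed jars into the
# distributed cache for the job.
hadoop jar hbase-tools.jar com.hbase.xyz.MyClassName \
  -libjars hbase.jar,zookeeper-3.3.2.jar,protobuf-java-2.4.0a.jar
```

-libjars is only honored when the job's main class passes its arguments through GenericOptionsParser, which is exactly what the "Use GenericOptionsParser for parsing the arguments" warning in the log above is hinting at.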

+0

Can you post your job submission command line, or reference any code for this file? Does the file exist on the local system? – 2012-07-06 14:57:12

+0

Hi Chris, thanks for your reply. I have updated the question, please see above. – Yogesh 2012-07-09 08:01:24

+1

job_local_0001 indicates that mapred-site.xml is not set up correctly; it should be picked up when using new Configuration(). Set it there. http://hbase.apache.org/book.html#trouble.mapreduce.local – Yogesh 2012-07-13 10:32:37

Answers

2

To debug: try running a simple Hadoop shell command.

hadoop fs -ls /

If this shows the HDFS files, then your configuration is correct. If not, the configuration is missing; when that happens, hadoop shell commands like -ls refer to the local file system instead of the Hadoop file system. This can happen when Hadoop is started with CM (Cloudera Manager), which does not explicitly store the configuration in the conf directory.

Check that the Hadoop file system is listed by the following command (change the host and port as needed):

hadoop fs -ls hdfs://host:8020/

If the local file system is shown when you pass the path /, then you should set up the configuration files hdfs-site.xml and mapred-site.xml in the configuration directory. Also, hdfs-site.xml should have an fs.default.name entry pointing to hdfs://host:port/. In my case the directory is /etc/hadoop/conf.
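For CDH4, the minimal entries usually look something like the following (a sketch only; the host names and ports are placeholders to adjust for your cluster, and on the newer API the property is fs.defaultFS rather than fs.default.name):

```xml
<!-- core-site.xml -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://Cloudera:8020</value>
</property>

<!-- mapred-site.xml (MRv1) -->
<property>
  <name>mapred.job.tracker</name>
  <value>Cloudera:8021</value>
</property>
```

Without the mapred-site.xml entry the client falls back to the local job runner, which is what job IDs of the form job_local_0001 indicate.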

See: http://hadoop.apache.org/common/docs/r0.20.2/core-default.html

See if this solves your problem.

+0

Ashish, please find the results for your questions in the parent question. – Yogesh 2012-07-09 07:10:55

+0

This is how I create a Configuration object: Configuration conf = new Configuration(false); conf.addResource(new Path(ConfigReader.HBASE_ROOT_DIRECTORY + "/conf/core-site.xml")); conf.addResource(new Path(ConfigReader.HBASE_ROOT_DIRECTORY + "/conf/hdfs-site.xml")); conf.addResource(new Path(ConfigReader.HADOOP_ROOT_DIRECTORY + "/conf/core-site.xml")); conf.addResource(new Path(ConfigReader.HADOOP_ROOT_DIRECTORY + "/conf/hdfs-site.xml")); conf.addResource(new Path(ConfigReader.HBASE_ROOT_DIRECTORY + "/conf/hbase-site.xml"));
Is this the right approach?
– Yogesh 2012-07-09 11:32:25

+0

I think the mapred settings are not set correctly (the mapred-site.xml file is missing). That is why by default it tries to run the job locally. Either we need to configure YARN, or set the configuration correctly so that MRv1 jobs run on the jobtracker. – Yogesh 2012-07-11 15:38:48

4

I hit the same problem at the staging phase in 2.0.0-cdh4.1.3 when running MR jobs. It was fixed by adding this property to mapred-site.xml:

<property> 
<name>mapreduce.framework.name</name> 
<value>yarn</value> 
</property> 

And afterwards, for running Hive jobs:

export HIVE_USER=yarn