Can't use a file from HDFS in Spark

I have downloaded Spark 1.6.1. It comes pre-built for Hadoop 2.6, so I only had to unpack it and never had to touch the build tools. In my core-site.xml file I wrote:

<configuration> 
<property> 
    <name>hadoop.tmp.dir</name> 
    <value>/app/hadoop/tmp</value> 
</property> 
<property> 
    <name>fs.default.name</name> 
    <value>hdfs://localhost:54310</value> 
</property> 
</configuration> 
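
A quick way to confirm which NameNode URI the Hadoop client actually resolves is hdfs getconf, which ships with Hadoop 2.6 (a sanity-check sketch, not part of the original post):

# Print the effective default filesystem URI as seen by the client
hdfs getconf -confKey fs.default.name
# With the configuration above this should print: hdfs://localhost:54310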

Then I uploaded a txt file named LICENSE.
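
The upload command itself is not shown in the post; assuming it was the usual put with a relative (or omitted) destination, the file ends up in the user's HDFS home directory rather than under the root:

# Hypothetical upload (the original post does not show this step).
# With no absolute destination, hdfs dfs -put writes into the HDFS
# home directory, i.e. /user/hduser/LICENSE, not /LICENSE.
hdfs dfs -put LICENSE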

When I write at the Scala command line

val textFile = sc.textFile("hdfs://localhost:54310/LICENSE")
textFile.count

I get:

org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://localhost:54310/LICENSE 
    at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285) 
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228) 
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313) 
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199) 
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239) 
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237) 
    at scala.Option.getOrElse(Option.scala:120) 
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237) 
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35) 
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239) 
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237) 
    at scala.Option.getOrElse(Option.scala:120) 
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237) 
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929) 
    at org.apache.spark.rdd.RDD.count(RDD.scala:1157) 
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30) 
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:35) 
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:37) 
    at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:39) 
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:41) 
    at $iwC$$iwC$$iwC.<init>(<console>:43) 
    at $iwC$$iwC.<init>(<console>:45) 
    at $iwC.<init>(<console>:47) 
    at <init>(<console>:49) 
    at .<init>(<console>:53) 
    at .<clinit>(<console>) 
    at .<init>(<console>:7) 
    at .<clinit>(<console>) 
    at $print(<console>) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065) 
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346) 
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840) 
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871) 
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819) 
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857) 
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902) 
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814) 
    at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657) 
    at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665) 
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670) 
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997) 
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945) 
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945) 
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) 
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945) 
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059) 
    at org.apache.spark.repl.Main$.main(Main.scala:31) 
    at org.apache.spark.repl.Main.main(Main.scala) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731) 
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181) 
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206) 
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121) 
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 

Should I build Spark manually from the beginning?

+1

What do you see when you do "hdfs dfs -ls hdfs://localhost:54310/LICENSE"? – tesnik03

+0

hduser@george-W25xHNx:~$ hdfs dfs -ls hdfs://localhost:54310/LICENSE 16/04/27 17:33:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable ls: `hdfs://localhost:54310/LICENSE': No such file or directory – grtheod

+0

However, with a plain hdfs dfs -ls I get: hduser@george-W25xHNx:~$ hdfs dfs -ls 16/04/27 17:35:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Found 5 items -rw-r--r-- 1 hduser supergroup 17352 2016-04-27 16:07 LICENSE – grtheod
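
That listing is the decisive clue: a bare hdfs dfs -ls lists the current user's HDFS home directory, not the filesystem root. Spelling the same listing out with an explicit path (a sketch, assuming Hadoop's default /user/<username> home-directory layout):

# 'hdfs dfs -ls' with no argument is shorthand for the home directory
hdfs dfs -ls /user/hduser
# LICENSE therefore lives at /user/hduser/LICENSE, not at /LICENSE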

Answer

0

As yyny said, the problem was the path I used. The URI hdfs://localhost:54310/LICENSE points at the HDFS root, but the file had been uploaded into my HDFS home directory, so it has to be "hdfs://localhost:54310/user/hduser/LICENSE".
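
A minimal spark-shell sketch of the fix (assuming the /user/hduser home directory shown in the comments above):

// Absolute HDFS URI, including the user's home directory
val textFile = sc.textFile("hdfs://localhost:54310/user/hduser/LICENSE")
textFile.count

// Equivalent: with fs.default.name set to hdfs://localhost:54310,
// the scheme and authority can be omitted
val sameFile = sc.textFile("/user/hduser/LICENSE")
sameFile.count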
