
I want to run a program on Spark. I have a cluster with one master node and two slave nodes, and I get the following error during execution: a FileNotFoundException while executing the Spark job.

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 4.0 failed 4 times, most recent failure: Lost task 3.3 in stage 4.0 (TID 44, hadoopslave3): java.lang.RuntimeException: java.io.FileNotFoundException: File /home/ubuntu/hadoop/hadoop-te/dl4j/1485860107978_-4ccc8c8/0/data/dataset_4-4ccc8c8_68.bin does not exist 
The driver stacktrace is as follows: 
Driver stacktrace: 
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1204) 
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1193) 
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192) 
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) 
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) 
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1192) 
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693) 
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693) 
at scala.Option.foreach(Option.scala:236) 
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693) 
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393) 
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354) 
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) 
17/01/31 10:56:08 INFO scheduler.TaskSetManager: Lost task 1.3 in stage 4.0 (TID 45) on executor hadoopslave3: java.lang.RuntimeException (java.io.FileNotFoundException: File /home/ubuntu/hadoop/hadoop-te/dl4j/1485860107978_-4ccc8c8/0/data/dataset_2-4ccc8c8_77.bin does not exist) [duplicate 3] 

However, I can see all of the dataset objects (the .bin files) that were created on HDFS. Any suggestions?


This '/home/ubuntu/hadoop/hadoop-te/dl4j/1485860107978_-4ccc8c8/0/data/dataset_4-4ccc8c8_68.bin' looks like a local file. – franklinsijo


Please post the Spark program. –


@franklinsijo: That is {hadoop.tmp.dir}. – usm123

Answer


Since you have a cluster set up with "two slave nodes": did you also set up a Hadoop file system? If not, that is your problem.
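
One quick way to check is to ask the Hadoop configuration that Spark actually uses which file system it resolves paths against. A minimal sketch, assuming a standard Spark on Hadoop 2.x setup; the path below is simply the one from the stack trace:

import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.{SparkConf, SparkContext}

object FsCheck {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("fs-check"))

    // If this prints file:/// (or null), Spark is resolving paths against the
    // local file system, so files written on one node are invisible to the others.
    println("fs.defaultFS = " + sc.hadoopConfiguration.get("fs.defaultFS"))

    // Check the file from the stack trace against whatever file system is configured.
    val fs = FileSystem.get(sc.hadoopConfiguration)
    val p = new Path("/home/ubuntu/hadoop/hadoop-te/dl4j/1485860107978_-4ccc8c8/0/data/dataset_4-4ccc8c8_68.bin")
    println(p + " exists on " + fs.getUri + ": " + fs.exists(p))

    sc.stop()
  }
}

If fs.defaultFS does not point at your HDFS namenode, fix core-site.xml (or pass explicit hdfs:// URIs) before rerunning the job.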

When you use a non-local cluster, the example you linked to uses Hadoop to transfer the references to the datasets. The behavior of this example has been made more predictable in the 0.8.0 release (with a big error message).
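
If the job is based on one of the DL4J Spark training examples (an assumption; the dl4j directory and the dataset_*.bin files in the path suggest the Export training approach), it may also help to point the training master's export directory at a location every executor can reach. A minimal sketch; the namenode address, export directory, batch size, and averaging frequency below are placeholders:

import org.deeplearning4j.spark.api.RDDTrainingApproach
import org.deeplearning4j.spark.impl.paramavg.ParameterAveragingTrainingMaster

// Illustrative values only; tune them for your own job.
val trainingMaster = new ParameterAveragingTrainingMaster.Builder(32)
  .batchSizePerWorker(32)
  .averagingFrequency(5)
  .rddTrainingApproach(RDDTrainingApproach.Export)
  // Export the temporary dataset_*.bin files to HDFS instead of a node-local
  // temp directory, so every worker resolves the same files.
  .exportDirectory("hdfs://namenode:9000/tmp/dl4j-export/")
  .build()

With a node-local export directory, files written by one executor cannot be read by a task scheduled on a different node, which would produce exactly the FileNotFoundException shown above.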