2017-06-23

How do I pass a local file as input to spark-submit? I have tried something like the following:

spark-submit --jars /home/hduser/.ivy2/cache/com.typesafe/config/bundles/config-1.3.1.jar --class "retail.DataValidator" --master local[2] --executor-memory 2g --total-executor-cores 2 sample-spark-180417_2.11-1.0.jar file:///home/hduser/Downloads/Big_Data_Backup/ dev file:///home/hduser/spark-training/workspace/demos/output/destination file:///home/hduser/spark-training/workspace/demos/output/extrasrc file:///home/hduser/spark-training/workspace/demos/output/extradest 

Error:

Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: file:/home/inputfile , expected: hdfs://hadoop:54310 

I also tried the paths without the "file://" prefix, but no luck. It works fine in Eclipse.

Thanks.

Answer


If you want the files to be accessible to every executor, you need to use the --files option. For example:

spark-submit --files file1,file2,file3 
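For context, a fuller invocation might look like the sketch below. The file paths are placeholders, and note that --files expects a comma-separated list of individual files, not directories (passing a directory triggers the "is a directory and recursive is not turned on" error seen in the comments):

```shell
# Ship two local files to every executor's working directory.
# --files takes a comma-separated list of FILES, not directories.
spark-submit \
  --master local[2] \
  --files /path/to/config.properties,/path/to/lookup.csv \
  --class "retail.DataValidator" \
  sample-spark-180417_2.11-1.0.jar
```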

I tried this: spark-submit --jars /home/hduser/.ivy2/cache/com.typesafe/config/bundles/config-1.3.1.jar --class "retail.DataValidator" --master local[2] --executor-memory 2g --total-executor-cores 2 --files /inputfileone,/outputfilepathone,/outputfilepathtwo,/outputfilepaththree sample-spark-180417_2.11-1.0.jar but I got this error: ERROR SparkContext: Error initializing SparkContext. org.apache.spark.SparkException: Added file file:/... is a directory and recursive is not turned on. – Vignesh


When I tried it in this order: spark-submit --jars /home/hduser/.ivy2/cache/com.typesafe/config/bundles/config-1.3.1.jar --class "retail.DataValidator" --master local[2] --executor-memory 2g --total-executor-cores 2 sample-spark-180417_2.11-1.0.jar --files /inputfileone,/outputfilepathone,/outputfilepathtwo,/outputfilepaththree I got Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2 – Vignesh
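The ArrayIndexOutOfBoundsException in this second attempt is consistent with argument ordering: spark-submit treats everything after the application jar as arguments to the main class, so a --files flag placed there is handed to main() as a plain string instead of being parsed as an option. A sketch of the expected ordering, with placeholder paths:

```shell
# All spark-submit options must come BEFORE the application jar;
# everything after the jar is passed verbatim to the main class.
spark-submit \
  --jars /home/hduser/.ivy2/cache/com.typesafe/config/bundles/config-1.3.1.jar \
  --class "retail.DataValidator" \
  --master local[2] \
  --executor-memory 2g \
  --files /path/to/input.txt \
  sample-spark-180417_2.11-1.0.jar \
  file:///path/to/input.txt dev file:///path/to/destination
```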