2015-03-13

I am running a Spark (1.2.1) standalone cluster on virtual machines (Ubuntu 12.04). I can successfully run examples such as als.py and pi.py, but I cannot run the wordcount.py example because a connection error occurs. Spark standalone mode: connection failed exception:

 bin/spark-submit --master spark://192.168.1.211:7077 /examples/src/main/python/wordcount.py ~/Documents/Spark_Examples/wordcount.py 

The error message is as follows:

15/03/13 22:26:02 INFO BlockManagerMasterActor: Registering block manager a12:45594 with 267.3 MB RAM, BlockManagerId(0, a12, 45594) 
    15/03/13 22:26:03 INFO Client: Retrying connect to server: a11/192.168.1.211:9000. Already tried 4 time(s). 
    ...... 
    Traceback (most recent call last): 
    File "/home/spark/spark/examples/src/main/python/wordcount.py", line 32, in <module> 
    .reduceByKey(add) 
    File "/home/spark/spark/lib/spark-assembly-1.2.1-hadoop1.0.4.jar/pyspark/rdd.py", line 1349, in reduceByKey 
    File "/home/spark/spark/lib/spark-assembly-1.2.1-hadoop1.0.4.jar/pyspark/rdd.py", line 1559, in combineByKey 
    File "/home/spark/spark/lib/spark-assembly-1.2.1-hadoop1.0.4.jar/pyspark/rdd.py", line 1942, in _defaultReducePartitions 
    File "/home/spark/spark/lib/spark-assembly-1.2.1-hadoop1.0.4.jar/pyspark/rdd.py", line 297, in getNumPartitions 
    ...... 
    py4j.protocol.Py4JJavaError: An error occurred while calling o23.partitions. 
    java.lang.RuntimeException: java.net.ConnectException: Call to a11/192.168.1.211:9000 failed on connection exception: java.net.ConnectException: Connection refused 
    ...... 

I am not using YARN or ZooKeeper. All of the virtual machines can reach each other via passwordless SSH. I have also set SPARK_LOCAL_IP on the master and the workers.

Answer


I think the wordcount.py example is accessing HDFS to read the lines of a file (and then count the words), with something like this:

sc.textFile("hdfs://<master-hostname>:9000/path/to/whatever") 

Port 9000 is commonly used for HDFS. Please make sure that this file is accessible, or use an example that does not rely on HDFS :). I hope it helps.
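For reference, the transformation chain in wordcount.py (flatMap the lines into words, map each word to `(word, 1)`, then `reduceByKey(add)`) can be sketched in plain Python without a cluster. This is only an illustration of the logic, using a small in-memory list of lines as a stand-in for the RDD:

```python
from operator import add
from functools import reduce
from itertools import groupby

# Stand-in for sc.textFile(...): a small in-memory "RDD" of lines.
lines = ["hello world", "hello spark"]

# flatMap: split each line into words.
words = [w for line in lines for w in line.split()]

# map: pair each word with a count of 1.
pairs = [(w, 1) for w in words]

# reduceByKey(add): group the pairs by word and sum the counts.
counts = {
    word: reduce(add, (c for _, c in group))
    for word, group in groupby(sorted(pairs), key=lambda p: p[0])
}

print(counts)  # {'hello': 2, 'spark': 1, 'world': 1}
```

If you want to run the real example without HDFS, you can pass a local path with an explicit `file://` scheme (e.g. `sc.textFile("file:///home/user/input.txt")`), so Spark does not try to resolve the path against a default HDFS filesystem.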
