2015-12-02 138 views
0

如果我以“sbt run”运行,下面的简单spark程序运行得非常好。但如果我跑,Spark Streaming throwing java.net.ConnectException

1)作为“spark-submit.cmd eventfilter-assembly-0.1-SNAPSHOT.jar”。其中jar是使用“sbt assembly”与“streaming and sql”的sbt规则创建的“%provided”创建的。

2)作为“spark-submit.cmd --jar play-json_2.10-2.3.10.jar seventfilter_2.10-0.1-SNAPSHOT.jar”。

这两种情况下,它都会启动并等待新文件的到来。直到现在没有问题。

但是,只要我开始放置文件,以便它可以流式传输,下面的异常来了。

注:我使用的火花1.4.1彬hadoop2.6,

注:如果说我是“SBT运行”运行时,它运行平稳几个小时。

注意:我也尝试过1.5.2,通过相应地更改sbt文件。行为是一样的。

Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.net.ConnectException: Connection timed out: connect 
    at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method) 
    at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85) 
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) 
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) 
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) 
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172) 
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) 
    at java.net.Socket.connect(Socket.java:589) 
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175) 
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) 
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) 
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:211) 
    at sun.net.www.http.HttpClient.New(HttpClient.java:308) 
    at sun.net.www.http.HttpClient.New(HttpClient.java:326) 
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1168) 
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1104) 
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:998) 
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:932) 
    at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:639) 
    at org.apache.spark.util.Utils$.fetchFile(Utils.scala:453) 
    at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:398) 
    at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:390) 
    at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772) 
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) 
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) 
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) 
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) 
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) 
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) 
    at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:390) 
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:193) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 
Driver stacktrace: 

SBT文件


scalaVersion := "2.10.4" 
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.4.1" 
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.4.1" 
libraryDependencies += "com.typesafe.play" %% "play-json" % "2.3.10" 

斯卡拉文件

def main(args: Array[String]) { 

    val conf = new SparkConf() 
    conf 
     .setMaster("local[*]") //Comment if executing through spark-submit 
     .setAppName("test") 
    val sc = new SparkContext(conf) 
    val ssc = new StreamingContext(sc, Seconds(3)) 

    val dStream = ssc.textFileStream("dir") 
    val expandedEventStream = dStream.count().print() 
    ssc.start() 
    ssc.awaitTermination() 
} 

回答

0

我有virtuaBox设置,默认情况下火花塞使用的IP查找应用程序罐子。禁用virtualBox IP后,spark开始正常工作。