2016-11-04 47 views
2

Spark hangs after writing JSON from Cassandra

I have a driver program in which I use Spark to read data from Cassandra, perform some operations, and then write JSON out to S3. The program ran fine when I used Spark 1.6.1 and spark-cassandra-connector 1.6.0-M1. However, if I try to upgrade to Spark 2.0.1 (Hadoop 2.7.1) and spark-cassandra-connector 2.0.0-M3, the program completes in the sense that all of the expected files are written to S3, but it never terminates.

I call sc.stop() at the end of the program. I am also using Mesos 1.0.1. In both cases I used the default output committer.

EDIT: Looking at the thread dumps below, it now seems like it may be waiting on: org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner

Code snippet:

// get MongoDB oplog operations
val operations = sc.cassandraTable[JsonOperation](keyspace, namespace)
  .where("ts >= ? AND ts < ?", minTimestamp, maxTimestamp)

// replay oplog operations into documents
val documents = operations
  .spanBy(op => op.id)
  .map { case (id: String, ops: Iterable[T]) => (id, apply(ops)) }
  .filter { case (id, result) => result.isInstanceOf[Document] }
  .map { case (id, document) =>
    MergedDocument(id = id, document = document.asInstanceOf[Document])
  }

// write documents to json on s3
documents
  .map(document => document.toJson)
  .coalesce(partitions)
  .saveAsTextFile(path, classOf[GzipCodec])
sc.stop()

Thread dump from the driver:

60 context-cleaner-periodic-gc TIMED_WAITING 
46 dag-scheduler-event-loop WAITING 
4389 DestroyJavaVM RUNNABLE 
12 dispatcher-event-loop-0 WAITING 
13 dispatcher-event-loop-1 WAITING 
14 dispatcher-event-loop-2 WAITING 
15 dispatcher-event-loop-3 WAITING 
47 driver-revive-thread TIMED_WAITING 
3 Finalizer WAITING 
82 ForkJoinPool-1-worker-17 WAITING 
43 heartbeat-receiver-event-loop-thread TIMED_WAITING 
93 java-sdk-http-connection-reaper TIMED_WAITING 
4387 java-sdk-progress-listener-callback-thread WAITING 
25 map-output-dispatcher-0 WAITING 
26 map-output-dispatcher-1 WAITING 
27 map-output-dispatcher-2 WAITING 
28 map-output-dispatcher-3 WAITING 
29 map-output-dispatcher-4 WAITING 
30 map-output-dispatcher-5 WAITING 
31 map-output-dispatcher-6 WAITING 
32 map-output-dispatcher-7 WAITING 
48 MesosCoarseGrainedSchedulerBackend-mesos-driver RUNNABLE 
44 netty-rpc-env-timeout TIMED_WAITING 
92 org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner WAITING 
62 pool-19-thread-1 TIMED_WAITING 
2 Reference Handler WAITING 
61 Scheduler-1112394071 TIMED_WAITING 
20 shuffle-server-0 RUNNABLE 
55 shuffle-server-0 RUNNABLE 
21 shuffle-server-1 RUNNABLE 
56 shuffle-server-1 RUNNABLE 
22 shuffle-server-2 RUNNABLE 
57 shuffle-server-2 RUNNABLE 
23 shuffle-server-3 RUNNABLE 
58 shuffle-server-3 RUNNABLE 
4 Signal Dispatcher RUNNABLE 
59 Spark Context Cleaner TIMED_WAITING 
9 SparkListenerBus WAITING 
35 [email protected]/0 RUNNABLE 
36 [email protected]@3b5eaf92{HTTP/1.1}{0.0.0.0:4040} RUNNABLE 
37 [email protected]/1 RUNNABLE 
38 SparkUI-38 TIMED_WAITING 
39 SparkUI-39 TIMED_WAITING 
40 SparkUI-40 TIMED_WAITING 
41 SparkUI-41 RUNNABLE 
42 SparkUI-42 TIMED_WAITING 
438 task-result-getter-0 WAITING 
450 task-result-getter-1 WAITING 
489 task-result-getter-2 WAITING 
492 task-result-getter-3 WAITING 
75 threadDeathWatcher-2-1 TIMED_WAITING 
45 Timer-0 WAITING 

Thread dump from an executor. It is the same for all of them:

24 dispatcher-event-loop-0 WAITING 
25 dispatcher-event-loop-1 WAITING 
26 dispatcher-event-loop-2 RUNNABLE 
27 dispatcher-event-loop-3 WAITING 
39 driver-heartbeater TIMED_WAITING 
3 Finalizer WAITING 
58 java-sdk-http-connection-reaper TIMED_WAITING 
75 java-sdk-progress-listener-callback-thread WAITING 
1 main TIMED_WAITING 
33 netty-rpc-env-timeout TIMED_WAITING 
55 org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner WAITING 
59 pool-17-thread-1 TIMED_WAITING 
2 Reference Handler WAITING 
28 shuffle-client-0 RUNNABLE 
35 shuffle-client-0 RUNNABLE 
41 shuffle-client-0 RUNNABLE 
37 shuffle-server-0 RUNNABLE 
5 Signal Dispatcher RUNNABLE 
23 threadDeathWatcher-2-1 TIMED_WAITING 
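
Thread dumps like the ones above can also be narrowed down programmatically: the JVM only exits on its own once no non-daemon threads remain, so listing the live non-daemon threads shows exactly what can keep the process alive after sc.stop(). A minimal sketch of my own (not from the post; scala.jdk.CollectionConverters requires Scala 2.13+, and older codebases like this one would use scala.collection.JavaConverters instead):

```scala
// Diagnostic sketch: list the live non-daemon threads in this JVM.
// Any thread in this list other than the current one can prevent
// the JVM from terminating after the application's main code returns.
object NonDaemonThreads {
  import scala.jdk.CollectionConverters._

  def names(): List[String] =
    Thread.getAllStackTraces.keySet.asScala
      .filter(t => t.isAlive && !t.isDaemon)
      .map(_.getName)
      .toList

  def main(args: Array[String]): Unit =
    names().sorted.foreach(println)
}
```

If StatisticsDataReferenceCleaner showed up in this list as non-daemon, that would explain the hang; in the dumps above it is in WAITING, which is consistent with that suspicion.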
+0

Can you show us your code? –

+1

You can use the "Thread dump" on the Spark UI / Executors page and check what it is doing at that moment. – maasg

Answer

0

I solved this by updating the following packages in my program's jar:

  • spark 2.0.0 to 2.0.1
  • json4s 3.2.11 to 3.5.0
  • scallop 2.0.1 to 2.0.5
  • nscala-time 1.8.0 to 2.14.0
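
In an sbt build, the version bumps above would look roughly like the fragment below. This is a sketch, not the answerer's actual build file: the exact artifact flavors (e.g. json4s-native vs json4s-jackson) and the Scala binary version are assumptions on my part.

```scala
// build.sbt -- dependency versions after the upgrade (sketch)
libraryDependencies ++= Seq(
  "org.apache.spark"       %% "spark-core"                % "2.0.1",
  "com.datastax.spark"     %% "spark-cassandra-connector" % "2.0.0-M3",
  "org.json4s"             %% "json4s-native"             % "3.5.0",  // assumed flavor
  "org.rogach"             %% "scallop"                   % "2.0.5",
  "com.github.nscala-time" %% "nscala-time"               % "2.14.0"
)
```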