I'm new to Kafka and Spark, and I'm trying to do some windowed counting, but without success. The details of the problem are below. Thanks! Kafka + Java + Spark Streaming + reduceByKeyAndWindow throws an exception: org.apache.spark.SparkException: Task not serializable
The code is as follows:
JavaPairDStream<String,Integer> counts = wordCounts.reduceByKeyAndWindow(new AddIntegers(), new SubtractIntegers(), Durations.seconds(8000), Durations.seconds(4000));
The exception is as follows:

Exception in thread "Thread-3" org.apache.spark.SparkException: Task not serializable
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166)
    at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158)
    at org.apache.spark.SparkContext.clean(SparkContext.scala:1623)
    at org.apache.spark.streaming.dstream.PairDStreamFunctions.reduceByKeyAndWindow(PairDStreamFunctions.scala:333)
    at org.apache.spark.streaming.dstream.PairDStreamFunctions.reduceByKeyAndWindow(PairDStreamFunctions.scala:299)
    at org.apache.spark.streaming.api.java.JavaPairDStream.reduceByKeyAndWindow(JavaPairDStream.scala:352)
    at KafkaAndDstreamWithIncrement.KDDConsumer.run(KDDConsumer.java:110)
Caused by: java.io.NotSerializableException: KafkaAndDstreamWithIncrement.KDDConsumer
Show us 'AddIntegers' and 'SubtractIntegers' –
Thanks for your suggestion! Before, I was always focused on "how to override reduceByKeyAndWindow". But now I see the mistake was probably in AddIntegers and SubtractIntegers. I tried it and it worked. Thanks again! –
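Since the `Caused by` line names `KDDConsumer` itself, a likely culprit is that `AddIntegers`/`SubtractIntegers` were defined as non-static inner classes (or anonymous classes) of `KDDConsumer`: such instances carry a hidden reference to the enclosing object, so Spark tries to serialize the whole non-serializable `KDDConsumer`. The sketch below is a hypothetical reproduction of that effect with plain JDK serialization; `Consumer`, `AddIntegersInner`, `AddIntegersStatic`, and `serializes` are illustrative names, not the asker's actual code:

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationDemo {

    // Stands in for KDDConsumer: NOT Serializable, like the class in the stack trace.
    static class Consumer {
        // Non-static inner class: every instance holds a reference (this$0)
        // to the enclosing Consumer, which cannot be serialized.
        class AddIntegersInner implements Serializable {
            public Integer call(Integer a, Integer b) { return a + b; }
        }
        // Fix: a static nested class has no reference to the outer instance,
        // so only the reducer itself is serialized.
        static class AddIntegersStatic implements Serializable {
            public Integer call(Integer a, Integer b) { return a + b; }
        }
    }

    // Returns true if the object survives Java serialization.
    static boolean serializes(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (Exception e) {
            return false; // java.io.NotSerializableException for the inner-class case
        }
    }

    public static void main(String[] args) {
        Consumer c = new Consumer();
        System.out.println(serializes(c.new AddIntegersInner()));        // false
        System.out.println(serializes(new Consumer.AddIntegersStatic())); // true
    }
}
```

Making the reducer classes `static` (or top-level classes implementing Spark's `Function2<Integer, Integer, Integer>`) removes the hidden outer reference and should make `reduceByKeyAndWindow`'s closure check pass.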