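Spark Streaming: cumulative word count

This is a Spark Streaming program written in Scala. Every 1 second it counts the words arriving on a socket. The result is a per-batch word count: for example, the counts for words received between time 0 and 1, then the counts for words received between time 1 and 2. Is there some way to change this program so that it accumulates the word counts, that is, reports the counts from time 0 up to now?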
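import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

val sparkConf = new SparkConf().setAppName("NetworkWordCount")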
val ssc = new StreamingContext(sparkConf, Seconds(1))
// Create a socket stream on the target ip:port and count the
// words in the input stream of \n-delimited text (e.g. generated by 'nc').
// Note: a storage level without replication is suitable only for running locally.
// Replication is necessary in a distributed scenario for fault tolerance.
val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER)
val words = lines.flatMap(_.split(" "))
val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
wordCounts.print()
ssc.start()
ssc.awaitTermination()
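
One possible way to get running totals is to swap `reduceByKey` for `updateStateByKey`, which carries a per-key state from one batch to the next. The following is only a minimal sketch, not a definitive implementation: the object name `CumulativeNetworkWordCount`, the helper `updateCount`, and the checkpoint path `/tmp/spark-checkpoint` are hypothetical names introduced here for illustration, and it assumes Spark 1.3 or later (older versions would also need `import org.apache.spark.streaming.StreamingContext._` for the pair-stream operations).

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CumulativeNetworkWordCount {
  // Merge the counts seen for one word in the current batch into its running total.
  // `running` is None the first time a word appears.
  def updateCount(newCounts: Seq[Int], running: Option[Int]): Option[Int] =
    Some(running.getOrElse(0) + newCounts.sum)

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("CumulativeNetworkWordCount")
    val ssc = new StreamingContext(sparkConf, Seconds(1))

    // Stateful operations require a checkpoint directory.
    // The path below is a placeholder (assumption); use a fault-tolerant store on a cluster.
    ssc.checkpoint("/tmp/spark-checkpoint")

    val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER)
    val words = lines.flatMap(_.split(" "))

    // Instead of reducing within each batch, keep one running total per word across batches.
    val totals = words.map(x => (x, 1)).updateStateByKey[Int](updateCount _)
    totals.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

With this change, each 1-second batch prints the total seen so far for every word rather than only that batch's counts. Note that `updateStateByKey` touches every key on every batch, so the state grows with the number of distinct words.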