I have a Spark RDD in the format below and am trying to use combineByKey on it.
Sample RDD:
Array[(String, (String, Double))] = Array((2014-01-12 00:00:00.0,("XXX",829.95)), (2013-08-28 00:00:00.0,("YYY",469.95000000000005)), (2013-11-01 00:00:00.0,("ZZZ",129.99)), (2013-07-25 00:00:00.0,("XYZ",879.8599999999999)), (2013-10-19 00:00:00.0,("POI",989.94)))
I am trying to use combineByKey to sum the Double values in the RDD for each key, and tried the command below:
rdd.combineByKey((x:String,y:Double) => (x,y) ,(acc:(String,Double),valu:(String,Double)) => acc._2+valu._2, (acc2:(Double),acc3:(Double)) => (acc2+acc3))
but it fails with the following error:
:46: error: overloaded method value combineByKey with alternatives:
  [C](createCombiner: ((String, Double)) => C, mergeValue: (C, (String, Double)) => C, mergeCombiners: (C, C) => C)org.apache.spark.rdd.RDD[(String, C)]
  [C](createCombiner: ((String, Double)) => C, mergeValue: (C, (String, Double)) => C, mergeCombiners: (C, C) => C, numPartitions: Int)org.apache.spark.rdd.RDD[(String, C)]
  [C](createCombiner: ((String, Double)) => C, mergeValue: (C, (String, Double)) => C, mergeCombiners: (C, C) => C, partitioner: org.apache.spark.Partitioner, mapSideCombine: Boolean, serializer: org.apache.spark.serializer.Serializer)org.apache.spark.rdd.RDD[(String, C)]
cannot be applied to ((String, Double) => (String, Double), ((String, Double), (String, Double)) => Double, (Double, Double) => Double)
  custMaxOrdr.combineByKey((x:String,y:Double) => (x,y) ,(acc:(String,Double),valu:(String,Double)) => acc._2+valu._2, (acc2:(Double),acc3:(Double)) => (acc2+acc3))
Any help is appreciated.
Thanks, Rammy
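The error says the three functions passed do not match any combineByKey overload: createCombiner must be a function of a single argument of the value type, here (String, Double), not a two-argument function, and the accumulator type used by mergeValue and mergeCombiners must match what createCombiner returns. A sketch of a corrected call, assuming the goal is summing the Double per key (the variable name `sums` is illustrative):

```scala
import org.apache.spark.rdd.RDD

// rdd: RDD[(String, (String, Double))], as in the sample above
val sums: RDD[(String, Double)] = rdd.combineByKey(
  (v: (String, Double)) => v._2,                    // createCombiner: seed the sum with the Double
  (acc: Double, v: (String, Double)) => acc + v._2, // mergeValue: add each value's Double within a partition
  (a: Double, b: Double) => a + b                   // mergeCombiners: combine partial sums across partitions
)
```

For a plain per-key sum like this, `rdd.mapValues(_._2).reduceByKey(_ + _)` is equivalent and shorter; combineByKey is mainly useful when the combined type differs from the value type.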
Thanks! It works :-) – Rammy
Glad to help :) Please upvote/accept the answer so other users know this is solved. –