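When I try to create labeled points from the output of a vector transformer, I run into the following problem: how do I convert a variable of the ML sparse vector type (org.apache.spark.ml.linalg.SparseVector) to the MLlib sparse vector type (org.apache.spark.mllib.linalg.SparseVector)?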
val realout = output.select("label", "features").rdd.map(row => LabeledPoint(
  row.getAs[Double]("label"),
  row.getAs[org.apache.spark.mllib.linalg.SparseVector]("features")
))
The error I get is:
[error] (run-main-0) org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 13.0 failed 1 times, most recent failure: Lost task 0.0 in stage 13.0 (TID 13, localhost): java.lang.ClassCastException: org.apache.spark.ml.linalg.SparseVector cannot be cast to org.apache.spark.mllib.linalg.Vector
[error] at DataCleaning$$anonfun$1.apply(DataCleaning.scala:107)
[error] at DataCleaning$$anonfun$1.apply(DataCleaning.scala:105)
[error] at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
[error] at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:462)
[error] at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:213)
I checked the solution provided at link 1, which explains vector conversion in Spark 2.0.0 as mentioned below, but I get a compilation error:
object linalg is not a member of package org.apache.spark.ml
Please help. Thanks!
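For reference, here is a minimal sketch of one way to do the conversion, assuming Spark 2.x with spark-mllib on the classpath. It reads the column as the new org.apache.spark.ml.linalg.Vector type and converts it with org.apache.spark.mllib.linalg.Vectors.fromML, rather than casting directly to the old type. The names FromMLSketch and toLabeledPoint are hypothetical, chosen only for illustration. As for the compile error, the package org.apache.spark.ml.linalg was added in Spark 2.0.0, so one possible cause is that the build resolves a pre-2.0 spark-mllib artifact; that is a guess, since the build file is not shown.

```scala
import org.apache.spark.ml.linalg.{Vector => MLVector, Vectors => MLVectors}
import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
import org.apache.spark.mllib.regression.LabeledPoint

// Hypothetical helper object, not part of the original question.
object FromMLSketch {
  // Build an mllib LabeledPoint from a label and a new-style ml vector.
  // Vectors.fromML converts ml.linalg vectors to their mllib.linalg equivalents.
  def toLabeledPoint(label: Double, features: MLVector): LabeledPoint =
    LabeledPoint(label, OldVectors.fromML(features))

  def main(args: Array[String]): Unit = {
    // A small ml.linalg sparse vector standing in for a "features" cell.
    val mlSparse: MLVector = MLVectors.sparse(4, Array(0, 3), Array(1.0, 2.0))
    val lp = toLabeledPoint(1.0, mlSparse)
    println(lp.features.getClass.getName)
  }
}
```

Applied to the DataFrame from the question, the map body would become toLabeledPoint(row.getAs[Double]("label"), row.getAs[MLVector]("features")), assuming the "features" column holds ml.linalg vectors as the ClassCastException suggests.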
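The link points to Java, but your reply is in Scala – hshihab
@hshihab That is fine, since Scala is compatible with Java, so you can use the method mentioned above in both languages. Thanks for your concern. –
The Scala docs are here: https://spark.apache.org/docs/2.0.2/api/scala/index.html#org.apache.spark.mllib.linalg.SparseVector$ –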