2016-05-30 97 views
2

No TypeTag available for a Spark DataFrame UDF: I am trying to extend the Spark ML pipeline model with a filter transformer.

abstract class RuleFilter[IN, T <: RuleFilter[IN, T]] 
    extends RuleTransformer with HasInputCol { 
    // def filterFuntion: String 
    /** @group setParam */ 
    def setInputCol(value: String): T = set(inputCol, value).asInstanceOf[T] 

    protected def createFilterFunc: IN => Boolean 

    override def transform(df: DataFrame): DataFrame = { 
        transformSchema(df.schema, logging = true) 
        val transformUDF = udf[Boolean, IN](this.createFilterFunc) 
        df.filter(transformUDF(df($(inputCol)))) 
    } 
} 

This code fails to compile with the following error:

No TypeTag available for IN 
[error]  val transformUDF = udf[Boolean, IN](this.createFilterFunc) 

How can I make this work?

I need it to work with an explicitly defined type in the inheriting classes, for example:

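// `val` is a reserved word in Scala, and RuleFilter needs both type arguments.
class PriceFilter extends RuleFilter[Double, PriceFilter] { 
    protected def createFilterFunc: Double => Boolean = _ > 500 
} 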

Answers

1

You need to tell the compiler explicitly that you want a TypeTag for the In type:

import scala.reflect.runtime.universe._ 
abstract class RuleFilter[In: TypeTag, T <: RuleFilter[In, T]] 
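
To see why the context bound helps, here is a minimal sketch that runs without Spark, assuming Scala 2 with scala-reflect on the classpath. `describeUdf` is a hypothetical stand-in that only mirrors the shape of Spark's `udf[RT: TypeTag, A1: TypeTag]`, and this `RuleFilter` drops the `RuleTransformer` parent from the question, so treat it as an illustration rather than the real transformer:

```scala
import scala.reflect.runtime.universe._

object TypeTagSketch {
  // Hypothetical stand-in for Spark's udf[RT, A1]: like the real function, it
  // requires a TypeTag for both the result type and the argument type.
  def describeUdf[RT: TypeTag, A1: TypeTag](f: A1 => RT): String =
    s"${typeOf[A1]} => ${typeOf[RT]}"

  // The context bound `In: TypeTag` becomes an implicit constructor parameter,
  // so the tag supplied by each concrete subclass is in scope inside the class.
  // Without the bound, the call below fails with "No TypeTag available for In".
  abstract class RuleFilter[In: TypeTag, T <: RuleFilter[In, T]] {
    protected def createFilterFunc: In => Boolean
    def signature: String = describeUdf[Boolean, In](createFilterFunc)
  }

  // Hypothetical subclass: In is fixed to Double here, so the compiler can
  // materialize TypeTag[Double] when the parent constructor is called.
  class PriceFilter extends RuleFilter[Double, PriceFilter] {
    protected def createFilterFunc: Double => Boolean = _ > 500
  }

  def main(args: Array[String]): Unit =
    println(new PriceFilter().signature)
}
```

The tag cannot be produced inside the abstract class because `In` is still unknown there; it has to be passed in from the place where the type becomes concrete, which is what the context bound arranges.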
+0

https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/Transformer.scala#L82, but this UnaryTransformer works fine without an explicit TypeTag. How does that work? – tintin

+0

What is udf there? What does it do with the type? –