2014-09-03 55 views

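Spark MLlib - Collaborative Filtering Implicit Feed

I'm building an implicit-feedback recommendation model with Spark 1.0.0, and I'm trying to follow the example on their collaborative filtering page: http://spark.apache.org/docs/latest/mllib-collaborative-filtering.html#explicit-vs-implicit-feedback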

I have even loaded the same test data set that they reference in the example: http://codesearch.ruethschilling.info/xref/apache-foundation/spark/mllib/data/als/test.data

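However, when I try to run the implicit feedback model:

val alpha = 0.01
val model = ALS.trainImplicit(ratings, rank, numIterations, alpha)

(where ratings is exactly the ratings RDD built from their data set, rank = 10, and numIterations = 20), I get the following error: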

scala> val model = ALS.trainImplicit(ratings, rank, numIterations, alpha) 
<console>:26: error: overloaded method value trainImplicit with alternatives: 
(ratings: org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating],rank: Int,iterations: Int)org.apache.spark.mllib.recommendation.MatrixFactorizationModel <and> 
(ratings: org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating],rank: Int,iterations: Int,lambda: Double,alpha: Double)org.apache.spark.mllib.recommendation.MatrixFactorizationModel <and> 
(ratings: org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating],rank: Int,iterations: Int,lambda: Double,blocks: Int,alpha: Double)org.apache.spark.mllib.recommendation.MatrixFactorizationModel <and> 
(ratings: org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating],rank: Int,iterations: Int,lambda: Double,blocks: Int,alpha: Double,seed: Long)org.apache.spark.mllib.recommendation.MatrixFactorizationModel 
cannot be applied to (org.apache.spark.rdd.RDD[org.apache.spark.mllib.recommendation.Rating], Int, Int, Double) 
val model = ALS.trainImplicit(ratings, rank, numIterations, alpha) 

Interestingly, the model runs just fine when I don't use trainImplicit (i.e. when I call ALS.train instead).

Answer


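The example seems to be out of sync with the implementation: there is no overload of trainImplicit that takes four parameters, which is exactly what the error message is telling you. However, if you look at the Scala source code for ALS, you'll see that the three-parameter overload is implemented in terms of the six-parameter overload, using a few "magic numbers":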

def trainImplicit(ratings: RDD[Rating], rank: Int, iterations: Int) 
    : MatrixFactorizationModel = { 
    trainImplicit(ratings, rank, iterations, 0.01, -1, 1.0) 
} 

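This suggests that 0.01 is a decent default value for lambda. (It may be worth checking that with someone who knows more about ML.) In the six-parameter call above, the arguments after iterations are lambda = 0.01, blocks = -1, and alpha = 1.0, so that should give you enough information to call the five- or six-parameter overload sensibly. (Of course, if you know enough to pick better values, that's great!)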

For example, either of these should compile:

val model = ALS.trainImplicit(ratings, rank, numIterations, 0.01, alpha) 

val model = ALS.trainImplicit(ratings, rank, numIterations, 0.01, -1, alpha) 
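
If you would rather keep the four-argument call shape from the docs, one option is a small wrapper of your own. This is only a sketch: ImplicitAls, its train method, and DefaultLambda are hypothetical names that are not part of MLlib, and the 0.01 default is simply the value copied from the three-parameter overload shown above. The only MLlib call it makes is the five-parameter trainImplicit overload listed in your error message.

```scala
import org.apache.spark.mllib.recommendation.{ALS, MatrixFactorizationModel, Rating}
import org.apache.spark.rdd.RDD

// Hypothetical helper, not part of MLlib.
object ImplicitAls {
  // Assumption: reuse the lambda that the three-parameter overload passes internally.
  val DefaultLambda = 0.01

  // Takes alpha as the fourth argument, with lambda as an optional fifth.
  def train(
      ratings: RDD[Rating],
      rank: Int,
      iterations: Int,
      alpha: Double,
      lambda: Double = DefaultLambda): MatrixFactorizationModel = {
    // Five-parameter overload: (ratings, rank, iterations, lambda, alpha).
    // Note that lambda comes before alpha here.
    ALS.trainImplicit(ratings, rank, iterations, lambda, alpha)
  }
}
```

With that in scope, the original call becomes ImplicitAls.train(ratings, rank, numIterations, alpha), and you can still override lambda by passing a fifth argument.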

Finally, in case you weren't aware of it, there is fairly good API documentation for ALS.
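
Another way to sidestep the positional overloads is the setter-style API on the ALS class, where each parameter is named at the call site, so lambda and alpha are hard to mix up. This is a sketch rather than a tested recipe for 1.0.0: AlsSetterExample and trainImplicitWithSetters are hypothetical names, the 0.01 for lambda is again just the default discussed above, and you should confirm in the API documentation for your version that these setters are available.

```scala
import org.apache.spark.mllib.recommendation.{ALS, MatrixFactorizationModel, Rating}
import org.apache.spark.rdd.RDD

// Hypothetical wrapper object, used only to keep the example self-contained.
object AlsSetterExample {
  def trainImplicitWithSetters(
      ratings: RDD[Rating],
      rank: Int,
      numIterations: Int,
      alpha: Double): MatrixFactorizationModel = {
    new ALS()
      .setImplicitPrefs(true)       // switch from explicit to implicit feedback
      .setRank(rank)                // number of latent factors
      .setIterations(numIterations) // number of ALS iterations
      .setLambda(0.01)              // assumption: same default as the three-parameter overload
      .setAlpha(alpha)              // confidence scaling for implicit feedback
      .run(ratings)
  }
}
```

Parameters you do not set (such as the number of blocks) keep whatever defaults the class defines.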


Perfect, the "magic number" approach seems to work just fine! Thank you so much for your help!! – atellez 2014-09-03 20:18:52


Yes, 0.01 is a good default value for lambda. – 2014-09-03 20:31:00