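Sorry, I'm new to recommender systems, but I wrote a few lines of code with the Apache Mahout library. My dataset is very small: a 500x100 matrix with 8102 known cells. The RMSE of my recommender system comes out too small.
My dataset is actually a subset of the Yelp dataset from the "Yelp Business Rating Prediction" competition. I took only the 100 top-rated restaurants and then the 500 most active customers.
I created an SVDRecommender and then evaluated the RMSE. The result was about 0.4. Why is it so small? Maybe I'm just misunderstanding something. My dataset is not very sparse, but when I then tried a larger, sparser dataset, the RMSE became even smaller (about 0.18)! Can anyone explain this behavior?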
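import java.io.File;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.RMSRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.svd.Factorization;
import org.apache.mahout.cf.taste.impl.recommender.svd.RatingSGDFactorizer;
import org.apache.mahout.cf.taste.impl.recommender.svd.SVDRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;

DataModel model = new FileDataModel(new File("datamf.csv"));
// 20 latent features, 200 SGD iterations
final RatingSGDFactorizer factorizer = new RatingSGDFactorizer(model, 20, 200);
final Factorization f = factorizer.factorize();
RecommenderBuilder builder = new RecommenderBuilder() {
    @Override
    public Recommender buildRecommender(DataModel model) throws TasteException {
        // build here whatever existing or customized recommendation algorithm
        // (this reuses the factorizer created above)
        return new SVDRecommender(model, factorizer);
    }
};
RecommenderEvaluator evaluator = new RMSRecommenderEvaluator();
// 0.6 = training percentage, 1.0 = evaluation percentage
double score = evaluator.evaluate(builder, null, model, 0.6, 1.0);
System.out.println(score);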
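
To be clear about what I mean by RMSE: as far as I understand, it is the square root of the mean squared difference between predicted and actual ratings. Below is a minimal standalone sketch of that formula so the numbers above can be read on the rating scale. It is not Mahout's implementation; the class name `RmseSketch`, the method `rmse`, and the sample ratings are all made up for illustration.

```java
// Hypothetical helper for illustration only; this is not Mahout's evaluator.
final class RmseSketch {

    // Root-mean-square error between predicted and actual ratings.
    static double rmse(double[] predicted, double[] actual) {
        if (predicted.length != actual.length || predicted.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sumSquaredError = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double diff = predicted[i] - actual[i];
            sumSquaredError += diff * diff;
        }
        return Math.sqrt(sumSquaredError / predicted.length);
    }

    public static void main(String[] args) {
        // Made-up ratings, not taken from the Yelp data.
        double[] actual = {4.0, 5.0, 3.0, 4.0};
        double[] predicted = {4.5, 4.5, 3.5, 3.5};
        // Every prediction is off by exactly 0.5, so this prints 0.5.
        System.out.println(rmse(predicted, actual));
    }
}
```

So if this reading is right, an RMSE of about 0.4 would mean predictions are typically off by less than half a rating point, and 0.18 by far less than that, which is what looks suspiciously good to me.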