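Tags: association rules, frequent pattern mining
I want to extract association rules from a set of transactions with the following Spark (Scala) code: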
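import org.apache.spark.mllib.fpm.FPGrowth

// transactions is an RDD of item arrays (one array per transaction);
// minSupport and minConfidence are defined elsewhere.
val fpg = new FPGrowth().setMinSupport(minSupport).setNumPartitions(10)
val model = fpg.run(transactions)
model.generateAssociationRules(minConfidence).collect()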
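However, there are more than 10K products, so extracting rules for all combinations is computationally expensive, and I do not need them anyway. I only want to extract pairwise rules: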
Product 1 ==> Product 2
Product 1 ==> Product 3
Product 3 ==> Product 1
I do not care about other combinations, such as:
[Product 1] ==> [Product 2, Product 3]
[Product 3,Product 1] ==> Product 2
Is there a way to do this?
Thanks, Amir
By the way, I am doing this in Spark with Scala. – Amir
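One possible direction, as a rough sketch rather than a confirmed answer: since only single-item to single-item rules are wanted, the pair counts can be computed directly instead of mining every frequent itemset. The code below uses plain Scala collections so it runs without a cluster. `PairRule` and `PairRules.mine` are hypothetical names invented for this illustration and are not part of Spark. It assumes each transaction is a set of distinct product names and that the minimum count is the ceiling of `minSupport` times the number of transactions.

```scala
// Hypothetical helper (not a Spark API): mines only rules of the form A ==> B.
final case class PairRule(antecedent: String, consequent: String, confidence: Double)

object PairRules {
  def mine(
      transactions: Seq[Set[String]],
      minSupport: Double,
      minConfidence: Double
  ): Seq[PairRule] = {
    // Assumption: a pair is frequent if it occurs in at least this many transactions.
    val minCount = math.ceil(minSupport * transactions.size).toLong

    // Number of transactions containing each item.
    val itemCounts: Map[String, Long] =
      transactions
        .flatMap(_.toSeq)
        .groupBy(identity)
        .map { case (item, occurrences) => item -> occurrences.size.toLong }

    // Number of transactions containing each unordered pair, stored as (a, b) with a < b.
    val pairCounts: Map[(String, String), Long] =
      transactions
        .flatMap { t =>
          val items = t.toSeq.sorted
          for (i <- items.indices; j <- (i + 1) until items.size)
            yield (items(i), items(j))
        }
        .groupBy(identity)
        .map { case (pair, occurrences) => pair -> occurrences.size.toLong }

    // Emit both directions of every frequent pair, then keep the confident rules.
    // confidence(A ==> B) = count(A and B) / count(A)
    pairCounts.toSeq
      .filter { case (_, n) => n >= minCount }
      .flatMap { case ((a, b), n) =>
        Seq(
          PairRule(a, b, n.toDouble / itemCounts(a)),
          PairRule(b, a, n.toDouble / itemCounts(b))
        )
      }
      .filter(_.confidence >= minConfidence)
  }
}
```

For example, `PairRules.mine(transactions, 0.5, 0.9)` returns only rules whose antecedent and consequent are each a single product. On an RDD, the same counting steps could presumably be expressed with `flatMap` and `reduceByKey`, though I have not benchmarked that at 10K products. Note that every transaction emits a number of pairs quadratic in its size, so very long transactions may need special handling.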