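How do I perform this sort in PySpark?
I tried map, mapValues, and sorted, but nothing works. The problem statement is: "Sort by similarity (the second element of each value); if two similarities are equal, choose the user with the smallest ID (the first element of each value)." The list of key-value pairs is: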
[
(18, [(2, 0.5)]),
(30, [(19, 0.5), (6, 0.25)]),
(6, [(30, 0.25), (20, 0.2), (19, 0.2)]),
(19, [(30, 0.5), (8, 0.2), (6, 0.2)]),
(2, [(18, 0.5)]),
(26, [(9, 0.2)]),
(9, [(26, 0.2)])
]
The result I want is:
[
(18, [(2, 0.5)]),
(30, [(19, 0.5), (6, 0.25)]),
(6, [(30, 0.25), (19, 0.2)]),
(19, [(30, 0.5), (6, 0.2)]),
(2, [(18, 0.5)]),
(26, [(9, 0.2)]),
(9, [(26, 0.2)])
]
Thank you very much!
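One possible reading of the rule, sketched in plain Python: sort each value list by similarity in descending order, break ties by ascending user ID, and keep only the smallest-ID user for each distinct similarity. The helper name `pick_neighbors` is hypothetical, and the sketch assumes tied similarities are exactly equal floats. This reproduces the desired output for the sample data, but a "keep the top 2 after sorting" rule would also match it, so the interpretation is an assumption.

```python
from itertools import groupby

def pick_neighbors(pairs):
    """Hypothetical helper: order (user_id, similarity) pairs by descending
    similarity and keep only the smallest user ID for each similarity."""
    # Sort by similarity descending, then by user ID ascending.
    ordered = sorted(pairs, key=lambda p: (-p[1], p[0]))
    # groupby clusters adjacent pairs with equal similarity; the first pair
    # in each cluster has the smallest ID because of the sort above.
    return [next(group) for _, group in groupby(ordered, key=lambda p: p[1])]

data = [
    (18, [(2, 0.5)]),
    (30, [(19, 0.5), (6, 0.25)]),
    (6, [(30, 0.25), (20, 0.2), (19, 0.2)]),
    (19, [(30, 0.5), (8, 0.2), (6, 0.2)]),
    (2, [(18, 0.5)]),
    (26, [(9, 0.2)]),
    (9, [(26, 0.2)]),
]

# Apply the helper to every value while leaving the keys untouched.
result = [(key, pick_neighbors(value)) for key, value in data]
```

Because the helper only touches the value of each pair, it should also work unchanged on an RDD, for example `sc.parallelize(data).mapValues(pick_neighbors).collect()`, assuming `sc` is an existing SparkContext.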