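Adding to an int value using pre-existing dictionaries

I am trying to set up a function that calculates a similarity score for two films. The existing dictionaries use the film as the key, with the director, the genre, or one of the lead actors as the value. There are three actor dictionaries (each film's three lead actors are all listed). The code mostly works, but sometimes I get a larger result than I should.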
# create a two-variable function to determine the FavActor similarity score:
def FavActorFunction(film1, film2):
    # set the result of the FavActor formula between two films to a default of 0.
    FavActorScore = 0
    # add 3 to the similarity score if the films have the same director.
    if direct[film1] == direct[film2]:
        FavActorScore += 3
    # add 2 to the similarity score if the films are in the same genre.
    if genre[film1] == genre[film2]:
        FavActorScore += 2
    # add 5 to the similarity score for each actor they have in common.
    if actor1[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
        FavActorScore += 5
    if actor2[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
        FavActorScore += 5
    if actor3[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
        FavActorScore += 5
    # return the resulting score.
    return FavActorScore
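My assumption is that it counts something twice when tallying the actors the two films have in common. Is there a way to modify this part of the code to get a more accurate result?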
if actor1[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
    FavActorScore += 5
if actor2[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
    FavActorScore += 5
if actor3[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
    FavActorScore += 5
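One likely cause, going only by the snippet above: Python reads `a == b or c or d` as `(a == b) or c or d`, and a non-empty string such as `actor2[film2]` is always truthy, so each of the three conditions passes for almost any pair of films and 5 is added every time. A minimal sketch of one possible fix is to compare the two casts as sets, which also avoids counting a shared actor more than once. The sample data and the name `fav_actor_score` are made up for illustration; only the dictionary names come from the question.

```python
# Hypothetical sample data, shaped like the dictionaries in the question
# (film title -> value). The real data is not shown in the post.
direct = {"Film A": "Director 1", "Film B": "Director 2"}
genre = {"Film A": "Drama", "Film B": "Drama"}
actor1 = {"Film A": "Actor X", "Film B": "Actor Y"}
actor2 = {"Film A": "Actor Y", "Film B": "Actor Z"}
actor3 = {"Film A": "Actor W", "Film B": "Actor Q"}


def fav_actor_score(film1, film2):
    """Return the similarity score between two films."""
    score = 0
    # 3 points for a shared director, 2 points for a shared genre.
    if direct[film1] == direct[film2]:
        score += 3
    if genre[film1] == genre[film2]:
        score += 2
    # Collect each film's cast into a set; the intersection holds
    # exactly the actors both films share, each counted once.
    cast1 = {actor1[film1], actor2[film1], actor3[film1]}
    cast2 = {actor1[film2], actor2[film2], actor3[film2]}
    score += 5 * len(cast1 & cast2)
    return score


print(fav_actor_score("Film A", "Film B"))
```

With this sample data the two films share a genre and one actor, so the score is 2 + 5. If you would rather keep the three `if` statements, writing each test as `actor1[film1] in (actor1[film2], actor2[film2], actor3[film2])` should behave the same way for films whose three actors are distinct.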
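I really, *really* have to ask: where do these silly data structures come from? The films should be dictionaries and the attributes should be keys (and the actors should be in a sequence or a set).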
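For what it's worth, here is a rough sketch of the layout this comment seems to describe, with one dictionary per film and the actors held in a set. The film titles, the key names (`"director"`, `"genre"`, `"actors"`) and the function name `similarity` are all assumptions for illustration, not something taken from the original code.

```python
# Hypothetical data: one record per film, attributes as keys,
# and the lead actors stored together in a set.
films = {
    "Film A": {
        "director": "Director 1",
        "genre": "Drama",
        "actors": {"Actor X", "Actor Y", "Actor W"},
    },
    "Film B": {
        "director": "Director 2",
        "genre": "Drama",
        "actors": {"Actor Y", "Actor Z", "Actor Q"},
    },
}


def similarity(title1, title2):
    """Score two films using the same weights as the question (3, 2, 5)."""
    a, b = films[title1], films[title2]
    score = 0
    if a["director"] == b["director"]:
        score += 3
    if a["genre"] == b["genre"]:
        score += 2
    # Set intersection gives the shared actors directly.
    score += 5 * len(a["actors"] & b["actors"])
    return score


print(similarity("Film A", "Film B"))
```

Keeping everything about a film in one place means a film with two or four listed actors needs no extra dictionary, and the shared-actor count becomes a single expression.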