2011-10-09 21 views

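Adding to an int value using pre-existing dictionaries

I'm trying to set up a function that computes a similarity score for two films. The existing dictionaries use the film titles as keys, with the director, genre, or a lead actor as the values. There are three actor dictionaries (the three lead actors of each film are listed). The code mostly works, but sometimes I get a higher score than I should.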

# Create a two-argument function to determine the FavActor similarity score:
def FavActorFunction(film1, film2):

    # Set the result of the FavActor formula between two films to a default of 0.
    FavActorScore = 0
    # Add 3 to the similarity score if the films have the same director.
    if direct[film1] == direct[film2]:
        FavActorScore += 3
    # Add 2 to the similarity score if the films are in the same genre.
    if genre[film1] == genre[film2]:
        FavActorScore += 2
    # Add 5 to the similarity score for each actor they have in common.
    if actor1[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
        FavActorScore += 5
    if actor2[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
        FavActorScore += 5
    if actor3[film1] == actor1[film2] or actor2[film2] or actor3[film2]:
        FavActorScore += 5
    # Return the resulting score.
    return FavActorScore

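My assumption is that it counts something twice when tallying the actors the films have in common. Is there a way to modify this part of the code to get more accurate results?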

if actor1[film1] == actor1[film2] or actor2[film2] or actor3[film2]: 
    FavActorScore += 5 
if actor2[film1] == actor1[film2] or actor2[film2] or actor3[film2]: 
    FavActorScore += 5  
if actor3[film1] == actor1[film2] or actor2[film2] or actor3[film2]: 
    FavActorScore += 5  

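I really, *really* have to ask: where do these silly data structures come from? The films should be the dictionaries and the attributes should be the keys (and the actors should be in a sequence or a set).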
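
A minimal sketch of the layout this comment suggests, under stated assumptions: the film titles, names, and the function name `fav_actor_score` below are all hypothetical and are not taken from the question. Each film is a dictionary keyed by attribute, and its actors are stored in a set, so the shared actors are simply the set intersection and no actor can be counted more than once.

```python
# Hypothetical sample data: one dictionary per film, attributes as keys.
films = {
    "Film A": {"director": "Director X", "genre": "Drama",
               "actors": {"Actor 1", "Actor 2", "Actor 3"}},
    "Film B": {"director": "Director X", "genre": "Comedy",
               "actors": {"Actor 2", "Actor 3", "Actor 4"}},
    "Film C": {"director": "Director Y", "genre": "Drama",
               "actors": {"Actor 5", "Actor 6", "Actor 7"}},
}


def fav_actor_score(film1, film2):
    """Return the similarity score of two films (same weights as the question)."""
    a, b = films[film1], films[film2]
    score = 0
    # 3 points for a shared director.
    if a["director"] == b["director"]:
        score += 3
    # 2 points for a shared genre.
    if a["genre"] == b["genre"]:
        score += 2
    # 5 points per actor present in both films (set intersection).
    score += 5 * len(a["actors"] & b["actors"])
    return score


print(fav_actor_score("Film A", "Film B"))  # 3 + 0 + 10 = 13
print(fav_actor_score("Film A", "Film C"))  # 0 + 2 + 0 = 2
```

With this layout the order in which the actors are listed no longer matters, which is why a set fits better than three separate dictionaries.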

Answers

Try using an in condition:

if actor1[film1] in (actor1[film2], actor2[film2], actor3[film2]): 
    FavActorScore += 5 
if actor2[film1] in (actor1[film2], actor2[film2], actor3[film2]): 
    FavActorScore += 5 
if actor3[film1] in (actor1[film2], actor2[film2], actor3[film2]): 
    FavActorScore += 5 

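When you write a == b or c or d, the expression is true if a equals b, or if c is true, or if d is true. It does not mean "true if a equals b, or c, or d". Since a non-empty string always counts as true in a condition, each of the original tests passes whenever the second film has a second or third actor listed, whether or not any actor actually matches.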
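
The difference can be seen in a small sketch. The three dictionaries below are hypothetical stand-ins for the ones in the question, filled with made-up names; the two films share no actors, yet the original condition still evaluates as true.

```python
# Hypothetical stand-ins for the question's dictionaries (made-up names).
actor1 = {"Film A": "Actor 1", "Film B": "Actor 4"}
actor2 = {"Film A": "Actor 2", "Film B": "Actor 5"}
actor3 = {"Film A": "Actor 3", "Film B": "Actor 6"}

film1, film2 = "Film A", "Film B"

# Python parses this as:
#   (actor1[film1] == actor1[film2]) or actor2[film2] or actor3[film2]
# The comparison is False, so `or` returns the next truthy operand.
original = actor1[film1] == actor1[film2] or actor2[film2] or actor3[film2]
print(repr(original))  # 'Actor 5'
print(bool(original))  # True, so the score would be increased by 5

# The membership test compares actor1[film1] against each candidate in turn.
fixed = actor1[film1] in (actor1[film2], actor2[film2], actor3[film2])
print(fixed)  # False, no shared actor
```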