Comparing a string against a list of fragments in Python

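I have a long list of job titles that I need to filter by how important they are to the organization. I've developed a simple heuristic for this. For example, if a title contains a word like "administrator" or "director", it is important. If it fails that test but contains a word like "deputy" or "assistant", then it is not important.

This is easy to do in a few lines of Python, but I'm wondering whether there is a more Pythonic way to do it. Here is where I am now.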

def in_fragment(phrase, fragments):
    for fragment in fragments:
        if fragment in phrase:
            return True
    return False

It works fine, but I'd like to do it the right way if possible! Thanks.


2012-12-03 Chris Wilson

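Couldn't you use sets (or convert your lists to sets)? Your solution is fine, it's just that with sets it would be "cleaner" – BorrajaX

One way to do this would be to use any: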

def in_fragment(phrase, fragments): 
    return any(x in phrase for x in fragments)
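
As a rough illustration of how this could plug into the two-step heuristic from the question, here is a minimal sketch. The classify helper, the word lists and the "unknown" label are assumptions made up for the example, not part of the original answer.

```python
def in_fragment(phrase, fragments):
    # True as soon as any fragment occurs as a substring of phrase
    return any(x in phrase for x in fragments)

# Hypothetical word lists, for illustration only
IMPORTANT = ["administrator", "director"]
UNIMPORTANT = ["deputy", "assistant"]

def classify(title):
    # Lower-case once so the substring tests ignore case
    title = title.lower()
    if in_fragment(title, IMPORTANT):
        return "important"
    if in_fragment(title, UNIMPORTANT):
        return "unimportant"
    return "unknown"

print(classify("Director of Finance"))       # important
print(classify("Assistant to the Manager"))  # unimportant
print(classify("Software Engineer"))         # unknown
```

Because the important words are checked first, a title such as "Assistant Director" comes out as important, which matches the order of the tests described in the question.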


2012-12-03 22:02:25

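+1, but it might be better to turn it around so that each word in the phrase is looked up in 'fragments', because then you can trivially change the 'fragments' list into a set, and it will be 'O(N)' (where N is the number of words in the phrase) instead of 'O(NM)' (where M is the number of fragments). – abarnert

Perfect, thanks! (The other solutions offered are also very good and informative, so thanks to everyone.) –

Hmm... FC's answer is probably cleaner than what I was about to write, but since I've tested it on my computer with sets, here it goes: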
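
A minimal sketch of what the comment above seems to suggest, assuming the phrase is split on whitespace and that fragments is a set. The name has_fragment_word is hypothetical. Note that this matches whole words only, so it is not a drop-in replacement for the substring test.

```python
def has_fragment_word(phrase, fragments):
    # fragments should be a set so each membership test is O(1) on average
    return any(word in fragments for word in phrase.split())

fragments = {"administrator", "director"}

print(has_fragment_word("managing director", fragments))    # True
# Whole-word matching: "directorate" is not the word "director"
print(has_fragment_word("finance directorate", fragments))  # False
# The substring version from the answer reports a match here
print(any(x in "finance directorate" for x in fragments))   # True
```

Which behaviour is wanted depends on the data: substring matching catches variants like "directors", while whole-word matching avoids accidental hits inside longer words.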

#!/usr/bin/env python

a = "this is a letter for the administrator of the company"
important = set(["administrator", "director"])

# Words that appear both in the text and in the set of important words
hits = important.intersection(set(a.split(" ")))
if len(hits) > 0:
    print("Wo! This is important. Found: %s" % (hits))

Maybe you'll find it useful... or something... :)
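
One caveat with splitting on spaces is that "Administrator," (capitalised, with a trailing comma) would not equal "administrator". Below is a small sketch of one way to normalise the text before intersecting. The important_words helper and the ASCII-only pattern are assumptions for illustration, not part of the answer above.

```python
import re

def important_words(text, important):
    # Lower-case and keep only runs of ASCII letters, so "Director," -> "director"
    words = set(re.findall(r"[a-z]+", text.lower()))
    # Set intersection: words present in both the text and the important set
    return words & important

important = {"administrator", "director"}

hits = important_words("A letter for the Administrator, not the Director.", important)
print(sorted(hits))  # ['administrator', 'director']
```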


2012-12-03 22:06:34 BorrajaX

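You'd probably be better off writing 'hits = important.intersection(set(a.split(" ")))' so that you can do 'len(hits)' and '... %s" % (hits)', instead of doing all the work twice in a row. (I'm not worried about performance, unless he has huge sets, but about readability and maintainability: it's too easy to change one of the two expressions later and not the other.) – abarnert

Yes, you're right, @abanert... this was mainly meant as an example, which is why I didn't pay much attention to that, but you are absolutely right (I've edited the answer with your input) – BorrajaX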

import collections

def rankImportance(titles, fragments):
    """titles is a list of job titles
    fragments is a list of sets.
    At index 0: set(['administrator', 'director'])
    At index 1: set(['deputy', 'assistant'])
    etc..."""

    answer = collections.defaultdict(list)
    for title in titles:
        for r, words in enumerate(fragments):
            if any(word in title for word in words):
                # File the title under its first (most important) matching rank
                answer[r].append(title)
                break

    return answer

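This should return a dict whose keys are the ranks and whose values are lists of job titles. The smaller the rank value, the more important the title. The smallest rank will be 0.

Hope this helps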
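
To make the shape of the returned dict concrete, here is a small self-contained sketch. rank_titles is a hypothetical compact variant that files each title under its first matching rank only, and the sample titles are made up for the example.

```python
import collections

def rank_titles(titles, fragments):
    # Hypothetical variant: each title goes under the first rank whose words match
    answer = collections.defaultdict(list)
    for title in titles:
        for rank, words in enumerate(fragments):
            if any(word in title for word in words):
                answer[rank].append(title)
                break  # stop at the most important matching rank
    return answer

fragments = [{"administrator", "director"}, {"deputy", "assistant"}]
titles = ["director of sales", "assistant director", "deputy clerk", "janitor"]

ranked = rank_titles(titles, fragments)
for rank in sorted(ranked):
    print(rank, ranked[rank])
# 0 ['director of sales', 'assistant director']
# 1 ['deputy clerk']
```

Titles that match no fragment at all, such as "janitor" here, simply do not appear in the result.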


2012-12-03 22:07:03 inspectorG4dget
