Python: search for an exact word/phrase in a text file - beginner

I am currently trying to search for an exact word/phrase in a text file. I am using Python 3.4.

Here is my code so far.

import re

def main():
    fileName = input("Please input the file name").lower()
    term = input("Please enter the search term").lower()

    fileName = fileName + ".txt"

    regex_search(fileName, term)

def regex_search(file, term):
    source = open(file, 'r')
    destination = open("new.txt", 'w')
    lines = []
    for line in source:
        if re.search(term, line):
            lines.append(line)

    for line in lines:
        destination.write(line)
    source.close()
    destination.close()
'''
def search(file, term): #This function doesn't work
    source = open(file, 'r')
    destination = open("new.txt", 'w')
    lines = [line for line in source if term in line.split()]

    for line in lines:
        destination.write(line)
    source.close()
    destination.close()'''
main()

In my function regex_search I use a regular expression to search for a particular string. However, I don't know how to search for a specific phrase.

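In the second function, search, I split each line into a list and look for the word there. However, this cannot find a phrase, because I am effectively searching for "dog walked" in ['the', 'dog', 'walked'], which never returns the correct line.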
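The mismatch described above can be reproduced in a few lines. This is only a minimal sketch: the sample line is made up for illustration, and the helper name contains_phrase is hypothetical, not part of the original code. It compares consecutive slices of the split line against the split phrase instead of testing membership of the whole phrase.

```python
line = "the dog walked home\n"
words = line.split()

# A single word is found because it equals one element of the list.
print("dog" in words)         # True
# A phrase never equals a single element, so membership is always False.
print("dog walked" in words)  # False

def contains_phrase(line, phrase):
    """Return True if the words of `phrase` appear consecutively in `line`."""
    words = line.split()
    target = phrase.split()
    n = len(target)
    if n == 0:
        return False
    # Slide a window of n words across the line and compare it to the phrase.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

print(contains_phrase(line, "dog walked"))  # True
```

Because this approach splits only on whitespace, punctuation stays attached to the words, so "dog walked." would not match "dog walked". That limitation is one reason a regular expression may be the better fit here.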

Source

2014-12-03 Kai Mou

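If you search for "foo" and the text has "foobar", is that considered a match? If you search for "foo bar", and one line ends with "foo" and the next line begins with "bar", is that considered a match? – 2014-12-03 23:01:46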

Can you provide an example of an input file (or its contents) and the phrase you are interested in? – Marcin 2014-12-03 23:26:45

@Brian Oakley no – 2014-12-03 23:47:26

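Edit: Given that you don't want to match partial words ('foo' should not match 'foobar'), you need to look ahead in the data stream. Doing that by hand is a bit awkward, so I think a regular expression (a revision of your current regex_search) is the way to go: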

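import re

def regex_search(filename, term):
    # Accept the term only when it is followed by a character that is
    # neither a word character nor a hyphen, or by the end of the line.
    searcher = re.compile(term + r'([^\w-]|$)').search
    with open(filename, 'r') as source, open("new.txt", 'w') as destination:
        for line in source:
            if searcher(line):
                destination.write(line)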
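To see what this pattern accepts and rejects, here is a small check. It reuses the same regular expression as the function above; the sample lines are invented for illustration.

```python
import re

term = "no"
# Same pattern as in regex_search: the term followed by a character that is
# neither a word character nor a hyphen, or by the end of the line.
searcher = re.compile(term + r'([^\w-]|$)').search

samples = [
    "I said no.\n",
    "just say no\n",
    "not a chance\n",
    "no-one came\n",
    "piano lesson\n",
]
results = {s: bool(searcher(s)) for s in samples}
for sample, matched in results.items():
    print(repr(sample), matched)
```

The first two lines match, "not a chance" and "no-one came" do not, and "piano lesson" does match because the pattern only checks what follows the term, not what precedes it. Note also that the term is inserted into the pattern unescaped, so characters such as "." or "(" in the search term would be interpreted as regex syntax.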

Source

2014-12-03 23:30:28 tdelaney

So in this case, what happens when I search for "no" and the line contains "not"? Wouldn't it return the line with "not" rather than "no"? – 2014-12-03 23:46:51

'no' would match 'not' - the same as in your 'regex_search' example. If that is not what you want, please let us know. – tdelaney 2014-12-03 23:50:20

I am looking for "no" to match only "no". The same goes for phrases. – 2014-12-04 00:40:42
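One common way to get the behaviour asked for here (a whole word or phrase only, never part of a longer word) is to guard both sides of the term with lookarounds. The following is a sketch rather than the answerer's code: the names make_exact_matcher and filter_lines are hypothetical, and the case-insensitive flag is an assumption based on the question lowercasing the search term.

```python
import re

def make_exact_matcher(term):
    """Build a search function that matches `term` only as a whole word/phrase."""
    # re.escape keeps characters such as '.' or '(' in the term literal.
    # (?<!\w) and (?!\w) require that no word character touches either side.
    pattern = r'(?<!\w)' + re.escape(term) + r'(?!\w)'
    # Assumption: matching should ignore case.
    return re.compile(pattern, re.IGNORECASE).search

def filter_lines(lines, term):
    """Return the lines that contain `term` as an exact word or phrase."""
    matches = make_exact_matcher(term)
    return [line for line in lines if matches(line)]

sample = [
    "no\n",
    "not today\n",
    "I said No.\n",
    "piano\n",
    "the dog walked home\n",
    "the dogs walked\n",
]
print(filter_lines(sample, "no"))
print(filter_lines(sample, "dog walked"))
```

Since a file object yields its lines when iterated, filter_lines could be given an open file in place of the list, with the result written to new.txt as in the original functions.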
