Python: extracting the sentence that contains a specific word

dr. goldberg offers everything.parking is good.he's nice and easy to talk

How can I extract the sentence that contains the keyword "parking"? I don't need the other two sentences.

I tried this:

with open("test_data.json") as f:
    for line in f:
        if "parking" in line:
            print(line)

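It prints the whole text rather than the specific sentence.

I even tried using a regular expression:

import re

f = open("test_data.json")
for line in f:
    line = line.rstrip()
    if re.search('parking', line):
        print(line)

Even this shows the same result.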

Source

2014-11-22 dipit malhotra

When you iterate over a file pointer with for line in f, it does not stop at the end of a sentence. It keeps reading until it sees "\n". – Myjab 2014-11-22 07:00:51

Use a simple regular expression. Use the pattern dmitry_romanov mentioned, or you could even try re.search(".*\.(.*parking.*\.)", a).group(1) – Myjab 2014-11-22 07:22:14
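The first comment's point can be checked without the real file. The following is a minimal sketch, assuming the file holds the three sentences on one physical line; `fake_file` is a hypothetical in-memory stand-in for test_data.json, and splitting on "." is only one simple way to get at the individual sentences.

```python
import io

# Hypothetical stand-in for the file: one physical line with three sentences.
fake_file = io.StringIO(
    "dr. goldberg offers everything.parking is good.he's nice and easy to talk\n"
)

matches = []
for line in fake_file:
    # `line` is the whole physical line (everything up to "\n"), so testing
    # it directly only says the keyword occurs somewhere in that line.
    for piece in line.rstrip("\n").split("."):
        # Test each dot-delimited piece on its own instead.
        if "parking" in piece:
            matches.append(piece.strip())

print(matches)
```

Note that splitting on "." also cuts "dr." off from the rest of its sentence, which is the abbreviation problem discussed in the answers.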

You can use the standard library re module:

import re

line = "dr. goldberg offers everything.parking is good.he's nice and easy to talk"
# Optional leading dot, then non-dots, the keyword, and more non-dots
res = re.search(r"\.?([^\.]*parking[^\.]*)", line)
if res is not None:
    print(res.group(1))

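This prints parking is good.

The idea is simple: the pattern starts at an optional dot character ., then consumes all non-dot characters, the word parking, and the remaining non-dot characters.

The question mark handles the case where your sentence is at the start of the line.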
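If more than one sentence may contain the keyword, the same idea can be used with re.findall instead of re.search. This is a sketch rather than part of the original answer; the helper name `sentences_with` is made up for illustration, and the optional leading dot is dropped because findall returns the whole match here.

```python
import re

def sentences_with(text, word):
    """Return every dot-delimited chunk of `text` that contains `word`."""
    # Non-dots, the (escaped) keyword, then more non-dots.
    pattern = r"[^.]*" + re.escape(word) + r"[^.]*"
    return [chunk.strip() for chunk in re.findall(pattern, text)]

line = "dr. goldberg offers everything.parking is good.he's nice and easy to talk"
print(sentences_with(line, "parking"))
print(sentences_with(line + ".no parking on sundays", "parking"))
```

re.escape keeps the keyword literal, so a word containing characters such as "." or "+" would not be read as part of the pattern.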

Source

2014-11-22 07:10:52

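But this breaks on abbreviations, like the one in the first sentence of the input. – tripleee 2014-11-22 07:39:14

@tripleee, I am afraid the syntax carries no meaning. The dot '.' in 'dr.' is the same as the one at the end of any sentence. If someone needs a solution that can read like a human, he/she has to either write fragile regular expressions or train a neural network. Both are overkill, IMHO. Maybe 'dr' is 'delta r', as in a physics book, who knows? My solution will handle commas and the like; termination with !, ? and so on is easy to add. – 2014-11-22 17:41:02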

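For a question tagged [tag:nltk], I would hope for and expect a solution that copes with at least the basics of actual human language. Yes, this is context-dependent, so context-free tools such as regular expressions are inherently insufficient. – tripleee 2014-11-22 19:05:26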
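The thread mentions that terminators such as ! and ? are easy to add to the regex approach. One way this might look is sketched below; the names `TERMINATORS` and `find_sentence` are hypothetical, and the sketch still treats the dot in "dr." as a sentence end, so it does not address the abbreviation objection.

```python
import re

TERMINATORS = ".!?"  # assumed set of sentence-ending characters

def find_sentence(text, word, terminators=TERMINATORS):
    """Return the first chunk containing `word`, or None if there is none."""
    # Character class matching anything that is NOT a terminator.
    not_end = "[^" + re.escape(terminators) + "]*"
    match = re.search(not_end + re.escape(word) + not_end, text)
    return match.group(0).strip() if match else None

print(find_sentence("is it free?parking is good!he's nice", "parking"))
```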

How about parsing the string and looking at the values?

import json

def sen_or_none(string):
    # Return the chunk if it mentions "parking" (case-insensitive), else None
    return string if "parking" in string.lower() else None

def walk(node):
    # Depth-first search through the parsed JSON; returns the first match
    if isinstance(node, list):
        for item in node:
            v = walk(item)
            if v:
                return v
    elif isinstance(node, dict):
        for key, item in node.items():
            v = walk(item)
            if v:
                return v
    elif isinstance(node, str):  # basestring on Python 2
        for item in node.split("."):
            v = sen_or_none(item)
            if v:
                return v
    return None

with open('data.json') as data_file:
    print(walk(json.load(data_file)))
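walk stops at the first match. If every matching sentence is wanted, a generator variant is one option. This is a sketch under assumptions: `iter_matches` is a made-up name, and the JSON layout with "reviews" and "text" keys is hypothetical, since the question does not show the actual structure of the file.

```python
import json

def iter_matches(node, word="parking"):
    """Yield every dot-delimited chunk containing `word`, at any depth."""
    if isinstance(node, dict):
        # Only the values are searched, not the keys.
        for value in node.values():
            yield from iter_matches(value, word)
    elif isinstance(node, list):
        for item in node:
            yield from iter_matches(item, word)
    elif isinstance(node, str):
        for chunk in node.split("."):
            if word in chunk.lower():
                yield chunk.strip()

# Hypothetical data; the real file layout is not given in the question.
sample = json.loads(
    '{"reviews": [{"text": "dr. goldberg offers everything.parking is good."},'
    ' {"text": "Parking was full.staff is friendly"}]}'
)
print(list(iter_matches(sample)))
```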

Source

2014-11-22 07:12:43 abeaamase

You can use nltk.tokenize:

from nltk.tokenize import sent_tokenize
from nltk.tokenize import word_tokenize

with open("test_data.json") as f:
    text = f.read()
sentences = sent_tokenize(text)
# Keep every sentence in which the word appears as a token
my_sentence = [sent for sent in sentences if 'parking' in word_tokenize(sent)]

As a more complete approach, you can wrap this in a function:

>>> def sentence_finder(text, word):
...     sentences = sent_tokenize(text)
...     return [sent for sent in sentences if word in word_tokenize(sent)]

>>> s="dr. goldberg offers everything. parking is good. he's nice and easy to talk" 
>>> sentence_finder(s,'parking') 
['parking is good.']
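One detail of the membership test above is that word in word_tokenize(sent) compares against whole tokens, whereas a plain substring check also fires inside longer words. The sketch below illustrates that difference with the standard library only; `contains_word` is a hypothetical helper that approximates token matching with regex word boundaries and is not a substitute for NLTK's tokenizer.

```python
import re

def contains_word(sentence, word):
    """True if `word` occurs as a whole word (regex \\b boundaries)."""
    return re.search(r"\b" + re.escape(word) + r"\b", sentence) is not None

# A substring test also matches inside a longer word.
print("parking" in "no carparking here")
# The whole-word test does not.
print(contains_word("no carparking here", "parking"))
print(contains_word("parking is good.", "parking"))
```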

Source

2014-11-22 07:13:03 Kasramvd
