Finding the context of words in corpus categories does not work

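I wrote this small script to find the context of the 10 most frequent words in my corpus, but it doesn't work and I don't know what I'm doing wrong. tien_frequentste(mijn_corpus) is defined and works on its own.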

import nltk

tienfrequentste = tien_frequentste(mijn_corpus)

def context(corpus, most_freq):
    # Print a concordance for every frequent word, one category at a time
    for category in corpus.categories():
        print "Context voor", category, ":"
        for word in most_freq:
            print nltk.Text(corpus.words(categories=category)).concordance(word)

UPDATE:
I get a traceback with error messages
for context(corpus, most_freq),
for category in corpus.categories(),
for self._init()
and in _init, followed by AttributeError: 'NoneType' object has no attribute 'group'.
I don't know what these errors mean.

Traceback (most recent call last):
  File "/Users/...document.py", line 92, in <module>
    context (mijn_corpus, tienfrequentste)
  File "/Users/...document.py", line 87, in context
    for category in corpus.categories():
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nltk/corpus/reader/api.py", line 317, in categories
    self._init()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nltk/corpus/reader/api.py", line 289, in _init
    category = re.match(self._pattern, file_id).group(1)
AttributeError: 'NoneType' object has no attribute 'group'


2013-01-24 JohnDoe

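It just doesn't work? Are you getting an error? The more information you provide, the easier it is to help.

@Gareth Webber I edited my question and added information about the error message. – JohnDoe

Why not just copy and paste the error here? Seeing the exact traceback would make the problem easier to understand. – pradyunsg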

Does your corpus have categories, and is most_freq a list of strings? The following example works:

import nltk
from nltk.corpus import reuters

for category in reuters.categories():
    print "context voor", category, " : "
    for word in ["get", "have", "do"]:
        print nltk.Text(reuters.words(categories=category)).concordance(word)


2013-01-27 23:24:31

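The error comes from the regular expression that assigns corpus files to categories. It is tripping over a file name that does not match the regex pattern, so re.match() returns None and the call to .group(1) fails. If you are using a standard NLTK corpus with categories, you must have put extra files in the corpus directory. If you are using your own corpus, it is misconfigured.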
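
One way to track down the offending file is to run the category pattern over the file names yourself. The sketch below is written for Python 3 and uses only the re module; the pattern and the file names are hypothetical placeholders, so substitute whatever pattern your reader was built with (for example, its cat_pattern argument) and the ids returned by your corpus's fileids().

```python
import re

def unmatched_fileids(cat_pattern, fileids):
    """Return the file ids that cat_pattern does not match at their start."""
    return [f for f in fileids if re.match(cat_pattern, f) is None]

# Hypothetical pattern and file names, for illustration only.
cat_pattern = r"(\w+)/.*\.txt"
fileids = ["news/a.txt", "sports/b.txt", "README"]

bad = unmatched_fileids(cat_pattern, fileids)
print(bad)  # the ids that would break category assignment

# Reproduce the failure from the traceback on one unmatched name:
# re.match() returns None, and None has no .group() method.
message = None
try:
    re.match(cat_pattern, "README").group(1)
except AttributeError as err:
    message = str(err)
print(message)
```

Any name that shows up in the list is a file to remove from the corpus directory, or a case the pattern needs to be widened to cover.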

By the way, concordance() prints its output itself and returns None. If you wrap the call in print, you will see a lot of None values in the output.
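
To see why the extra None lines appear, here is a small Python 3 sketch. show_matches is a hypothetical stand-in that I wrote to mimic the relevant behavior of concordance() (print the hits, return nothing); it is not NLTK code and does not reproduce NLTK's output format.

```python
def show_matches(word, tokens):
    """Hypothetical stand-in for concordance(): prints hits, returns nothing."""
    for i, tok in enumerate(tokens):
        if tok == word:
            # Show the hit with up to two tokens of context on each side
            print(" ".join(tokens[max(0, i - 2):i + 3]))

tokens = "we get what we get and we do not complain".split()

result = show_matches("get", tokens)  # the hits are printed during this call
print(result)                         # this extra print only adds a None line
```

So in the scripts above, calling concordance(word) on its own line, without print in front of it, should be enough.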


2013-02-03 21:48:00 alexis
