Using the NLTK classify interface with a trained classifier

I have this little chunk of code, which I found here:

import nltk.classify.util
from nltk.classify import NaiveBayesClassifier
from nltk.corpus import movie_reviews
from nltk.corpus import stopwords

def word_feats(words):
    # Bag-of-words featureset: every word that occurs maps to True.
    return {word: True for word in words}

negids = movie_reviews.fileids('neg')
posids = movie_reviews.fileids('pos')

# One (featureset, label) pair per review file.
negfeats = [(word_feats(movie_reviews.words(fileids=[f])), 'neg') for f in negids]
posfeats = [(word_feats(movie_reviews.words(fileids=[f])), 'pos') for f in posids]

# Integer division, so the cutoffs are valid slice indices on Python 3.
negcutoff = len(negfeats) * 3 // 4
poscutoff = len(posfeats) * 3 // 4

# First three quarters of each class for training, the rest for testing.
trainfeats = negfeats[:negcutoff] + posfeats[:poscutoff]
testfeats = negfeats[negcutoff:] + posfeats[poscutoff:]
print('train on %d instances, test on %d instances' % (len(trainfeats), len(testfeats)))

classifier = NaiveBayesClassifier.train(trainfeats)
print('accuracy:', nltk.classify.util.accuracy(classifier, testfeats))
classifier.show_most_informative_features()

But how can I classify a random word that might be in the corpus?

classifier.classify('magnificent')

doesn't work. Does it need some kind of object?

Thank you very much.

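EDIT: Thanks to @unutbu's feedback, some digging here, and reading the comments on the original post, the following yields 'pos' or 'neg' for this code (this one is a 'pos'):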

print(classifier.classify(word_feats(['magnificent'])))

and this yields the word's score for 'pos' or 'neg', that is, the probability of whichever label is passed to prob:

print(classifier.prob_classify(word_feats(['magnificent'])).prob('neg'))
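To see what prob_classify hands back without loading movie_reviews, here is a minimal sketch on a hand-made training set. The four tiny "reviews" and the names toy_train and toy_classifier are invented for illustration and are not part of the original code; only NaiveBayesClassifier.train, prob_classify, and the returned distribution's samples, prob, and max are NLTK calls.

```python
from nltk.classify import NaiveBayesClassifier

def word_feats(words):
    # Same bag-of-words featureset as in the question.
    return {word: True for word in words}

# Hypothetical toy data, made up for this sketch (not from movie_reviews).
toy_train = [
    (word_feats(['magnificent', 'great']), 'pos'),
    (word_feats(['great', 'wonderful']), 'pos'),
    (word_feats(['awful', 'boring']), 'neg'),
    (word_feats(['boring', 'terrible']), 'neg'),
]
toy_classifier = NaiveBayesClassifier.train(toy_train)

# prob_classify returns a probability distribution over the labels.
dist = toy_classifier.prob_classify(word_feats(['magnificent']))
for label in sorted(dist.samples()):
    print(label, dist.prob(label))
print('most likely:', dist.max())
```

The probabilities over the labels sum to 1, so with two labels prob('neg') is just 1 minus prob('pos'); classify returns the label that max reports.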


2013-02-05 storedope

print(classifier.classify(word_feats(['magnificent'])))

yields

pos

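The classifier.classify method does not operate on individual words per se; it classifies based on a dict of features. In this example, word_feats maps a sentence (a list of words) to a dict of features.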
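A quick way to see the difference, using only the word_feats helper from the question (no NLTK needed): wrapping the word in a list gives a one-entry featureset, while passing the bare string makes Python iterate over its characters, which is probably not what you want.

```python
def word_feats(words):
    # Same helper as in the question: each item becomes a True-valued feature.
    return {word: True for word in words}

# A list containing one word -> one feature keyed by that word.
print(word_feats(['magnificent']))        # {'magnificent': True}

# A bare string is iterated character by character -> one feature per letter.
print(sorted(word_feats('magnificent')))  # ['a', 'c', 'e', 'f', 'g', 'i', 'm', 'n', 't']
```

So the thing to pass to classify is word_feats(['magnificent']), a dict, rather than 'magnificent' itself.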

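Here is another example (from the NLTK book) which uses the NaiveBayesClassifier. By comparing what's similar and different between that example and the one you posted, you may get a better understanding of how it can be used.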
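For comparison, here is a rough sketch of a classifier with a differently shaped feature extractor, loosely modeled on the last-letter gender example in the NLTK book's classification chapter. I'm not asserting it is the exact example linked above, and the short name list is a hypothetical stand-in for the book's names corpus.

```python
from nltk.classify import NaiveBayesClassifier

def gender_features(name):
    # A single feature: the final letter of the name.
    return {'last_letter': name[-1].lower()}

# Hypothetical toy data, made up for this sketch.
labeled_names = [
    ('Anna', 'female'), ('Maria', 'female'), ('Julia', 'female'),
    ('John', 'male'), ('Mark', 'male'), ('Peter', 'male'),
]
train_set = [(gender_features(n), g) for n, g in labeled_names]
name_classifier = NaiveBayesClassifier.train(train_set)

# As before, classify takes a featureset (a dict), not the raw string.
print(name_classifier.classify(gender_features('Sophia')))
```

What's the same: train takes (featureset, label) pairs and classify takes a featureset. What differs is only how the raw input is turned into that dict.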


2013-02-05 20:58:54 unutbu
