2014-09-21
0

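Python list comprehension: "too many values to unpack"

Sorry to ask, but the error "too many values to unpack" is driving me crazy. Here is the code: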

FREQ = 3 
fourgrams="" 
n = 4 
tokens = token_text(text) # token_text is a function that tokenizes the text
fourgrams = ngrams(tokens, n) 
final_list = [(item,v) for item,v in nltk.FreqDist(fourgrams) if v > FREQ] 
print final_list 

Where is the error? Thanks a lot.

+2

Please post the full traceback; it will tell you exactly where the exception is raised. – 2014-09-21 08:54:07

Answers

2

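FreqDist is a dictionary-like object. Iterating over it yields its keys (not key-value pairs). Each key here is a 4-tuple, so unpacking it into the two names item and v fails. If you want to iterate over key-value pairs, use FreqDist.items or FreqDist.iteritems: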

final_list = [(item,v) for item,v in nltk.FreqDist(fourgrams).items() if v > FREQ] 
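
To see the difference in isolation, here is a minimal sketch that reproduces both the error and the fix. It assumes Python 3 syntax and uses collections.Counter as a stand-in for FreqDist (relying only on the dictionary-like behaviour described above); the sample fourgrams are made up for illustration.

```python
from collections import Counter

# Stand-in for nltk.FreqDist(fourgrams): a dict-like counter keyed by 4-tuples.
# The data below is invented purely for illustration.
fourgrams = [("a", "b", "c", "d")] * 5 + [("b", "c", "d", "e")] * 2
freq = Counter(fourgrams)
FREQ = 3

# Iterating the mapping itself yields keys only. Each key is a 4-tuple,
# so unpacking it into two names raises ValueError.
try:
    broken = [(item, v) for item, v in freq if v > FREQ]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Iterating .items() yields (key, count) pairs, which unpack cleanly.
final_list = [(item, v) for item, v in freq.items() if v > FREQ]

print(error_message)
print(final_list)
```

The first comprehension fails before the `if v > FREQ` filter is ever evaluated, because the unpacking happens as each key is drawn from the iterator.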
+0

It works! Thanks – RoverDar 2014-09-21 08:58:01

+0

@RoverDar, you're welcome. By the way, as Burhan Khalid commented, it would be good to include the full traceback in the question. – falsetru 2014-09-21 09:00:07

1

Take a look at this:

from collections import Counter 

from nltk.corpus import brown 
from nltk.util import ngrams 

# Let's take the first 10000 words from the brown corpus 
text = brown.words()[:10000] 
# Extract the ngrams 
bigrams = ngrams(text, 2) 
# Alternatively, instead of a FreqDist, you can simply use collections.Counter 
freqdist = Counter(bigrams) 
print len(freqdist) 
# Gets the top 5 ngrams 
top5 = freqdist.most_common()[:5] 
print top5 
# Limits v > 10 
freqdist = {k:v for k,v in freqdist.iteritems() if v > 10} 
print len(freqdist) 

[out]:

7615 
[(('of', 'the'), 95), (('.', 'The'), 76), (('in', 'the'), 59), (("''", '.'), 40), ((',', 'the'), 36)] 
34
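
The snippet above is Python 2 (print statement, iteritems). As a rough Python 3 sketch of the same Counter-based approach, the version below uses .items() instead, since dict.iteritems no longer exists in Python 3. To stay self-contained it assumes a small hand-made token list rather than the Brown corpus, and make_ngrams is a hypothetical helper standing in for nltk.util.ngrams.

```python
from collections import Counter


def make_ngrams(tokens, n):
    """Hypothetical helper: return all n-grams of a token list as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))


# A tiny made-up token list in place of brown.words()[:10000].
tokens = "the cat sat on the mat and the cat sat on the rug".split()

# Count the bigrams with a plain Counter.
counts = Counter(make_ngrams(tokens, 2))
print(len(counts))

# Ask most_common for the top entries directly instead of slicing the full list.
top2 = counts.most_common(2)
print(top2)

# Keep only the bigrams seen more than once; .items() yields (key, count) pairs.
frequent = {k: v for k, v in counts.items() if v > 1}
print(len(frequent))
```

Passing the number to most_common avoids building the fully sorted list only to slice it, which matters more on a corpus-sized Counter than on this toy input.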