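Lemmatize a string according to POS (NLP)
I want to lemmatize words according to their part of speech, but I get an error at the final step. My code: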
import nltk
from nltk.stem import *
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import wordnet
wordnet_lemmatizer = WordNetLemmatizer()
text = word_tokenize('People who help the blinging lights are the way of the future and are heading properly to their goals')
tagged = nltk.pos_tag(text)
def get_wordnet_pos(treebank_tag):
    if treebank_tag.startswith('J'):
        return wordnet.ADJ
    elif treebank_tag.startswith('V'):
        return wordnet.VERB
    elif treebank_tag.startswith('N'):
        return wordnet.NOUN
    elif treebank_tag.startswith('R'):
        return wordnet.ADV
    else:
        return ''
for word in tagged: print(wordnet_lemmatizer.lemmatize(word,pos='v'), end=" ")
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-40-afb22c78f770> in <module>()
----> 1 for word in tagged: print(wordnet_lemmatizer.lemmatize(word,pos='v'), end=" ")
E:\Miniconda3\envs\uol1\lib\site-packages\nltk\stem\wordnet.py in lemmatize(self, word, pos)
38
39 def lemmatize(self, word, pos=NOUN):
---> 40 lemmas = wordnet._morphy(word, pos)
41 return min(lemmas, key=len) if lemmas else word
42
E:\Miniconda3\envs\uol1\lib\site-packages\nltk\corpus\reader\wordnet.py in _morphy(self, form, pos)
1710
1711 # 1. Apply rules once to the input to get y1, y2, y3, etc.
-> 1712 forms = apply_rules([form])
1713
1714 # 2. Return all that are in the database (and check the original too)
E:\Miniconda3\envs\uol1\lib\site-packages\nltk\corpus\reader\wordnet.py in apply_rules(forms)
1690 def apply_rules(forms):
1691 return [form[:-len(old)] + new
-> 1692 for form in forms
1693 for old, new in substitutions
1694 if form.endswith(old)]
E:\Miniconda3\envs\uol1\lib\site-packages\nltk\corpus\reader\wordnet.py in <listcomp>(.0)
1692 for form in forms
1693 for old, new in substitutions
-> 1694 if form.endswith(old)]
1695
1696 def filter_forms(forms):
I would like to lemmatize the whole string in one pass, using each word's own part of speech. Please help.
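I don't fully understand your approach: you want to lemmatize the words after checking their POS, to make sure you get the correct lemma, right? If so, could you give the expected input and output? Also, what is the point of 'get_wordnet_pos()'? I don't see it used anywhere. – patrick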
Take a look at https://gist.github.com/alvations/07758d02412d928414bb – alvas
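One likely cause, offered as a sketch rather than a definitive answer: `nltk.pos_tag` returns `(word, tag)` pairs, so the loop in the question passes a whole tuple to `lemmatize`, and the traceback ends at `form.endswith(old)`, which is what you would expect when a tuple arrives where a string is needed. The loop also hard-codes `pos='v'` and never calls `get_wordnet_pos()`. The version below unpacks each pair and maps its tag to a WordNet POS. The helper name `lemmatize_tagged` is made up for illustration, and falling back to noun for unmapped tags is an assumption (it mirrors the `pos=NOUN` default visible in the traceback) that replaces the empty string returned in the question.

```python
# WordNet POS letters; these are the values of wordnet.ADJ, wordnet.VERB,
# wordnet.NOUN and wordnet.ADV in nltk.corpus.wordnet.
TAG_PREFIX_TO_WORDNET = (("J", "a"), ("V", "v"), ("N", "n"), ("R", "r"))


def get_wordnet_pos(treebank_tag, default="n"):
    """Map a Penn Treebank tag (e.g. 'VBG') to a WordNet POS letter.

    Assumption: tags with no WordNet counterpart (DT, IN, ...) fall back to
    noun, instead of the empty string used in the question.
    """
    for prefix, pos in TAG_PREFIX_TO_WORDNET:
        if treebank_tag.startswith(prefix):
            return pos
    return default


def lemmatize_tagged(tagged, lemmatize):
    """Lemmatize (word, tag) pairs such as those returned by nltk.pos_tag.

    `lemmatize` is any callable taking (word, pos), for example
    WordNetLemmatizer().lemmatize. This helper is hypothetical, not part of NLTK.
    """
    # Unpack each pair so that only the word string reaches the lemmatizer.
    return [lemmatize(word, get_wordnet_pos(tag)) for word, tag in tagged]


# Usage with NLTK (assumes the tokenizer, tagger and wordnet data are installed):
#   import nltk
#   from nltk.stem import WordNetLemmatizer
#   from nltk.tokenize import word_tokenize
#   tagged = nltk.pos_tag(word_tokenize("People are heading properly to their goals"))
#   print(" ".join(lemmatize_tagged(tagged, WordNetLemmatizer().lemmatize)))
```

Passing the lemmatizer in as a callable keeps the tag mapping separate from NLTK itself, so the mapping can be checked without the WordNet data installed.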