Python - Generating the plural form of singular nouns

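How can I use the NLTK module to produce the singular and plural forms of a noun, or to ignore the difference between singular and plural when searching for a word in a txt file? Can I also use NLTK to make the program case-insensitive?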

2015-09-04 user5301912

You can do this with pattern.en; I'm not too sure about NLTK.

>>> from pattern.en import pluralize, singularize
>>> print(pluralize('child'))     # children
>>> print(singularize('wolves'))  # wolf

See the pattern.en documentation for more.


2015-09-04 18:53:21 taesu

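Awesome :) ... it might be worth mentioning that you also need to run 'pip install pattern' –

Thanks :) I will definitely try this, but I still need NLTK for other purposes. – user5301912

You can import both. I couldn't find a way to do this in NLTK. – taesu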

Here is one possible way to do it with NLTK. Imagine you are looking for the word "feature":

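# Requires the NLTK tokenizer and WordNet data (see nltk.download()).
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

wnl = WordNetLemmatizer()
text = "This is a small text, a very small text with no interesting features."
# Lower-case every token so that the search is case-insensitive.
tokens = [token.lower() for token in word_tokenize(text)]
# Reduce each token to its lemma, e.g. "features" -> "feature".
lemmatized_words = [wnl.lemmatize(token) for token in tokens]
print('feature' in lemmatized_words)  # True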

Case sensitivity is handled by calling str.lower() on every word. Of course, you also have to lemmatize the search word if necessary.
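To make the "normalize the search word too" point concrete, here is a minimal sketch that runs without NLTK. The names `text_contains` and `toy_lemmatize` are made up for illustration, and `toy_lemmatize` is only a crude stand-in that strips one trailing "s"; in real use you would probably pass something like `wnl.lemmatize` instead.

```python
import re


def toy_lemmatize(word):
    """Hypothetical stand-in for a real lemmatizer: strips one trailing 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def text_contains(text, query, lemmatize=toy_lemmatize):
    """Return True if `query` occurs in `text`, ignoring case and (roughly) number."""
    # Lower-case first, then split into runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    lemmas = {lemmatize(token) for token in tokens}
    # The query goes through exactly the same normalization as the text.
    return lemmatize(query.lower()) in lemmas


text = "This is a small text, a very small text with no interesting features."
print(text_contains(text, "Feature"))   # True
print(text_contains(text, "FEATURES"))  # True
print(text_contains(text, "function"))  # False
```

The lemmatizer is a parameter so that the same function works with the toy rule here or with a real lemmatizer, as long as the text and the query are normalized the same way.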


2015-09-04 20:12:03

Can I add .lower() directly to raw_input('>')? – user5301912

Yes, you can do 'raw_input('>').lower()'. –

Great. So if I add .lower() it will accept the word however I type it? Like admin, Admin, aDmin, admiN and so on? – user5301912
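For what it's worth, a quick sketch of that last question. `matches_keyword` is a made-up helper name, and the attempts are hard-coded here rather than read from the prompt so the snippet runs on its own (on Python 3 you would read them with `input('>')` instead of `raw_input('>')`).

```python
def matches_keyword(typed, keyword="admin"):
    """Compare user input to a keyword, ignoring case and surrounding spaces."""
    return typed.strip().lower() == keyword.lower()


# Every spelling of "admin" matches; an unrelated word does not.
for attempt in ["admin", "Admin", "aDmin", "admiN", "root"]:
    print(attempt, matches_keyword(attempt))
```

So yes, once both sides are lower-cased, the capitalization the user typed no longer matters.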

At the time of writing, Pattern does not support Python 3 (although there is an ongoing discussion about this at https://github.com/clips/pattern/issues/62).

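TextBlob (https://textblob.readthedocs.io) is built on top of pattern and NLTK and also includes pluralization functionality. It seems to do a pretty good job of this, though it isn't perfect. See the example code below: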

from textblob import TextBlob 
words = "cat dog child goose pants" 
blob = TextBlob(words) 
plurals = [word.pluralize() for word in blob.words] 
print(plurals) 
# >>> ['cats', 'dogs', 'children', 'geese', 'pantss']


2017-01-24 13:58:48 Sixhobbits

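This may be a bit late as an answer, but just in case someone is still looking for something similar:

There is inflect (also available on GitHub), which supports Python 2.x and 3.x. You can use it to find the singular or plural form of a given word: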

import inflect 
p = inflect.engine() 

words = "cat dog child goose pants" 
print([p.plural(word) for word in words.split(' ')]) 
# ['cats', 'dogs', 'children', 'geese', 'pant']

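It is worth noting that calling p.plural on a word that is already plural gives you the singular form. In addition, you can provide a POS (part-of-speech) tag, or provide a count and let the library decide whether the singular or the plural is needed: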

p.plural('cat', 4) # cats 
p.plural('cat', 1) # cat 
# but also... 
p.plural('cat', 0) # cats


2018-03-01 10:35:55

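Strange. 'inflect.engine().plural('children')' outputs 'childrens' ... why? –

Yes, this library behaves strangely in some cases. Another one: 'inflect.engine().plural('houses')' outputs 'housess'. I don't fully know the internals; these days I'm actually working my way through them. There are some cases where it works very well, but also some that look like obvious bugs. –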
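One way to sidestep the 'childrens' / 'housess' problem is to check whether a word is already plural before pluralizing it. This is only a sketch: `make_safe_pluralizer`, `naive_plural` and `KNOWN_PLURALS` are made-up names, and the toy pluralizer and lookup set stand in for a real library so the snippet runs on its own.

```python
def make_safe_pluralizer(pluralize, is_plural):
    """Wrap `pluralize` so that words `is_plural` recognizes are returned unchanged."""
    def safe_plural(word):
        return word if is_plural(word) else pluralize(word)
    return safe_plural


# Toy stand-ins (assumptions for illustration, not a real library's behavior).
KNOWN_PLURALS = {"children", "houses", "pants"}


def naive_plural(word):
    return word + "s"


safe_plural = make_safe_pluralizer(naive_plural, KNOWN_PLURALS.__contains__)

print([safe_plural(w) for w in ["cat", "children", "houses", "pants"]])
# ['cats', 'children', 'houses', 'pants']
```

With inflect you could presumably pass `p.plural` as the pluralizer and build the check on `p.singular_noun`, which as far as I know returns False for a word it does not recognize as plural. I have not verified that wiring here, so treat it as an assumption.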
