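0th synset in the NLTK WordNet interface

In the SemCor corpus (http://www.cse.unt.edu/~rada/downloads.html), some senses were not mapped to the later version of WordNet. Yet, magically, a mapping for them can be found through the NLTK WordNet API, like this: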
>>> from nltk.corpus import wordnet as wn
# Enumerate the possible senses for the lemma 'delayed'
>>> wn.synsets('delayed')
[Synset('delay.v.01'), Synset('delay.v.02'), Synset('stay.v.06'), Synset('check.v.07'), Synset('delayed.s.01')]
>>> wn.synset('delay.v.01')
Synset('delay.v.01')
# Magically, there is a 0th sense of the word!!!
>>> wn.synset('delayed.a.0')
Synset('delayed.s.01')
I checked the code and the API docs (http://nltk.googlecode.com/svn/trunk/doc/api/nltk.corpus.reader.wordnet.Synset-class.html, http://nltk.org/_modules/nltk/corpus/reader/wordnet.html), but I couldn't find how they do this magical mapping, which shouldn't exist (e.g. delayed.a.0 -> delayed.s.01).
Does anyone know which part of the NLTK WordNet API code does the magical mapping?
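
One way to narrow the search, as a minimal sketch: ask Python where the function behind `wn.synset` is defined and read the lookup code from there. `locate_source` is a hypothetical helper written for this post, not part of NLTK; it relies only on the standard `inspect` module, and the NLTK usage in the comments assumes that nltk and the WordNet data are installed.

```python
import inspect


def locate_source(obj):
    """Return (file path, first line number, source text) for a function or method.

    Hypothetical helper for illustration; not part of NLTK.
    """
    path = inspect.getsourcefile(obj)
    lines, first_line = inspect.getsourcelines(obj)
    return path, first_line, "".join(lines)


# Usage with NLTK (assumes nltk and the WordNet corpus are installed):
# from nltk.corpus import wordnet as wn
# path, first_line, src = locate_source(wn.synset)
# print(path, first_line)
# print(src)
```

Reading the printed source should show how the name 'delayed.a.0' is split into lemma, part of speech and sense number, and what is done with the sense number before the synset is fetched.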