Finding the path to a corpus in NLTK

I am writing a Python program with the Natural Language Toolkit, and I am trying to load a corpus made up of my own files. To do this, I am using code along these lines:

from nltk.corpus import PlaintextCorpusReader
corpus_root = (insert filepath here)
wordlists = PlaintextCorpusReader(corpus_root, '.*')

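Say my script is called reader.py, and my corpus files live in a directory called "corpus" that sits in the same directory as reader.py. I would like a general way to find the file path above, so that my code can locate the "corpus" directory wherever it ends up for anyone who uses the code. I tried the answers in this post, but they only gave me absolute file paths: Find current directory and file's directory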

Any help would be greatly appreciated!

Source

2013-06-27 MEric

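As I understand it:

Your reader.py file and the corpus directory are always in the same directory.
You are looking for a way to refer to corpus from reader.py no matter where you place them in your directory structure.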

In that case, the question that you referred to seems to be what you need. Another approach is described in this other answer. Using the second option, your code would be:

from nltk.corpus import PlaintextCorpusReader
import os.path

# Directory that contains this script (reader.py)
basepath = os.path.dirname(__file__)
# Absolute path to the "corpus" directory next to the script
corpus_root = os.path.abspath(os.path.join(basepath, "corpus"))
wordlists = PlaintextCorpusReader(corpus_root, '.*')

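Keep in mind that although this creates an absolute path, the path is built from what basepath = os.path.dirname(__file__) returns, which is the directory containing reader.py. So the result follows the script wherever it is placed. For official details, see the documentation.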
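The same idea can also be written with pathlib. What follows is only a minimal sketch: the helper name corpus_dir_next_to and the example path /home/alice/project/reader.py are made up for illustration and are not part of NLTK or of the original answer.

```python
from pathlib import Path


def corpus_dir_next_to(script_path, dirname="corpus"):
    """Return the absolute path of `dirname` located beside `script_path`.

    Hypothetical helper for illustration; it is not part of NLTK.
    """
    # resolve() turns a possibly relative script path into an absolute one
    # at call time; parent is then the directory that holds the script.
    return Path(script_path).resolve().parent / dirname


# Inside reader.py you would pass __file__ instead of a literal path:
#     corpus_root = str(corpus_dir_next_to(__file__))
#     wordlists = PlaintextCorpusReader(corpus_root, '.*')
example = corpus_dir_next_to("/home/alice/project/reader.py")
print(example)
```

Because the path is derived from the script's own location rather than from the current working directory, it should stay valid when the project folder is copied to another machine, as long as "corpus" remains next to reader.py.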

Source

2013-06-27 18:08:47 arturomp
