2017-06-02

TFIDF in Python

Hello, below is my function in Python:

def tf_idf(self,job_id,method='local'): 
    jobtext = self.get_job_text (job_id , method=method) 
    tfidf_vectorizer = TfidfVectorizer(max_df=0.8 , max_features=200000 , 
             min_df=0.2 , stop_words='english' , 
             use_idf=True , tokenizer=self.tokenize_and_stem(jobtext), ngram_range=(1, 3)) 
    #tfidf_vectorizer.fit(jobtext) 
    tfidf_matrix = tfidf_vectorizer.fit_transform(jobtext) #fit the vectorizer to synopses 
    print(tfidf_matrix.shape) 

When creating the TF-IDF matrix, I get the following error:

Traceback (most recent call last):
  File ".../employment_skills_extraction-master/api/process_request.py", line 206, in <module>
    main()
  File ".../employment_skills_extraction-master/api/process_request.py", line 202, in main
    print pr.process(json.dumps(test))
  File ".../employment_skills_extraction-master/api/process_request.py", line 188, in process
    termVector=self.tf_idf(job_id)
  File ".../employment_skills_extraction-master/api/process_request.py", line 174, in tf_idf
    tfidf_matrix = tfidf_vectorizer.fit_transform(jobtext) #fit the vectorizer to synopses
  File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_extraction/text.py", line 1285, in fit_transform
    X = super(TfidfVectorizer, self).fit_transform(raw_documents)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_extraction/text.py", line 804, in fit_transform
    self.fixed_vocabulary_)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_extraction/text.py", line 739, in _count_vocab
    for feature in analyze(doc):
  File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_extraction/text.py", line 236, in <lambda>
    tokenize(preprocess(self.decode(doc))), stop_words)
TypeError: 'list' object is not callable

Please help: why am I getting this error?

Answer

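TypeError: 'list' object is not callable looks like the relevant part of the error. It may involve your variable job_id, which might not be what you think it is. Whatever it is supposed to be, it may actually be a list (I don't know how long) that contains the thing you want.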
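One way to narrow this down is to read the last frame of the traceback: the call that fails is tokenize(...), so it is worth checking whether the object passed as tokenizer is a list rather than a function. In the question, tokenizer=self.tokenize_and_stem(jobtext) passes the result of a call, not the function itself. The sketch below reproduces that pattern without scikit-learn. The names make_analyzer and simple_tokenize are hypothetical stand-ins, not the library's internals or the asker's real method.

```python
def make_analyzer(tokenizer):
    """Hypothetical stand-in for the analyzer a vectorizer builds:
    it lowercases each document and then CALLS the tokenizer on it."""
    return lambda doc: tokenizer(doc.lower())


def simple_tokenize(text):
    # Placeholder for the asker's tokenize_and_stem; splits on whitespace only.
    return text.split()


# Made-up sample documents, for illustration only.
docs = ["Python developer wanted", "Senior data engineer"]

# Passing the RESULT of a call (a list) instead of the function itself.
bad_analyzer = make_analyzer(simple_tokenize(docs[0]))
try:
    bad_analyzer(docs[0])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Passing the function object; the analyzer can now call it once per document.
good_analyzer = make_analyzer(simple_tokenize)
tokens = [good_analyzer(doc) for doc in docs]

print(error_message)
print(tokens)
```

If this is what is happening, the corresponding change in the question's code would be tokenizer=self.tokenize_and_stem (no parentheses, no argument), assuming that method takes a single string and returns a list of tokens.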

If you insert one line as the second line of the function and rename a variable to keep things tidy, like this:

job_id_element = job_id[0] 
jobtext = self.get_job_text (job_id_element , method=method) 

it might work.

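Just inspect the contents of job_id and consider whether you need its first element (the 0 I wrote), the last one (index len(job_id) - 1, or simply -1), or some other index.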
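A minimal sketch of that check, assuming job_id turns out to be a list. The sample values and the describe helper are made up for illustration and are not taken from the question.

```python
def describe(value):
    """Return a short summary of a value's type and, for sequences, its length."""
    if isinstance(value, (list, tuple)):
        return "{} of length {}".format(type(value).__name__, len(value))
    return type(value).__name__


# Hypothetical sample; in the real code this would be the job_id argument.
job_id = ["job-001", "job-002", "job-003"]

summary = describe(job_id)           # what kind of object is it?
first = job_id[0]                    # first element
last = job_id[-1]                    # last element, idiomatic form
also_last = job_id[len(job_id) - 1]  # last element, explicit index

print(summary)
print(first)
print(last)
```

Note that job_id[len(job_id)] would raise an IndexError, since valid indices run from 0 to len(job_id) - 1.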