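I am getting the error below and would appreciate any suggestions for resolving it. Here is my code. The training data comes from train.csv and the test data comes from a separate file, test.csv. I am new to machine learning, so I don't understand what the problem is. The error is: Number of features of the model must match the input. Model n_features is 40 and input n_features is 38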
import quandl, math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import style
import datetime
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.feature_extraction.text import CountVectorizer
from sklearn import metrics

train = pd.read_csv("train.csv", index_col=None)
test = pd.read_csv("test.csv", index_col=None)

vectorizer = CountVectorizer(min_df=1)
X1 = vectorizer.fit_transform(train['question'])
Y1 = vectorizer.fit_transform(test['testing'])
X = X1.toarray()
Y = Y1.toarray()
#print(Y.shape)

number = LabelEncoder()
train['answer'] = number.fit_transform(train['answer'].astype('str'))
features = ['question', 'answer']
y = train['answer']

clf = RandomForestClassifier(n_estimators=100)
clf.fit(X[:25], y)
predicted_result = clf.predict(Y[17])
p_result = number.inverse_transform(predicted_result)

f = open('output.txt', 'w')
t = str(p_result)
f.write(t)
print(p_result)
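The mismatch most likely comes from the second `fit_transform` call on the test data: it makes `CountVectorizer` learn a new vocabulary from test.csv (38 words) instead of reusing the one learned from train.csv (40 words), so the two matrices have different column counts. A minimal sketch of the usual fix is to call `fit_transform` on the training text only and plain `transform` on the test text. The small in-memory lists below are hypothetical stand-ins for the two CSV files, and `random_state=0` is added only to make the run repeatable.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data standing in for train.csv and test.csv.
train_questions = ["what is your name", "how old are you", "where do you live"]
train_answers = ["name", "age", "city"]
test_questions = ["what is your age", "where is your home"]

vectorizer = CountVectorizer(min_df=1)
# Learn the vocabulary from the training text only.
X_train = vectorizer.fit_transform(train_questions)
# Reuse that same vocabulary for the test text: transform, not fit_transform.
X_test = vectorizer.transform(test_questions)

encoder = LabelEncoder()
y_train = encoder.fit_transform(train_answers)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# predict expects 2-D input; slicing with [0:1] keeps a single row as a matrix.
predicted = clf.predict(X_test[0:1])
labels = encoder.inverse_transform(predicted)

# Both matrices now have the same number of columns.
print(X_train.shape[1], X_test.shape[1])
print(labels)
```

Words that appear only in the test text are ignored by `transform`, which is why the column counts stay equal. In the original code, passing a single row such as `Y[17]` to `predict` may also need reshaping to two dimensions, for example `Y[17:18]`, depending on the scikit-learn version.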
Yeah dude, thanks, it's working. – Shiv
@KapilSen If it helped, consider accepting the answer. –
Yeah, done! Thanks – Shiv