
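I'm new to machine learning and am working on a Python application that uses a dataset to classify poker hands; the relevant snippets are below. It doesn't work, and fitting the MLPClassifier raises the following error: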

Traceback (most recent call last): 
    File "C:\Users\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code 
    exec(code_obj, self.user_global_ns, self.user_ns) 
    File "<ipython-input-62-0d21cd839ce4>", line 1, in <module> 
    mlp.fit(X_test, y_train.values.reshape(len(y_train), 1)) 
    File "C:\Users\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py", line 618, in fit 
    return self._fit(X, y, incremental=False) 
    File "C:\Users\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py", line 330, in _fit 
    X, y = self._validate_input(X, y, incremental) 
    File "C:\Users\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py", line 902, in _validate_input 
    multi_output=True) 
    File "C:\Users\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 531, in check_X_y 
    check_consistent_length(X, y) 
    File "C:\Users\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 181, in check_consistent_length 
    " samples: %r" % [int(l) for l in lengths]) 
ValueError: Found input variables with inconsistent numbers of samples: [6253, 18757] 

Here is the code I am trying to run:

import pandas as pnd 
from sklearn.model_selection import train_test_split 
from sklearn.preprocessing import StandardScaler 
from sklearn.neural_network import MLPClassifier 
from sklearn.metrics import classification_report, confusion_matrix 

training_data = pnd.read_csv("train.csv") 
training_data['id'] = range(1, len(training_data) + 1) # For 1-base index 
training_datafile = training_data 
target = training_datafile['hand'] 
data = training_datafile.drop(['id', 'hand'], axis=1) 
X = data 
y = target 
X_train, X_test, y_train, y_test = train_test_split(X, y) 
X_train.shape 
y_train.shape 
scaler = StandardScaler() 
scaler.fit(X_train) 
X_train = scaler.transform(X_train) 
X_test = scaler.transform(X_test) 
mlp = MLPClassifier(hidden_layer_sizes=(100, 100, 100)) 
mlp.fit(X_test, y_train.values.reshape(len(y_train), 1)) 
predictions = mlp.predict(X_test) 
len(mlp.coefs_) 
len(mlp.coefs_[0]) 
len(mlp.intercepts_[0]) 
print(confusion_matrix(y_test, predictions)) 
print(classification_report(y_test, predictions)) 

X_train.shape is (18757, 10) and y_train.shape is (18757,). Following an earlier post, I tried reshaping the target with:

y_train.values.reshape(len(y_train), 1) 

But I still get the same error. I'm not sure what is wrong with the shapes, so any guidance would be a great help.

Data snippet: (image not included)

Answer

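You are fitting on X_test instead of X_train. X_test has 6253 rows while y_train has 18757, which is exactly the mismatch reported in the ValueError. Fit on the training split instead: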

mlp.fit(X_train, y_train.values.reshape(len(y_train), 1))
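
For reference, here is a minimal end-to-end sketch of the corrected flow. It is not your exact script: since train.csv isn't available here, it assumes a small synthetic DataFrame with 10 made-up feature columns (f0..f9) and a 'hand' column, and it uses a smaller network so it runs quickly. The point is only that the X and y passed to fit() come from the same split, and that a 1-D y_train should be enough without the reshape.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for train.csv (assumption): 10 integer features
# plus a 'hand' label column. Column names f0..f9 are made up.
rng = np.random.default_rng(0)
frame = pd.DataFrame(
    rng.integers(1, 14, size=(400, 10)),
    columns=[f"f{i}" for i in range(10)],
)
frame["hand"] = rng.integers(0, 3, size=400)

X = frame.drop(columns=["hand"])
y = frame["hand"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Same condition scikit-learn checks inside fit():
# features and labels must have the same number of rows.
assert X_train.shape[0] == y_train.shape[0]
assert X_test.shape[0] == y_test.shape[0]

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Smaller network and few iterations to keep the sketch fast;
# a convergence warning on this random data is expected.
mlp = MLPClassifier(hidden_layer_sizes=(20, 20), max_iter=50, random_state=0)
mlp.fit(X_train_scaled, y_train)  # training features with training labels

predictions = mlp.predict(X_test_scaled)
print(predictions.shape, y_test.shape)
```

Checking the two shape[0] values before calling fit() is a cheap way to catch this kind of mix-up early, since the ValueError lists the row counts of X and y in that order.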