2017-03-24 99 views

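LSTM-Keras error: ValueError: non-broadcastable output operand with shape (67704,1) doesn't match the broadcast shape (67704,12)

Good morning, everyone. I am trying to implement this LSTM algorithm with Keras, using pandas to read in the CSV file. The backend I am using is TensorFlow. I am having a problem inverting the scaled results after predicting on the training set. Here is my code: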

import numpy 
import matplotlib.pyplot as plt 
import pandas 
import math 
from keras.models import Sequential 
from keras.layers import Dense 
from keras.layers import LSTM 
from sklearn.preprocessing import MinMaxScaler 
from sklearn.metrics import mean_squared_error 


#plt.plot(dataset) 
#plt.show() 

#fix random seed for reproducibility 
numpy.random.seed(7) 

#Load dataset 
col_names = ['UserID','SysTouchTime', 'EventTime', 'ActivityTouchID', 'Pointer_count', 'PointerID', 
       'ActionID', 'Touch_X', 'Touch_Y', 'Touch_Pressure', 'Contact_Size', 'Phone_Orientation'] 
dataframe = pandas.read_csv('touchEventsFor5Users.csv', engine='python', header=None, names = col_names, skiprows=1) 
#print(dataset.head()) 
#print(dataset.shape) 
dataset = dataframe.values 
dataset = dataframe.astype('float32') 
print(dataset.isnull().any()) 
dataset = dataset.fillna(method='ffill') 
feature_cols = ['SysTouchTime', 'EventTime', 'ActivityTouchID', 'Pointer_count', 'PointerID', 'ActionID', 'Touch_X', 'Touch_Y', 'Touch_Pressure', 'Contact_Size', 'Phone_Orientation'] 

X = dataset[feature_cols] 
y = dataset['UserID'] 
print(y.head()) 
#normalize the dataset 
scaler = MinMaxScaler(feature_range=(0, 1)) 
dataset = scaler.fit_transform(dataset) 

# split into train and test sets 

train_size = int(len(dataset) * 0.67) 
test_size = len(dataset) - train_size 
train, test = dataset[0:train_size, :], dataset[train_size:len(dataset),:] 
print(len(train), len(test)) 

# convert an array of values into a dataset matrix 
def create_dataset(dataset, look_back=1): 
    dataX, dataY = [], [] 
    for i in range(len(dataset)-look_back-1): 
     a = dataset[i:(i+look_back), 0] 
     dataX.append(a) 
     dataY.append(dataset[i + look_back, 0]) 
    return numpy.array(dataX), numpy.array(dataY) 

# reshape into X=t and Y=t+1 
look_back = 1 
trainX, trainY = create_dataset(train, look_back) 
testX, testY = create_dataset(test, look_back) 

#reshape input to be [samples, time steps, features] 
trainX = numpy.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1])) 
testX = numpy.reshape(testX, (testX.shape[0], 1, testX.shape[1])) 

#create and fit the LSTM network 
model = Sequential() 
model.add(LSTM(4, input_dim=look_back)) 
model.add(Dense(1)) 
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy']) 
model.fit(trainX, trainY, epochs=1, batch_size=32, verbose=2) 

# make predictions 
trainPredict = model.predict(trainX) 
testPredict = model.predict(testX) 
# invert predictions 
import gc 
gc.collect() 

#####problem occurs with the following line of code############# 

trainPredict = scaler.inverse_transform(trainPredict) 

trainY = scaler.inverse_transform([trainY]) 
testPredict = scaler.inverse_transform(testPredict) 
testY = scaler.inverse_transform([testY]) 
# calculate root mean squared error 
trainScore = math.sqrt(mean_squared_error(trainY[0], trainPredict[:,0])) 
print('Train Score: %.2f RMSE' % (trainScore)) 
testScore = math.sqrt(mean_squared_error(testY[0], testPredict[:,0])) 
print('Test Score: %.2f RMSE' % (testScore)) 

#shift train predictions for plotting 
trainPredictPlot = numpy.empty_like(dataset) 
trainPredictPlot[:, :] = numpy.nan 
trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict 
# shift test predictions for plotting 
testPredictPlot = numpy.empty_like(dataset) 
testPredictPlot[:, :] = numpy.nan 
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1, :] = testPredict 
# plot baseline and predictions 
plt.plot(scaler.inverse_transform(dataset)) 
plt.plot(trainPredictPlot) 
plt.plot(testPredictPlot) 
plt.show() 

The error I get is:

ValueError: non-broadcastable output operand with shape (67704,1) doesn't match the broadcast shape (67704,12)

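Do you think you could help me solve this? I am very new to this, but I badly want to learn it, and this error is making me suffer! Thanks for any help you can offer.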

Answers


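When you scale the data, the scaler treats each of the 12 fields separately. For each field it takes that field's minimum and maximum and maps the values to the range 0 to 1.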
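To make the per-field scaling concrete, here is a minimal NumPy sketch of min-max scaling to the range 0 to 1. It only illustrates the arithmetic and is not scikit-learn's implementation; the helper name `min_max_scale` and the toy array are made up for this example, and it assumes no field is constant (which would divide by zero).

```python
import numpy as np

def min_max_scale(data):
    """Scale each column of a 2-D array to [0, 1] independently (hypothetical helper)."""
    col_min = data.min(axis=0)   # one minimum per field
    col_max = data.max(axis=0)   # one maximum per field
    # Assumes col_max > col_min for every field.
    scaled = (data - col_min) / (col_max - col_min)
    return scaled, col_min, col_max

# Toy data: 3 rows, 2 fields with very different ranges.
data = np.array([[1.0, 100.0],
                 [2.0, 300.0],
                 [3.0, 500.0]])
scaled, col_min, col_max = min_max_scale(data)
print(scaled)
```

Both fields end up as 0, 0.5 and 1, even though their original ranges differ. That is why the stored minimum and maximum of every field are needed to undo the scaling.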

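When you call inverse_transform with only one field, the function cannot make sense of the input: it does not know which field it was given, or which minimum and maximum to apply. You need to pass it a 12-field dataset with the predicted values in the right column.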
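The shape mismatch can be reproduced without Keras. The sketch below mimics the inverse step with plain NumPy, on the assumption that it is an in-place, column-wise operation against 12 stored per-field values. The array names and the 5-row size are invented for illustration, and this is not scikit-learn's source code.

```python
import numpy as np

n_rows, n_fields = 5, 12

# Stand-in for what a scaler fitted on 12 fields remembers: one offset per field.
field_offsets = np.arange(n_fields, dtype=float)    # shape (12,)

# A prediction from a network ending in Dense(1) has a single column.
predictions = np.full((n_rows, 1), 0.5)              # shape (5, 1)

try:
    # In-place arithmetic cannot grow a (5, 1) array to the broadcast shape (5, 12).
    predictions -= field_offsets
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Padding the predictions into a 12-column array makes the shapes agree.
padded = np.zeros((n_rows, n_fields))
padded[:, 0] = predictions[:, 0]
padded -= field_offsets                               # works: (5, 12) with (12,)
print(padded.shape)
```

The first operation fails with a broadcast error of the same kind as the one in the question. The padded version goes through because every row carries all 12 fields.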

Try adding this before the problematic line:

# create an empty table with 12 fields, one row per prediction
trainPredict_dataset_like = numpy.zeros(shape=(len(trainPredict), 12))
# put the predicted values in the right field (column 0)
trainPredict_dataset_like[:,0] = trainPredict[:,0]
# inverse transform and then select the right field
# (this takes the place of the original scaler.inverse_transform(trainPredict) call)
trainPredict = scaler.inverse_transform(trainPredict_dataset_like)[:,0]

Does this help? :)
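An alternative to padding, offered only as a sketch: invert the one column directly with that field's own minimum and maximum. It assumes the scaler was created with feature_range=(0, 1), as in the question. `inverse_one_column` is a hypothetical helper and the numbers are made up.

```python
import numpy as np

def inverse_one_column(scaled_values, col_min, col_max):
    """Undo 0-to-1 min-max scaling for a single field (hypothetical helper)."""
    scaled_values = np.asarray(scaled_values, dtype=float).ravel()  # (n, 1) -> (n,)
    return scaled_values * (col_max - col_min) + col_min

# With a fitted MinMaxScaler, the bounds for field 0 would be
# scaler.data_min_[0] and scaler.data_max_[0]; the values here are made up.
col_min, col_max = 1.0, 5.0
scaled_predictions = np.array([[0.0], [0.25], [1.0]])   # shape (3, 1), like model.predict output
restored = inverse_one_column(scaled_predictions, col_min, col_max)
print(restored)
```

Whichever approach is used, the same treatment would apply to trainY, testPredict and testY in the question, since each of them is also passed to inverse_transform without the other 11 fields.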


Thanks for your reply! I added your suggested code right before: trainPredict = scaler.inverse_transform(trainPredict) and this error is raised: File "C:/Users/Gunn/PycharmProjects/airplaneLSTM/lstm.py", line 80, in trainPredict_dataset_like[:,0] = trainPredict ValueError: could not broadcast input array from shape (67704,1) into shape (101055) – Jamiel


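Nice. So I did that, and it gives the same error but with different numbers, so progress! This is the error: ValueError: could not broadcast input array from shape (67704,1) into shape (67704)............ Could it be because trainX is a one-dimensional array? – Jamiel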


Edited again :) –
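The second error quoted in the comments is what NumPy raises when a two-dimensional column of shape (n, 1) is assigned into a one-dimensional slot of shape (n,). A small sketch, with made-up sizes and array names, shows the failure and two ways to flatten the column first:

```python
import numpy as np

predictions = np.array([[0.1], [0.2], [0.3]])   # shape (3, 1), like model.predict output
table = np.zeros((3, 12))                       # table[:, 0] has shape (3,)

try:
    table[:, 0] = predictions                   # (3, 1) does not fit into (3,)
    assignment_failed = False
except ValueError:
    assignment_failed = True
print(assignment_failed)

# Either of these turns the column into a 1-D array of shape (3,).
table[:, 0] = predictions[:, 0]
table[:, 1] = predictions.ravel()
print(table[:, :2])
```

This suggests the shape of the prediction array, rather than trainX, is what triggers the message. Indexing with [:,0], as in the edited answer, removes the extra dimension.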
