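I'm working on a simple logistic regression problem. Each sample has 7423 features, and there are 4000 training samples and 1000 test samples. Sklearn takes 0.01 s to train the model and reaches 97% accuracy, but Keras (TensorFlow backend) needs 10 s and 50 epochs to reach the same accuracy (even a single epoch is 20 times slower than the whole sklearn fit). Can anyone explain this huge gap? Why is Keras so much slower than sklearn?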
Samples:
X_train: matrix of 4000*7423, 0.0 <= value <= 1.0
y_train: matrix of 4000*1, value = 0.0 or 1.0
X_test: matrix of 1000*7423, 0.0 <= value <= 1.0
y_test: matrix of 1000*1, value = 0.0 or 1.0
Sklearn code:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
classifier = LogisticRegression()
# Finished in 0.01s
classifier.fit(X_train, y_train)
predictions = classifier.predict(X_test)
print('test accuracy = %.2f' % accuracy_score(y_test, predictions))
[output]: test accuracy = 0.97
Keras code:
# Using TensorFlow as backend
from keras.models import Sequential
from keras.layers import Dense, Activation
model = Sequential()
model.add(Dense(1, input_dim=X_train.shape[1], activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
# Finished in 10s
model.fit(X_train, y_train, batch_size=64, epochs=50, verbose=0)
result = model.evaluate(X_test, y_test, verbose=0)
print('test accuracy = %.2f' % result[1])
[output]: test accuracy = 0.97
Are you training on the CPU or the GPU? Try batch size = 4000.
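To put rough numbers on the batch-size suggestion, here is a minimal sketch that counts weight updates for the settings in the question. It assumes one update per mini-batch and that a final partial batch still counts as a step; the helper name `updates_per_run` is made up for illustration and is not part of Keras.

```python
import math


def updates_per_run(n_samples, batch_size, epochs):
    """Count mini-batch weight updates over a whole training run.

    Assumption: one update per mini-batch, and the last (possibly
    smaller) batch of each epoch is still processed as its own step.
    """
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch * epochs


# Settings from the question: 4000 training samples, 50 epochs.
small_batch = updates_per_run(4000, batch_size=64, epochs=50)
full_batch = updates_per_run(4000, batch_size=4000, epochs=50)

print('updates with batch_size=64:   %d' % small_batch)
print('updates with batch_size=4000: %d' % full_batch)
```

With batch_size=64 each epoch is split into 63 small steps, and every step carries some fixed per-call overhead, so a larger batch may cut the wall-clock time noticeably. Whether it closes the gap to sklearn on a given machine is something to measure rather than assume.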