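Early stopping when training a neural network in scikit-learn

This question is very specific to the Python library scikit-learn. Please let me know if it would be better to post it elsewhere. Thanks!

Now the question...

I am writing a class ffnn, based on BaseEstimator, that trains a feed-forward neural network with SGD. It works well, and I can also train it in parallel using GridSearchCV().

Now I want to implement early stopping inside ffnn.fit(), but for that I also need access to the validation data of the fold. One way to do this is to change the line in sklearn.grid_search.fit_grid_point() that says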

clf.fit(X_train, y_train, **fit_params)

into something like

clf.fit(X_train, y_train, X_test, y_test, **fit_params)

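and then change ffnn.fit() to take these arguments. This would also affect the other classifiers in sklearn, which is a problem. I could avoid that by checking some kind of flag in fit_grid_point() that tells me which of the two forms above to use when calling clf.fit().

Can someone suggest a different way to do this, one where I don't have to edit any code in the sklearn library?

Alternatively, would it be right to further split X_train and y_train randomly into train/validation sets, check for a good stopping point, and then retrain the model on all of X_train?

Thanks!

Source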

2014-02-21 user1953384

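You can let the neural network model extract a validation set internally from the X_train and y_train it is passed, by using the train_test_split function.

Edit:

Alternatively, would it be right to further split X_train and y_train randomly into train/validation sets, check for a good stopping point, and then retrain the model on all of X_train?

Yes, but that would be expensive. You could instead find the stopping point and then do just a single additional pass over the validation data that you used to find it.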
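As a rough illustration of that idea, here is a minimal sketch. It uses SGDClassifier.partial_fit as a stand-in for the custom network (an assumption, since the ffnn class is not shown), synthetic data from make_classification, and an arbitrary patience of 5 epochs. It stops at the epoch where patience runs out rather than restoring the best weights, which a real implementation might prefer to do.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Synthetic data standing in for one fold's X_train, y_train.
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_fit, X_val, y_fit, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = SGDClassifier(random_state=0)
classes = np.unique(y)

patience = 5          # assumed value: epochs to wait without improvement
best_score = -np.inf
stale_epochs = 0

for epoch in range(200):
    # One pass (epoch) over the training part only.
    model.partial_fit(X_fit, y_fit, classes=classes)
    score = model.score(X_val, y_val)
    if score > best_score:
        best_score = score
        stale_epochs = 0
    else:
        stale_epochs += 1
        if stale_epochs >= patience:
            break  # stopping point found

# Single additional pass over the held-out validation data,
# instead of retraining from scratch on all of X.
model.partial_fit(X_val, y_val)
```

Whether one extra pass is enough depends on the data and the model, so it is worth checking against the full retrain at least once.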

Source

2014-02-21 11:43:52 ogrisel

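Thanks! @ogrisel: Is a single pass over the validation data enough? How can I check whether multiple passes would give better results? – user1953384

You can compare the final test score with the one from the original but more expensive approach. – ogrisel

Thanks! Sorry for the trivial question. That is of course the thing to do :). – user1953384

There are two ways:

First:

When making the x_train / x_test split, you can also take a 0.1 split from x_train and keep it as a validation set x_dev: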

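from sklearn.model_selection import GridSearchCV, train_test_split

x_train, x_test, y_train, y_test = train_test_split(data_x, data_y, test_size=0.25)

x_train, x_dev, y_train, y_dev = train_test_split(x_train, y_train, test_size=0.1)

clf = GridSearchCV(YourEstimator(), param_grid=param_grid)
# extra keyword arguments to GridSearchCV.fit are forwarded to YourEstimator.fit
clf.fit(x_train, y_train, x_dev=x_dev, y_dev=y_dev)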

Your estimator would then look like the following, and implement early stopping with x_dev, y_dev:

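class YourEstimator(ClassifierMixin, BaseEstimator):
    def __init__(self, param1=None, param2=None):
        # store hyperparameters unchanged, as scikit-learn expects
        self.param1 = param1
        self.param2 = param2

    def fit(self, x, y, x_dev=None, y_dev=None):
        # perform training with early stopping, monitoring x_dev, y_dev
        return self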

Second:

You do not perform a second split on x_train, but instead take out the dev set inside the estimator's fit method.

x_train, x_test, y_train, y_test = train_test_split(data_x, data_y, test_size=0.25) 

clf = GridSearchCV(YourEstimator(), param_grid=param_grid) 
clf.fit(x_train, y_train)

And your estimator would look similar to the following:

class YourEstimator(ClassifierMixin, BaseEstimator):
    def __init__(self, param1=None, param2=None):
        # store hyperparameters unchanged, as scikit-learn expects
        self.param1 = param1
        self.param2 = param2

    def fit(self, x, y):
        # hold out a dev set here, then train with early stopping on it
        x_train, x_dev, y_train, y_dev = train_test_split(x, y, test_size=0.1)
        return self
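To make the second approach concrete, here is a minimal runnable sketch. The class name EarlyStoppingSGD, its parameters (alpha, patience, max_epochs, dev_size) and the use of SGDClassifier as the inner model are all assumptions made for illustration; the original ffnn class would take its place. The point is only that the dev split happens inside fit(), so GridSearchCV needs no changes.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV, train_test_split


class EarlyStoppingSGD(ClassifierMixin, BaseEstimator):
    """Hypothetical estimator that holds out its own dev set in fit()."""

    def __init__(self, alpha=1e-4, patience=5, max_epochs=100, dev_size=0.1):
        self.alpha = alpha
        self.patience = patience
        self.max_epochs = max_epochs
        self.dev_size = dev_size

    def fit(self, X, y):
        # Split the fold's training data once more, inside the estimator.
        X_tr, X_dev, y_tr, y_dev = train_test_split(
            X, y, test_size=self.dev_size, random_state=0, stratify=y
        )
        self.classes_ = np.unique(y)
        self.model_ = SGDClassifier(alpha=self.alpha, random_state=0)
        best, stale = -np.inf, 0
        for epoch in range(self.max_epochs):
            self.model_.partial_fit(X_tr, y_tr, classes=self.classes_)
            score = self.model_.score(X_dev, y_dev)
            if score > best:
                best, stale = score, 0
            else:
                stale += 1
                if stale >= self.patience:
                    break  # no improvement for `patience` epochs
        self.n_epochs_ = epoch + 1
        return self

    def predict(self, X):
        return self.model_.predict(X)


X, y = make_classification(n_samples=400, n_features=20, random_state=0)
search = GridSearchCV(EarlyStoppingSGD(), {"alpha": [1e-4, 1e-2]}, cv=3)
search.fit(X, y)
print(search.best_params_, search.best_estimator_.n_epochs_)
```

The trade-off compared with the first approach is that each fold trains on slightly less data, but nothing outside the estimator has to know about the dev set.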

Source

2017-12-05 01:06:11
