2017-09-04 34 views
4

I have a fit() function that uses the ModelCheckpoint() callback to save the model if it is better than any previous model; with save_weights_only = False, it saves the entire model. This should let me resume training at a later date by using load_model(). How do I preserve metric values across training sessions in Keras?

Unfortunately, somewhere in the save()/load_model() round trip, the best metric value is not preserved — for example, val_loss is reset to inf. This means that when training resumes, ModelCheckpoint() will always save the model after the first epoch, even though that model is almost always worse than the previous session's champion.

I have determined that, before resuming training, I can set ModelCheckpoint()'s current best value, like so:

myCheckpoint = ModelCheckpoint(...) 
myCheckpoint.best = bestValueSoFar 

Obviously, I could monitor the values I need, write them to a file, and read them back in when I resume, but given that I'm new to Keras, I wonder whether I'm missing something obvious.

+1

If your question has been answered, you should mark the most helpful response as the 'Answer' so it is no longer listed as an open question. – FlashTek

+0

I can't do that until tomorrow, but thanks for the reminder. – MadOverlord

Answers

3

I ended up quickly writing my own callback to track the best training values so that I can reload them. It looks like this:

# State monitor callback. Tracks how well we are doing and writes
# some state to a JSON file. This lets us resume training seamlessly.
#
# ModelState.state is:
#
# { "epoch_count": nnnn,
#   "best_values": { dictionary with keys for each log value },
#   "best_epoch":  { dictionary with keys for each log value }
# }

import json
import os

from keras import callbacks

class ModelState(callbacks.Callback):

    def __init__(self, state_path):
        self.state_path = state_path

        if os.path.isfile(state_path):
            print('Loading existing .json state')
            with open(state_path, 'r') as f:
                self.state = json.load(f)
        else:
            self.state = {'epoch_count': 0,
                          'best_values': {},
                          'best_epoch': {}}

    def on_train_begin(self, logs=None):
        print('Training commences...')

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}

        # Currently, for everything we track, lower is better

        for k in logs:
            if k not in self.state['best_values'] or logs[k] < self.state['best_values'][k]:
                self.state['best_values'][k] = float(logs[k])
                self.state['best_epoch'][k] = self.state['epoch_count']

        with open(self.state_path, 'w') as f:
            json.dump(self.state, f, indent=4)
        print('Completed epoch', self.state['epoch_count'])

        self.state['epoch_count'] += 1

Then, in the fit() function, something like this:

# Set up the model state, reading in prior results info if available 

model_state = ModelState(path_to_state_file) 

# Checkpoint the model if we get a better result 

model_checkpoint = callbacks.ModelCheckpoint(path_to_model_file, 
              monitor='val_loss', 
              save_best_only=True, 
              verbose=1, 
              mode='min', 
              save_weights_only=False) 


# If we have trained previously, set up the model checkpoint so it won't save 
# until it finds something better. Otherwise, it would always save the results 
# of the first epoch. 

if 'val_loss' in model_state.state['best_values']: 
    model_checkpoint.best = model_state.state['best_values']['val_loss'] 

callback_list = [model_checkpoint, 
       model_state] 

# Offset epoch counts if we are resuming training. If you don't do 
# this, only (epochs - initial_epoch) epochs will be done. 

initial_epoch = model_state.state['epoch_count'] 
epochs += initial_epoch 

# .fit() or .fit_generator, etc. goes here. 
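The epoch bookkeeping above is simple arithmetic; as a sketch (the helper name is mine, and the fit() call is shown only as a comment since it depends on your model and data):

```python
def resume_epoch_args(epoch_count, additional_epochs):
    """Compute (initial_epoch, epochs) for model.fit() when resuming.

    Keras runs (epochs - initial_epoch) epochs, so to train
    additional_epochs more we must offset both values.
    """
    initial_epoch = epoch_count
    epochs = initial_epoch + additional_epochs
    return initial_epoch, epochs

# Example: 10 epochs already completed, train 5 more:
#   initial_epoch, epochs = resume_epoch_args(10, 5)   # (10, 15)
#
# model.fit(x_train, y_train,
#           epochs=epochs,
#           initial_epoch=initial_epoch,
#           callbacks=callback_list)
```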
2

I don't think you have to store the metric values yourself. There was a feature-request about something very similar on the Keras project, but it has been closed. Maybe you can try the solution that came up there. In Keras' philosophy, storing the metrics is not considered very useful, because you only save the model — meaning the architecture and the weights of each layer — not the history or anything else.

The simplest approach would be to create a kind of metafile that contains the model's metric values and the name of the model itself. You could then load the metafile, get the best metric values and the name of the model that produced them, load that model again, and resume training.

+1

Thanks for pointing out the feature request. I ended up doing something similar; see the code in my answer. – MadOverlord