
How do I create a custom loss function in MXNet? For example, instead of computing the cross-entropy loss for a single label (using the standard mx.sym.SoftmaxOutput layer, which computes the cross-entropy loss and returns a symbol that can be passed to the fit function as the loss symbol), I want to compute a weighted cross-entropy loss for every possible label. The MXNet tutorial on custom loss functions and eval_metric mentions using

mx.symbol.MakeLoss(scalar_loss_symbol, normalization='batch') 

However, when I use the MakeLoss function, the standard eval_metric "acc" no longer works (apparently because the model does not know which output is my predicted probability vector), so I need to write my own eval_metric. In addition, at prediction time I need the predicted probability vector itself, which cannot be accessed unless the final probability vector, with block_grad applied to it, is grouped together with the loss symbol.
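In other words, I believe what I need is roughly the following pattern (a rough, untested sketch; fc stands for the network's final fully-connected layer):

label = mx.sym.var('label')                       # dense per-class weights rather than a class index
log_p = mx.sym.log_softmax(data=fc)               # fc = the network's final fully-connected layer
probs = mx.sym.BlockGrad(log_p, name='softmax')   # expose predictions without gradients flowing back
ce = -mx.sym.sum(mx.sym.broadcast_mul(log_p, label))  # weighted cross-entropy, summed over the batch
loss = mx.sym.MakeLoss(ce, normalization='batch')
net = mx.sym.Group([probs, loss])                 # predictions and loss in one symbol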

Answer


The code below is a modification of the MXNet MNIST tutorial (http://mxnet.io/tutorials/python/mnist.html) in which the standard SoftmaxOutput loss function is rewritten as a custom weighted loss function, together with the custom eval_metric that this requires.

import logging 
logging.getLogger().setLevel(logging.DEBUG) 
import mxnet as mx 
import numpy as np 
mnist = mx.test_utils.get_mnist()  # downloads MNIST and returns it as numpy arrays

batch_size = 100 
# One-hot encode the training labels as a dense (num_examples, num_classes) matrix
weighted_train_labels = np.zeros((mnist['train_label'].shape[0], np.max(mnist['train_label']) + 1))
weighted_train_labels[np.arange(mnist['train_label'].shape[0]), mnist['train_label']] = 1
# the dense label matrix is passed to the iterator under the name 'label'
train_iter = mx.io.NDArrayIter(mnist['train_data'], {'label': weighted_train_labels}, batch_size, shuffle=True)

# Same one-hot encoding for the test labels
weighted_test_labels = np.zeros((mnist['test_label'].shape[0], np.max(mnist['test_label']) + 1))
weighted_test_labels[np.arange(mnist['test_label'].shape[0]), mnist['test_label']] = 1
val_iter = mx.io.NDArrayIter(mnist['test_data'], {'label': weighted_test_labels}, batch_size)
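
# Note: weighted_train_labels above is plain one-hot, so the loss below reduces to
# ordinary cross-entropy. Since the label is a dense matrix, per-class weights can
# be folded straight into it. A hypothetical class_weights vector (my own addition,
# not part of the original answer) would make the loss genuinely weighted, e.g. to
# make digit 9 count double -- apply it before building the iterators above:
# class_weights = np.array([1.0] * 9 + [2.0])
# weighted_train_labels = weighted_train_labels * class_weights  # broadcasts across rows
# weighted_test_labels = weighted_test_labels * class_weights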

data = mx.sym.var('data') 
# first conv layer 
conv1 = mx.sym.Convolution(data=data, kernel=(5,5), num_filter=20) 
tanh1 = mx.sym.Activation(data=conv1, act_type="tanh") 
pool1 = mx.sym.Pooling(data=tanh1, pool_type="max", kernel=(2,2), stride=(2,2)) 
# second conv layer 
conv2 = mx.sym.Convolution(data=pool1, kernel=(5,5), num_filter=50) 
tanh2 = mx.sym.Activation(data=conv2, act_type="tanh") 
pool2 = mx.sym.Pooling(data=tanh2, pool_type="max", kernel=(2,2), stride=(2,2)) 
# first fullc layer 
flatten = mx.sym.flatten(data=pool2) 
fc1 = mx.symbol.FullyConnected(data=flatten, num_hidden=500) 
tanh3 = mx.sym.Activation(data=fc1, act_type="tanh") 
# second fullc 
fc2 = mx.sym.FullyConnected(data=tanh3, num_hidden=10) 
# softmax loss 
#lenet = mx.sym.SoftmaxOutput(data=fc2, name='softmax') 

label = mx.sym.var('label') 
# log-probabilities of each class 
softmax = mx.sym.log_softmax(data=fc2) 
# expose the log-probabilities as an output without backpropagating through this branch 
softmax_output = mx.sym.BlockGrad(data=softmax, name='softmax') 
# cross-entropy weighted by the dense label matrix: inner sum over classes, outer sum over the batch 
ce = -mx.sym.sum(mx.sym.sum(mx.sym.broadcast_mul(softmax, label), 1)) 
lenet = mx.symbol.MakeLoss(ce, normalization='batch') 

# group the prediction output and the loss into a single symbol 
sym = mx.sym.Group([softmax_output, lenet]) 
print sym.list_outputs() 

def custom_metric(label, softmax): 
    # accuracy: arg-max of the predicted distribution vs. arg-max of the one-hot label 
    return len(np.where(np.argmax(softmax, 1) == np.argmax(label, 1))[0]) / float(label.shape[0]) 

# route the blocked-gradient softmax output and the dense label into the metric 
eval_metrics = mx.metric.CustomMetric(custom_metric, name='custom-accuracy', output_names=['softmax_output'], label_names=['label']) 

# create the module (swap in mx.cpu() if no GPU is available) 
lenet_model = mx.mod.Module(symbol=sym, context=mx.gpu(), data_names=['data'], label_names=['label']) 
lenet_model.fit(train_iter, 
       eval_data=val_iter, 
       optimizer='sgd', 
       optimizer_params={'learning_rate':0.1}, 
       eval_metric=eval_metrics,  # the custom metric defined above (alternatives: mx.metric.Loss(), 'acc') 
       #batch_end_callback = mx.callback.Speedometer(batch_size, 100), 
       num_epoch=10)
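
At prediction time the grouped symbol makes the blocked-gradient probability output available as the first module output. A minimal sketch, assuming the lenet_model trained above:

# predict() on a multi-output symbol returns one NDArray per output,
# in Group order: [softmax_output, the MakeLoss output].
outputs = lenet_model.predict(val_iter)
log_probs = outputs[0].asnumpy()           # log-probabilities from log_softmax
predicted = np.argmax(log_probs, axis=1)   # hard class predictions
print predicted[:10]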