2016-12-05 47 views
0

我想建立使用theano了深刻的网络。但准确度为零。我无法弄清楚我的错误。我正在尝试创建一个带有3个隐藏层和一个输出的深度学习网络。我打算做一个分类任务,我有5个班。因此,输出层有5个节点。深层网络产生零精度

有什么建议吗?

#!/usr/bin/env python 

from __future__ import print_function 

import theano 
import theano.tensor as T 
import lasagne 
import numpy as np 
import sklearn.datasets 
import os 
import csv 
import pandas as pd 

# Lasagne is pre-release, so it's interface is changing. 
# Whenever there's a backwards-incompatible change, a warning is raised. 
# Let's ignore these for the course of the tutorial 
import warnings 
warnings.filterwarnings('ignore', module='lasagne') 

from lasagne.objectives import categorical_crossentropy, aggregate 

#load the data and prepare it 
df = pd.read_excel('risk_sample_data_9.20.16_anon.xls',skiprows=0) 
rawdata = df.values 
# remove empty rows (odd rows) 
mask = np.ones(len(rawdata), dtype=bool) 
mask[::2] = False 
data = rawdata[mask] 

idx = np.array([1,5,6,7]) 
m = np.zeros_like(data) 
m[:,idx] = 1 
X = np.ma.masked_array(data,m) 
X = np.ma.filled(X, fill_value=0) 

X = X.astype(theano.config.floatX) 
y = data[:,7] # extract financial rating labels 
# convert char lables into int , A=1 , B=2, C=3, D=4, F=5 
y[y == 'A'] = 1 
y[y == 'B'] = 2 
y[y == 'C'] = 3 
y[y == 'D'] = 4 
y[y == 'F'] = 5 
y = pd.to_numeric(y) 
y = y.astype('int32') 
#y = y.astype(theano.config.floatX) 
N_CLASSES = 5 

# First, construct an input layer. 
# The shape parameter defines the expected input shape, 
# which is just the shape of our data matrix data. 
l_in = lasagne.layers.InputLayer(shape=X.shape) 

# We'll create a network with two dense layers: 
# A tanh hidden layer and a softmax output layer. 
l_hidden1 = lasagne.layers.DenseLayer(
# The first argument is the input layer 
l_in, 
# This defines the layer's output dimensionality 
num_units=250, 
# Various nonlinearities are available 
nonlinearity=lasagne.nonlinearities.rectify) 

l_hidden2 = lasagne.layers.DenseLayer(
# The first argument is the input layer 
l_hidden1, 
# This defines the layer's output dimensionality 
num_units=100, 
# Various nonlinearities are available 
nonlinearity=lasagne.nonlinearities.rectify) 

l_hidden3 = lasagne.layers.DenseLayer(
# The first argument is the input layer 
l_hidden2, 
# This defines the layer's output dimensionality 
num_units=50, 
# Various nonlinearities are available 
nonlinearity=lasagne.nonlinearities.rectify) 

l_hidden4 = lasagne.layers.DenseLayer(
# The first argument is the input layer 
l_hidden3, 
# This defines the layer's output dimensionality 
num_units=10, 
# Various nonlinearities are available 
nonlinearity=lasagne.nonlinearities.sigmoid) 

# For our output layer, we'll use a dense layer with a softmax nonlinearity. 
l_output = lasagne.layers.DenseLayer(
l_hidden4, num_units=N_CLASSES, nonlinearity=lasagne.nonlinearities.softmax) 

net_output = lasagne.layers.get_output(l_output) 

# As a loss function, we'll use Theano's categorical_crossentropy function. 
# This allows for the network output to be class probabilities, 
# but the target output to be class labels. 
true_output = T.ivector('true_output') 

# get_loss computes a Theano expression for the objective, 
# given a target variable 
# By default, it will use the network's InputLayer input_var, 
# which is what we want. 
#loss = objective.get_loss(target=true_output) 

loss = lasagne.objectives.categorical_crossentropy(net_output, true_output) 
loss = aggregate(loss, mode='mean') 

# Retrieving all parameters of the network is done using get_all_params, 
# which recursively collects the parameters of all layers 
# connected to the provided layer. 
all_params = lasagne.layers.get_all_params(l_output) 

# Now, we'll generate updates using Lasagne's SGD function 
updates = lasagne.updates.sgd(loss, all_params, learning_rate=1) 

# Finally, we can compile Theano functions for training and 
# computing the output. 
# Note that because loss depends on the input variable of our input layer, 
# we need to retrieve it and tell Theano to use it. 
train = theano.function([l_in.input_var, true_output], loss,  updates=updates) 
get_output = theano.function([l_in.input_var], net_output) 

def eq(x, y): 
if x==y: 
    return 1 
return 0 

print("Training ...") 
# Train for 100 epochs 
for n in xrange(10): 
train(X, y) 
y_predicted = np.argmax(get_output(X), axis=1) 
correct = reduce(lambda a, b: a+b, map(eq, y_predicted, y)) 
print("Iteration {} correct prediction {}".format(n, correct)) 

# Compute the predicted label of the training data. 
# The argmax converts the class probability output to class label 
y_predicted = np.argmax(get_output(X), axis=1) 

print(y_predicted) 
+0

你吃过看看数据?你能从它得到非零的准确性吗?你看过网络的预测吗?它总是一样吗?没有培训的准确度是多少? –

回答

2

学习率似乎太高。首先尝试较低的学习率。这可能是你的模型在任务上分歧。如果不能在数据上进行尝试,很难说清楚。