
TensorFlow: using the Spearman distance as the objective function

To make my question reproducible, I am working with the iris flower dataset (10 arbitrary rows, all columns standard-normalized, which produced the .csv file below) and a minimal neural network model (created by modifying an MNIST example I found on the internet) that predicts petal width from sepal length, sepal width, and petal length. Scroll down for my question!

iris.csv

"Sepal.Length","Sepal.Width","Petal.Length","Petal.Width","Species" 
0.0551224773430978,-0.380319414627833,-0.335895230408602,-0.548226210538025,"versicolor" 
1.48830688826362,-1.01418510567422,1.37931445678426,0.614677872421422,"virginica" 
0.606347250774068,0.887411967464943,0.450242542888127,0.780807027129915,"virginica" 
-0.606347250774067,-1.64805079672061,0.235841331989019,0.44854871771293,"virginica" 
1.15757202420504,-1.01418510567422,0.950512034986045,0.44854871771293,"virginica" 
-1.92928670700839,0.887411967464943,-2.33697319880027,-2.37564691233144,"setosa" 
0.38585734140168,0.253546276418555,0.307308402288722,1.1130653365469,"virginica" 
-0.826837160146455,0.253546276418555,-0.478829371008007,-0.548226210538025,"versicolor" 
0.0551224773430978,1.52127765851133,-0.192961089809197,-0.21596790112104,"versicolor" 
-0.385857341401679,0.253546276418555,0.021440121089911,0.282419563004437,"virginica" 

nn.py

import pandas as pd 
import numpy as np 
import tensorflow as tf 
import scipy.stats 

# Import iris data 
data = pd.read_csv("iris.csv") 
input = data[["Sepal.Length", "Sepal.Width", "Petal.Length"]] 
target = data[["Petal.Width"]] 

# Parameters 
learning_rate = 0.001 
training_epochs = 6000 

# Network Parameters 
n_hidden_1 = 5 # 1st layer number of features 
n_hidden_2 = 5 # 2nd layer number of features 
n_input = 3 # data input 
n_output = 1 # data output 

# tf Graph input 
x = tf.placeholder("float", [None, n_input]) 
y = tf.placeholder("float", [None, n_output]) 

# Create model 
def multilayer_network(x, weights, biases): 
    # Hidden layer with TanH activation 
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1']) 
    layer_1 = tf.tanh(layer_1) 
    # Hidden layer with TanH activation 
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2']) 
    layer_2 = tf.tanh(layer_2) 
    # Output layer with linear activation 
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out'] 
    return out_layer 

# Store layers weight & bias 
weights = { 
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])), 
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])), 
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_output])) 
} 
biases = { 
    'b1': tf.Variable(tf.random_normal([n_hidden_1])), 
    'b2': tf.Variable(tf.random_normal([n_hidden_2])), 
    'out': tf.Variable(tf.random_normal([n_output])) 
} 

# Construct model 
pred = multilayer_network(x, weights, biases) 

# Define loss and optimizer 
cost = tf.reduce_mean(tf.square(pred-y)) 
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) 

# Initializing the variables 
init = tf.initialize_all_variables() 

# Launch the graph 
with tf.Session() as sess: 
    sess.run(init) 

    # Training cycle
    for epoch in range(training_epochs):
        # Run optimization op (backprop) and cost op (to get loss value)
        _, c = sess.run([optimizer, cost], feed_dict={x: input, y: target})

        # Display logs per epoch step
        if epoch % 1000 == 0:
            print "Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c)

    print "Optimization Finished!"

Here is the result of an example training session:

$ python nn.py 
Epoch: 0001 cost= 3.000185966 
Epoch: 1001 cost= 0.031734336 
Epoch: 2001 cost= 0.000614795 
Epoch: 3001 cost= 0.000008422 
Epoch: 4001 cost= 0.000000057 
Epoch: 5001 cost= 0.000000000 
Optimization Finished! 

My idea is to use the Spearman distance, which I recently learned about, in place of the mean squared error as my objective function. It is defined as follows:

$$d_{\mathrm{Spearman}}(x, y) = \sum_{i=1}^{n} \left( \operatorname{rank}(x_i) - \operatorname{rank}(y_i) \right)^2$$
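For concreteness, here is a minimal NumPy/SciPy sketch of this definition (my own illustration, mirroring the rank-based squared error I attempt below):

import numpy as np
import scipy.stats

def spearman_distance(x, y):
    # Rank both vectors, then sum the squared rank differences
    rx = scipy.stats.rankdata(x, method="min")
    ry = scipy.stats.rankdata(y, method="min")
    return np.sum((rx - ry) ** 2)

(Note that my TensorFlow code below takes the mean of the squared rank differences rather than the sum, which only rescales the loss.)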

I wrote a rank function that returns the rank vector:

import scipy.stats 

def rank(vector): 
    return scipy.stats.rankdata(vector, method="min") 
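
For example, with method="min", tied values share the lowest rank:

>>> rank([0.3, -1.2, 0.7, 0.3])
array([2, 1, 4, 2])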

Using TensorFlow's py_func method, I define my cost tensor as follows:

pred = tf.to_float(tf.py_func(rank, [pred], [tf.int64])[0]) 
y = tf.to_float(tf.py_func(rank, [y], [tf.int64])[0]) 

cost = tf.reduce_mean(tf.square(y-pred)) 

However, this gives me the error

ValueError: No gradients provided for any variable: ((None, <tensorflow.python.ops.variables.Variable object at 0x7f67ffe4ee90>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed3c4990>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed357310>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed357190>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed380350>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed3801d0>)) 

I do not understand what the underlying problem is. Any direction you could give me would be greatly appreciated!


Since this looks like an implementation question, I think you will have better luck on StackOverflow. I think it is going to be difficult to find a "path" for the gradient through your loss function. –


It looks like you are using 'tf.to_float' to convert 'y' and 'pred' from tensorflow variables into floats, at which point they are no longer tensorflow objects; the cost is then just a numeric value as well. But even if you fixed that, the rank transformation is not differentiable, so you would still run into problems trying to train with any gradient-based method. – user20160

Answer


Your error comes from the fact that tf.py_func has no defined gradient.

In any case, as @user20160 said in the comments, the rank operation has no gradient even in principle, so this is not a loss you can directly train an algorithm with.
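You can check this directly by asking TensorFlow for the gradient of the ranked tensor; here is a minimal sketch along the lines of your code (the toy variable x is my own):

import tensorflow as tf
import scipy.stats

def rank(vector):
    return scipy.stats.rankdata(vector, method="min")

x = tf.Variable([3.0, 1.0, 2.0])
ranked = tf.to_float(tf.py_func(rank, [x], [tf.int64])[0])

# No gradient is registered for the py_func op, so this prints [None]
print tf.gradients(ranked, [x])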


I see. However, I have read in a book on machine learning that "common choices for the loss function are [...], as well as the sum of squared distances." How can that be, then? –


You can use it as a loss to see how bad/good your model is, but you cannot train with it (since there is no gradient). There are ways to train on non-differentiable losses, reinforcement-learning style, but that is very involved. –


That makes a lot of sense! If you happen to know any, could you point me to some resources for self-study on reinforcement learning? –