Image classification in Caffe always returns the same class

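I have a problem with image classification in Caffe. I use the ImageNet model (from the Caffe tutorial) to classify data I created myself, but I always get the same classification result (the same class, i.e. class 3). This is how I proceed: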

I am using Windows, with Python as the Caffe interface.

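(1) I collect the data. My sample images (training and testing) are 5x5x3 (RGB) uint8 images, so their pixel values range from 0 to 255.
(2) I resize them to the size ImageNet requires: 256x256x3. For this I use the resize function in MATLAB (nearest-neighbor interpolation).
(3) I create a LevelDB and an image mean.
(4) I train my network (3000 iterations). The only parameters I changed in the ImageNet definition are the paths to the mean image and to the LevelDBs.

The results I get: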

I0428 12:38:04.350100 3236 solver.cpp:245]  Train net output #0: loss = 1.91102 (* 1 = 1.91102 loss) 
I0428 12:38:04.350100 3236 sgd_solver.cpp:106] Iteration 2900, lr = 0.0001 
I0428 12:38:30.353361 3236 solver.cpp:229] Iteration 2920, loss = 2.18008 
I0428 12:38:30.353361 3236 solver.cpp:245]  Train net output #0: loss = 2.18008 (* 1 = 2.18008 loss) 
I0428 12:38:30.353361 3236 sgd_solver.cpp:106] Iteration 2920, lr = 0.0001 
I0428 12:38:56.351630 3236 solver.cpp:229] Iteration 2940, loss = 1.90925 
I0428 12:38:56.351630 3236 solver.cpp:245]  Train net output #0: loss = 1.90925 (* 1 = 1.90925 loss) 
I0428 12:38:56.351630 3236 sgd_solver.cpp:106] Iteration 2940, lr = 0.0001 
I0428 12:39:22.341891 3236 solver.cpp:229] Iteration 2960, loss = 1.98917 
I0428 12:39:22.341891 3236 solver.cpp:245]  Train net output #0: loss = 1.98917 (* 1 = 1.98917 loss) 
I0428 12:39:22.341891 3236 sgd_solver.cpp:106] Iteration 2960, lr = 0.0001 
I0428 12:39:48.334151 3236 solver.cpp:229] Iteration 2980, loss = 2.45919 
I0428 12:39:48.334151 3236 solver.cpp:245]  Train net output #0: loss = 2.45919 (* 1 = 2.45919 loss) 
I0428 12:39:48.334151 3236 sgd_solver.cpp:106] Iteration 2980, lr = 0.0001 
I0428 12:40:13.040398 3236 solver.cpp:456] Snapshotting to binary proto file Z:/DeepLearning/S1S2/Stockholm/models_iter_3000.caffemodel 
I0428 12:40:15.080418 3236 sgd_solver.cpp:273] Snapshotting solver state to binary proto file Z:/DeepLearning/S1S2/Stockholm/models_iter_3000.solverstate 
I0428 12:40:15.820426 3236 solver.cpp:318] Iteration 3000, loss = 2.08741 
I0428 12:40:15.820426 3236 solver.cpp:338] Iteration 3000, Testing net (#0) 
I0428 12:41:50.398375 3236 solver.cpp:406]  Test net output #0: accuracy = 0.11914 
I0428 12:41:50.398375 3236 solver.cpp:406]  Test net output #1: loss = 2.71476 (* 1 = 2.71476 loss) 
I0428 12:41:50.398375 3236 solver.cpp:323] Optimization Done. 
I0428 12:41:50.398375 3236 caffe.cpp:222] Optimization Done.
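
One way to read the log above: a K-class softmax classifier that has learned nothing and outputs a uniform distribution has a cross-entropy loss of ln(K), and uniform random guessing is right about 1/K of the time. The sketch below compares the logged values with those reference points. It assumes K = 8, the class count the asker gives in a later comment, and `chance_level` is a name made up for this illustration.

```python
import math

def chance_level(num_classes):
    """Return (loss, accuracy) for a classifier that guesses uniformly at random."""
    uniform_loss = math.log(num_classes)   # cross-entropy of a uniform softmax output
    uniform_acc = 1.0 / num_classes        # hit rate of uniform random guessing
    return uniform_loss, uniform_acc

# Values copied from the training log above (iteration 3000).
logged_train_loss = 2.08741
logged_test_acc = 0.11914

# Assumption: 8 classes, as stated later in the thread.
ref_loss, ref_acc = chance_level(8)

print("uniform-guess loss: %.4f, logged train loss: %.4f" % (ref_loss, logged_train_loss))
print("uniform-guess accuracy: %.4f, logged test accuracy: %.4f" % (ref_acc, logged_test_acc))
```

If the logged numbers sit this close to the uniform-guess values, that suggests the weights have barely moved away from an uninformative solution, which would fit with every image receiving the same label.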

(5) I run the following Python code to classify a single image:

# set up Python environment: numpy for numerical routines, and matplotlib for plotting 
import numpy as np 
import matplotlib.pyplot as plt 
# display plots in this notebook 


# set display defaults 
plt.rcParams['figure.figsize'] = (10, 10)  # large images 
plt.rcParams['image.interpolation'] = 'nearest' # don't interpolate: show square pixels 
plt.rcParams['image.cmap'] = 'gray' # use grayscale output rather than a (potentially misleading) color heatmap 

# The caffe module needs to be on the Python path; 
# we'll add it here explicitly. 
import sys 
caffe_root = '../' # this file should be run from {caffe_root}/examples (otherwise change this line) 
sys.path.insert(0, caffe_root + 'python') 

import caffe 
# If you get "No module named _caffe", either you have not built pycaffe or you have the wrong path. 


caffe.set_mode_cpu() 

model_def = 'C:/Caffe/caffe-windows-master/models/bvlc_reference_caffenet/deploy.prototxt' 
model_weights = 'Z:/DeepLearning/S1S2/Stockholm/models_iter_3000.caffemodel' 

net = caffe.Net(model_def,  # defines the structure of the model 
       model_weights, # contains the trained weights 
       caffe.TEST)  # use test mode (e.g., don't perform dropout) 

#load mean image file and convert it to a .npy file-------------------------------- 
blob = caffe.proto.caffe_pb2.BlobProto() 
data = open('Z:/DeepLearning/S1S2/Stockholm/S1S2train256.binaryproto',"rb").read() 
blob.ParseFromString(data) 
nparray = caffe.io.blobproto_to_array(blob) 
f = file('Z:/DeepLearning/PythonCalssification/imgmean.npy',"wb") 
np.save(f,nparray) 

f.close() 


# load the dataset mean image (converted to .npy above) for subtraction 
mu1 = np.load('Z:/DeepLearning/PythonCalssification/imgmean.npy') 
mu1 = mu1.squeeze() 
mu = mu1.mean(1).mean(1) # average over pixels to obtain the mean (BGR) pixel values 
print 'mean-subtracted values:', zip('BGR', mu) 
print 'mean shape: ',mu1.shape 
print 'data shape: ',net.blobs['data'].data.shape 

# create transformer for the input called 'data' 
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape}) 

transformer.set_transpose('data', (2,0,1)) # move image channels to outermost dimension 
transformer.set_mean('data', mu)   # subtract the dataset-mean value in each channel 
transformer.set_raw_scale('data', 255)  # rescale from [0, 1] to [0, 255] 
transformer.set_channel_swap('data', (2,1,0)) # swap channels from RGB to BGR 

# set the size of the input (we can skip this if we're happy 
# with the default; we can also change it later, e.g., for different batch sizes) 
net.blobs['data'].reshape(50,  # batch size 
          3,   # 3-channel (BGR) images 
          227, 227) # image size is 227x227 

#load image 
image = caffe.io.load_image('Z:/DeepLearning/PythonCalssification/380.tiff') 
transformed_image = transformer.preprocess('data', image) 
#plt.imshow(image) 

# copy the image data into the memory allocated for the net 
net.blobs['data'].data[...] = transformed_image 

### perform classification 
output = net.forward() 

output_prob = output['prob'][0] # the output probability vector for the first image in the batch 

print 'predicted class is:', output_prob.argmax()
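
Printing only `argmax()` hides how confident the net is. A small helper that lists the top few classes with their probabilities makes it easier to see whether the output vector is nearly identical for every input, which would point at training rather than at preprocessing. This is only a sketch: `top_k` is a hypothetical helper, and the probability vector below is invented stand-in data, not output from the model in the question.

```python
import numpy as np

def top_k(probs, k=3):
    """Return the k most probable (class_index, probability) pairs, best first."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1][:k]      # indices sorted by descending probability
    return [(int(i), float(probs[i])) for i in order]

# Stand-in for output['prob'][0]; these numbers are invented for illustration.
example_prob = np.array([0.10, 0.14, 0.11, 0.30, 0.05, 0.07, 0.13, 0.10])

for class_index, p in top_k(example_prob, k=3):
    print("class %d: %.2f" % (class_index, p))
```

In the script above this would be called as `top_k(output['prob'][0])` for several different input images, to compare the vectors side by side.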

No matter which input image I use, I always get class "3" as the classification result. Here is a sample image that I train/classify with:

If anyone has an idea what is going wrong, I would be very glad. Thanks in advance!

Source

2016-04-28 Mr M

How much data did you use? How many classes are there, and how many examples per class?

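If you always get the same class, it means the neural network has not been trained properly.

Make sure the training set is balanced. When a classifier always predicts the same class, it is usually because one class is over-represented relative to the others. For example, suppose you have two classes, the first represented by 95 instances and the second by 5. If the classifier assigns everything to the first class, it is already 95% accurate.
One obvious issue is that you should normalize the input as image/255.0 - 0.5, which centers the input and reduces the standard deviation.
After that, make sure your training set contains at least 4 times as many samples as there are weights in the NN.
Last but not least, make sure the training set has been properly shuffled.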
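
The normalization and shuffling suggestions can be sketched with NumPy as follows. This is an illustration on random stand-in data, not the asker's pipeline; the array names and the fixed seed are assumptions. The script in the question subtracts a per-channel mean on the 0-255 scale instead of using image/255.0 - 0.5, so whichever scheme is chosen would need to be applied identically at training and at classification time.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the sketch is repeatable

# Stand-in data: 10 RGB images of 5x5 pixels (uint8) with labels 0..7.
images = rng.randint(0, 256, size=(10, 5, 5, 3)).astype(np.uint8)
labels = rng.randint(0, 8, size=10)

# Normalize as suggested: map [0, 255] to [-0.5, 0.5].
normalized = images.astype(np.float32) / 255.0 - 0.5

# Shuffle images and labels with the same permutation so pairs stay aligned.
perm = rng.permutation(len(images))
shuffled_images = normalized[perm]
shuffled_labels = labels[perm]

print("value range:", normalized.min(), normalized.max())
print("label order after shuffling:", shuffled_labels)
```

Using one shared permutation is the important part: shuffling images and labels independently would silently destroy the pairing.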

Source

2016-04-28 21:03:04 FiReTiTi

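I will try to work through your suggestions step by step: 1) I have 8 classes. They are represented by the following sample counts: class 1: 918, class 2: 897, class 3: 922, class 4: 799, class 5: 69, class 6: 277, class 7: 718, class 8: 691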
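
Using the per-class counts given in the comment above, the class shares and the accuracy of always predicting the largest class can be worked out directly. The inverse-frequency weights at the end are one common heuristic for imbalanced data; they are an addition for illustration, not something proposed in the thread.

```python
# Per-class sample counts as listed in the comment above.
counts = {1: 918, 2: 897, 3: 922, 4: 799, 5: 69, 6: 277, 7: 718, 8: 691}

total = sum(counts.values())
largest_class = max(counts, key=counts.get)

# Accuracy of a classifier that always outputs the largest class.
majority_baseline = counts[largest_class] / float(total)

# Inverse-frequency weights (a common heuristic, shown here as an assumption):
# a class with the average count gets weight 1.0, rarer classes get more.
mean_count = total / float(len(counts))
weights = {c: mean_count / n for c, n in counts.items()}

for c in sorted(counts):
    share = counts[c] / float(total)
    print("class %d: %4d samples, share %.3f, weight %.2f" % (c, counts[c], share, weights[c]))
print("largest class: %d, always-predict-it accuracy: %.3f" % (largest_class, majority_baseline))
```

Class 3 is the largest class by a small margin. A net that has learned only the label frequencies would tend to pick the most frequent label, although whether index 3 from `argmax()` maps to "class 3" depends on how the labels were numbered in the LevelDB.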

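2) As far as I understand, ImageNet requires image normalization using the image/pixel mean. Therefore, the following steps are performed in the Python code above: transformer.set_transpose('data', (2,0,1)) # move image channels to outermost dimension; transformer.set_mean('data', mu) # subtract the dataset-mean value in each channel; transformer.set_raw_scale('data', 255) # rescale from [0, 1] to [0, 255]; transformer.set_channel_swap('data', (2,1,0)) # swap channels from RGB to BGR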

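In this step, the image mean is subtracted, the image is scaled to 0-255, the channels are swapped because they are loaded in reverse order, and finally a transpose is performed (I am not 100% sure why that is needed, though).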
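
On why the transpose is needed: `caffe.io.load_image` returns an array laid out as height x width x channels with values in [0, 1], while a Caffe data blob is laid out as channels x height x width. As far as I can tell from `caffe/io.py`, `Transformer.preprocess` applies its steps in a fixed order regardless of the order of the `set_*` calls: resize, transpose, channel swap, raw scale, then mean subtraction. The NumPy sketch below mimics those steps (without the resize) on random stand-in data. `preprocess_like_caffe` and the mean values are hypothetical, and this is not the library's implementation.

```python
import numpy as np

def preprocess_like_caffe(image_hwc, mean_bgr, raw_scale=255.0):
    """Mimic the Transformer steps: transpose, channel swap, raw scale, mean subtraction."""
    out = image_hwc.astype(np.float32)
    out = out.transpose(2, 0, 1)          # H x W x C  ->  C x H x W
    out = out[[2, 1, 0], :, :]            # RGB -> BGR along the channel axis
    out = out * raw_scale                 # [0, 1] -> [0, 255]
    out = out - np.asarray(mean_bgr, dtype=np.float32).reshape(3, 1, 1)  # per-channel mean
    return out

rng = np.random.RandomState(0)
image = rng.rand(227, 227, 3)             # stand-in for caffe.io.load_image(...) output
mean_bgr = [104.0, 117.0, 123.0]          # illustrative per-channel means, not the asker's

blob = preprocess_like_caffe(image, mean_bgr)
print("input shape:", image.shape, "-> blob shape:", blob.shape)
```

The mean is subtracted after scaling to 0-255 and after the swap to BGR, so a per-channel mean has to be given in BGR order and on the 0-255 scale.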
