Scikit-learn: multiple targets

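I followed this example to create an image classifier with scikit-learn: http://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html Everything works when each image belongs to a single category, but each image may belong to several categories, for example: a photo of a dog by day, a picture of a cat at night, a photo of a cat and a dog at night, etc. I wrote: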

target=[[0,1],[0,2],[1,2],[0,2,3]] 
target = MultiLabelBinarizer().fit_transform(target) 

classifier = svm.SVC(gamma=0.001) 
classifier.fit(data, target)

but I get this error:

Traceback (most recent call last): 
    File "test.py", line 49, in <module> 
    classifier.fit(data, target) 
    File "/home/mezzo/.local/lib/python2.7/site-packages/sklearn/svm/base.py", line 151, in fit 
    y = self._validate_targets(y) 
    File "/home/mezzo/.local/lib/python2.7/site-packages/sklearn/svm/base.py", line 514, in _validate_targets 
    y_ = column_or_1d(y, warn=True) 
    File "/home/mezzo/.local/lib/python2.7/site-packages/sklearn/utils/validation.py", line 551, in column_or_1d 
    raise ValueError("bad input shape {0}".format(shape)) 
ValueError: bad input shape (4, 4)

Full code

import numpy as np
import PIL
from PIL import Image
import matplotlib.image as mpimg
from sklearn import datasets, svm, metrics
from sklearn.preprocessing import MultiLabelBinarizer

# The digits dataset 
digits = datasets.load_digits() 

def normalize(old_im): 
    base = 400 

    # Scale the longer side to `base` pixels, preserving the aspect ratio.
    if old_im.size[0] > old_im.size[1]:
        wpercent = base / float(old_im.size[0])
        hsize = int(float(old_im.size[1]) * wpercent)
        old_im = old_im.resize((base, hsize), PIL.Image.ANTIALIAS)
    else:
        wpercent = base / float(old_im.size[1])
        wsize = int(float(old_im.size[0]) * wpercent)
        old_im = old_im.resize((wsize, base), PIL.Image.ANTIALIAS)

    old_size = old_im.size

    # Paste the resized image centred on a black base x base canvas.
    new_size = (base, base)
    new_im = Image.new("RGB", new_size)
    # Integer division: paste() needs integer offsets on Python 3 as well.
    new_im.paste(old_im, ((new_size[0] - old_size[0]) // 2,
                          (new_size[1] - old_size[1]) // 2))

    #new_im.show() 
    new_im.save('prov.jpg') 
    return mpimg.imread('prov.jpg') 

# To apply a classifier on this data, we need to flatten the image, to 
# turn the data in a (samples, feature) matrix: 
imgs = np.array([normalize(Image.open('/home/mezzo/Immagini/1.jpg')),normalize(Image.open('/home/mezzo/Immagini/2.jpg')),normalize(Image.open('/home/mezzo/Immagini/3.jpg')),normalize(Image.open('/home/mezzo/Immagini/4.jpg'))]) 
n_samples = len(imgs) 
data = imgs.reshape((n_samples, -1)) 

target=[[0,1],[0,2],[1,2],[0,2,3]] 
target = MultiLabelBinarizer().fit_transform(target) 

# Create a classifier: a support vector classifier 
classifier = svm.SVC(gamma=0.001) 

# We learn the digits on the first half of the digits 
classifier.fit(data, target) 

# Now predict the value of the digit on the second half: 
predicted = classifier.predict(data) 

print("Classification report for classifier %s:\n%s\n" 
     % (classifier, metrics.classification_report(target, predicted))) 
print("Confusion matrix:\n%s" % metrics.confusion_matrix(target, predicted))


2016-02-23 michelle.70

scikit-learn's SVM implementation does not natively support multilabel classification, although it has various other classifiers that do:

Support multilabel: Decision Trees, Random Forests, Nearest Neighbors, Ridge Regression.
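
As a rough illustration of the classifiers listed above, the sketch below fits a RandomForestClassifier directly on a binary indicator matrix. The feature matrix is random stand-in data (an assumption, since the question's images are not available), and the names X, Y, and pred_sets are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Stand-in features: 4 samples, 10 features (assumed data, not the question's images).
rng = np.random.RandomState(0)
X = rng.rand(4, 10)

# Label sets from the question, one list per sample.
raw_target = [[0, 1], [0, 2], [1, 2], [0, 2, 3]]
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(raw_target)  # indicator matrix, one column per label

# Random forests accept a 2-D indicator matrix as the target.
clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(X, Y)

pred = clf.predict(X)  # one 0/1 column per label
# Convert indicator rows back to tuples of label values.
pred_sets = mlb.inverse_transform(pred.astype(int))
print(pred.shape)
print(pred_sets)
```

Here `inverse_transform` turns each predicted indicator row back into the original label values, which is usually the form you want to report.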

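It is also possible to do multilabel classification with an SVM by treating each unique combination of labels as a separate class. You can simply replace each unique row in the target matrix with an integer label, which can be done efficiently using np.unique: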

d = np.dtype((np.void, target.dtype.itemsize * target.shape[1])) 
_, ulabels = np.unique(np.ascontiguousarray(target).view(d), return_inverse=True)

Then you can train the SVM as you would for a single-label classification problem:

clf = svm.SVC() 
clf.fit(data, ulabels)
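
The same relabelling can be sketched end to end as follows. It assumes NumPy 1.13 or later, where np.unique accepts axis=0, so the void-dtype view is not needed. It starts from the MultiLabelBinarizer output because np.unique needs a rectangular 2-D array rather than the ragged list of label lists. The feature matrix is random stand-in data, and the names combos, pred_rows, and pred_sets are hypothetical.

```python
import numpy as np
from sklearn import svm
from sklearn.preprocessing import MultiLabelBinarizer

raw_target = [[0, 1], [0, 2], [1, 2], [0, 2, 3]]
mlb = MultiLabelBinarizer()
target = mlb.fit_transform(raw_target)  # 2-D indicator matrix, shape (4, 4)

# One integer class per distinct row, i.e. per distinct label combination.
combos, ulabels = np.unique(target, axis=0, return_inverse=True)
ulabels = ulabels.ravel()  # make sure the class vector is 1-D

# Stand-in features (assumed data, not the question's images).
rng = np.random.RandomState(0)
data = rng.rand(len(raw_target), 10)

clf = svm.SVC(gamma=0.001)
clf.fit(data, ulabels)

# Map predicted classes back to indicator rows, then to label tuples.
pred_rows = combos[clf.predict(data)]
pred_sets = mlb.inverse_transform(pred_rows)
print(ulabels.shape)
print(pred_sets)
```

Keeping `combos` around is what makes the predictions interpretable: the SVM only ever sees the integer classes, and `combos` is the lookup table back to label sets.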

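One potential caveat is that if you don't have a large number of training examples, your classifier's performance may be poor for rare label combinations.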
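
If rare label combinations are a concern, one alternative (a different technique from the one described in this answer) is to train one binary SVM per label with scikit-learn's OneVsRestClassifier, which accepts an indicator matrix as the target. A minimal sketch on random stand-in data, with hypothetical variable names:

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import SVC

raw_target = [[0, 1], [0, 2], [1, 2], [0, 2, 3]]
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(raw_target)  # indicator matrix, one column per label

# Stand-in features (assumed data, not the question's images).
rng = np.random.RandomState(0)
X = rng.rand(len(raw_target), 10)

# One independent binary SVC is fitted for each label column.
ovr = OneVsRestClassifier(SVC(gamma=0.001))
ovr.fit(X, Y)

pred = ovr.predict(X)  # indicator matrix with the same shape as Y
print(pred.shape)
```

Each label gets its own classifier, so a combination that never appeared in training can still be predicted, at the cost of ignoring correlations between labels.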


2016-02-23 21:14:18

Sorry for the late reply, but I get this error when I use your code: AttributeError: 'list' object has no attribute 'dtype'

Thanks for your answer. Now I see: IndexError: tuple index out of range

You can reproduce the error by creating a .py file with these lines:

target = [[0,1],[0,2],[1,2],[0,2,3]]
target = np.array(target)
d = np.dtype((np.void, target.dtype.itemsize * target.shape[1]))
_, ulabels = np.unique(np.ascontiguousarray(target).view(d), return_inverse=True)

This is because your target is:

array([[1, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 1, 0],
       [1, 0, 1, 1]])

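Your target must have shape (M,), where M is the number of instances. One way to deal with this is to convert each binary row into a single integer label, like this: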

for item in target: 
    print(sum(1<<i for i, b in enumerate(item) if b))

The output should be:

3
5
6
13

Now you can use [3,5,6,13] as your target.
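
The loop above can also be written with NumPy, together with the inverse mapping needed to turn a predicted integer back into an indicator row. This is only a sketch; the helper names encode_rows and decode_labels are hypothetical.

```python
import numpy as np

def encode_rows(indicator):
    """Pack each 0/1 row into one integer; column i contributes bit i."""
    indicator = np.asarray(indicator)
    weights = 1 << np.arange(indicator.shape[1])  # 1, 2, 4, 8, ...
    return indicator.dot(weights)

def decode_labels(labels, n_labels):
    """Unpack integers back into 0/1 rows with n_labels columns."""
    labels = np.asarray(labels)
    return (labels[:, None] >> np.arange(n_labels)) & 1

target = np.array([[1, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 1, 0],
                   [1, 0, 1, 1]])

codes = encode_rows(target)
print(codes.tolist())  # [3, 5, 6, 13]

# Decoding recovers the original indicator matrix.
restored = decode_labels(codes, target.shape[1])
```

After training on `codes`, passing the classifier's predictions through `decode_labels` gives back one 0/1 column per label.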


2016-02-23 21:21:43 Farseer

You can create a new label for each possible subset of categories, as in the example. – Farseer
