2016-12-06

Sklearn SVM vs. MATLAB SVM

Problem: I need to train a classifier (in MATLAB) to classify multiple levels of signal noise.

So I trained a multiclass SVM in MATLAB using fitcecoc and got 92% accuracy.

Then I trained a multiclass SVM in Python using sklearn.svm.SVC, but however I fiddle with the parameters, I cannot get above 69% accuracy.

30% of the data was held out and used to validate the training. The confusion matrices are shown below.

Matlab confusion matrix

Python confusion matrix
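
To compare the two models on equal footing, the Python confusion matrix can be computed directly from the held-out predictions. The following is a minimal sketch, assuming integer class labels 0 to 4; the two arrays are made-up stand-ins for the true test labels and the SVC predictions, not the data from this question.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical stand-ins for the held-out labels and the model's predictions
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4])
y_pred = np.array([0, 1, 1, 1, 2, 3, 3, 3, 4, 3])

labels = [0, 1, 2, 3, 4]
# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred, labels=labels)

# Fraction of each true class that was predicted correctly
per_class_recall = cm.diagonal() / cm.sum(axis=1)
# Overall accuracy is the diagonal mass over the total count
overall_accuracy = cm.trace() / cm.sum()

print(cm)
print(per_class_recall)
print(overall_accuracy)
```

Reading the off-diagonal entries shows whether errors fall mostly on neighbouring noise levels or are spread across all classes.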

So if anyone has experience with multiclass training using svm.SVC and can spot a problem in my code, or has any suggestions, it would be greatly appreciated.

Python code:

import numpy as np 
from sklearn import svm 
from sklearn.model_selection import cross_val_score 
from sklearn.model_selection import train_test_split 
#from sklearn import preprocessing 

#### SET fitting parameters here 
C = 100 
gamma = 1e-8 

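#### SET WEIGHTS HERE
# Note: SVC multiplies C by each class weight, so a weight of 1*C
# makes the effective penalty for that class C*C, not C.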
C0_Weight = 1*C 
C1_weight = 1*C 
C2_weight = 1*C 
C3_weight = 1*C 
C4_weight = 1*C 
##### 


X = np.genfromtxt('data/features.csv', delimiter=',') 
Y = np.genfromtxt('data/targets.csv', delimiter=',') 

print('feature data is of size: ' + str(X.shape))
print('target data is of size: ' + str(Y.shape))

# SPLIT X AND Y INTO TRAINING AND TEST SET 
test_size = 0.3 
X_train, x_test, Y_train, y_test = train_test_split(
    X, Y, test_size=test_size, random_state=0)

svc = svm.SVC(C=C, kernel='rbf', gamma=gamma,
              class_weight={0: C0_Weight, 1: C1_weight, 2: C2_weight,
                            3: C3_weight, 4: C4_weight},
              cache_size=1000)

svc.fit(X_train, Y_train) 
scores = cross_val_score(svc, X_train, Y_train, cv=10) 
print(scores)
print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2)) 

Out = svc.predict(x_test) 

np.savetxt("data/testPredictions.csv", Out, delimiter=",") 
np.savetxt("data/testTargets.csv", y_test, delimiter=",") 

# calculate accuracy on the test data
Hits = 0
HitsOverlap = 0  # also counts predictions that are off by one class
for idx, val in enumerate(Out):
    Hits += int(y_test[idx] == val)
    HitsOverlap += (int(y_test[idx] == val - 1) + int(y_test[idx] == val)
                    + int(y_test[idx] == val + 1))

# divide by the actual number of test samples rather than a hard-coded count
print("Accuracy in testset: %.2f" % (Hits * 100.0 / len(y_test)))
print("Accuracy in testset w. overlap: %.2f" % (HitsOverlap * 100.0 / len(y_test)))
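
The two accuracy figures above can also be computed without an explicit loop. This is a minimal sketch, assuming the class labels are ordered integer codes so that "off by one" is meaningful; the helper name `accuracy_within` and the example arrays are hypothetical and not part of the original code.

```python
import numpy as np

def accuracy_within(y_true, y_pred, tolerance=0):
    """Percentage of predictions within `tolerance` classes of the true label."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Boolean mask of hits; its mean is the hit rate
    return 100.0 * np.mean(np.abs(y_true - y_pred) <= tolerance)

# Made-up labels for illustration only
y_true = np.array([0, 1, 2, 3, 4, 2])
y_pred = np.array([0, 2, 2, 1, 4, 3])

exact = accuracy_within(y_true, y_pred)                 # exact matches only
overlap = accuracy_within(y_true, y_pred, tolerance=1)  # exact or neighbouring class
print(exact, overlap)
```

Because the denominator is the length of the arrays themselves, the result does not depend on a hard-coded sample count.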

For those curious how I got the parameters: they were found with GridSearchCV (which increased the accuracy from 40% to 69%).
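
One thing that may be worth trying is to run the grid search over a pipeline that standardizes the features before the SVM, and to include a linear kernel alongside the RBF kernel. A gamma as small as 1e-8 can be a sign that the features span a large, unscaled range, though the question does not say so, and the fitcecoc call is not shown, so whether the MATLAB model used an RBF kernel is unknown. The sketch below uses synthetic data from `make_classification` as a stand-in for the CSV files, and the grid values are illustrative assumptions, not recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic 5-class data standing in for features.csv / targets.csv
X, y = make_classification(n_samples=300, n_features=10, n_informative=6,
                           n_classes=5, n_clusters_per_class=1, random_state=0)

# Stratified 70/30 split so every class appears in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Scaling lives inside the pipeline, so it is refit on each CV training fold
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("svc", SVC()),
])

# Illustrative grid: a linear kernel and an RBF kernel with a few gammas
param_grid = [
    {"svc__kernel": ["linear"], "svc__C": [0.1, 1, 10]},
    {"svc__kernel": ["rbf"], "svc__C": [0.1, 1, 10],
     "svc__gamma": [0.001, 0.01, 0.1]},
]

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print(search.best_params_)
print(test_accuracy)
```

Keeping the scaler inside the pipeline avoids leaking statistics from the validation folds into training, which would otherwise make the cross-validation scores look better than the held-out accuracy.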

Any help or advice is greatly appreciated.

Answers