
This is actually a duplicate of this question. However, I want to ask a very specific question about plotting the decision boundary line based on the perceptron coefficients I obtained from a preliminary "manual" coding experiment. As you can see, a nice decision boundary line can be drawn from the coefficients extracted from the regression results:

[Figure: scatter plot of the data with the decision boundary line from the glm() coefficients]

Based on the glm() results:

(Intercept)    test1    test2 
   1.718449 4.012903 3.743903 

the coefficients from the perceptron experiment are completely different:

    bias     test1     test2 
9.131054 19.095881 20.736352 

To make answering easier, here is the data, and here is the code:

# DATA PRE-PROCESSING: 
dat = read.csv("perceptron.txt", header=F) 
dat[,1:2] = apply(dat[,1:2], MARGIN = 2, FUN = function(x) scale(x)) # scaling the data 
data = data.frame(rep(1,nrow(dat)), dat) # introducing the "bias" column 
colnames(data) = c("bias","test1","test2","y") 
data$y[data$y==0] = -1 # Turning 0/1 dependent variable into -1/1. 
data = as.matrix(data) # Turning data.frame into matrix to avoid matrix-multiplication problems. 

# PERCEPTRON: 
set.seed(62416) 
no.iter = 1000       # Number of loops 
theta = rnorm(ncol(data) - 1)   # Starting a random vector of coefficients. 
theta = theta/sqrt(sum(theta^2))   # Normalizing the vector. 
h = theta %*% t(data[,1:3])    # Performing the first f(theta^T X) 

for (i in 1:no.iter){     # We will recalculate 1,000 times. 
    for (j in 1:nrow(data)){    # Each time we go through each example. 
     if (h[j] * data[j, 4] < 0){   # If the hypothesis disagrees with the sign of y, 
      theta = theta + (sign(data[j,4]) * data[j, 1:3]) # we add or subtract the example from theta; 
     } else { 
      theta = theta      # else we leave theta as it is. 
     } 
    } 
    h = theta %*% t(data[,1:3])   # Recalculating h() after each full pass. 
} 
theta         # Final coefficients 
mean(sign(h) == data[,4])    # Accuracy 
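
One subtlety in the loop above: h is recomputed only after each full pass over the data, so every comparison within a pass uses scores from the theta at the start of that pass. For contrast, here is a sketch of a per-example (online) variant; theta2 and h.j are illustrative names, not from the original post:

# Online variant (a sketch, not the poster's code): the score is recomputed 
# from the current theta2 just before each potential update. 
theta2 = rnorm(ncol(data) - 1) 
theta2 = theta2/sqrt(sum(theta2^2)) 
for (i in 1:no.iter){ 
    for (j in 1:nrow(data)){ 
     h.j = sum(theta2 * data[j, 1:3])    # current score for example j 
     if (h.j * data[j, 4] < 0){ 
      theta2 = theta2 + (sign(data[j, 4]) * data[j, 1:3]) 
     } 
    } 
} 
mean(sign(data[, 1:3] %*% theta2) == data[, 4]) # accuracy of the online variant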

Question: how can the boundary line be plotted (as I did above with the regression coefficients) if we only have the perceptron coefficients?

Answers


Well... it turns out this is exactly the same as in the case of logistic regression, despite the widely different coefficients: pick the minimum and maximum of the abscissa (test1), add a slight margin to each, compute the test2 values of the corresponding points on the decision boundary (where 0 = theta_0 + theta_1 * test1 + theta_2 * test2), and draw a line between the two points:

palette(c("tan3","purple4")) 
plot(test2 ~ test1, col = as.factor(y), pch = 20, 
    data = as.data.frame(data),   # data was turned into a matrix above; the formula interface needs a data.frame 
    main = "College admissions") 
(x = c(min(data[,2]) - .2, max(data[,2]) + .2))   # test1 range, plus a small margin 
(y = c((-1/theta[3]) * (theta[2] * x + theta[1])))  # test2 on the boundary 0 = theta[1] + theta[2]*test1 + theta[3]*test2 
lines(x, y, lwd = 3, col = rgb(.7, 0, .2, .5)) 

[Figure: scatter plot of the data with the decision boundary line from the perceptron coefficients]
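
Since the boundary is a straight line, an equivalent way to draw it is with abline(). The sketch below (line.params is a hypothetical helper, not part of the original answer) reduces a coefficient vector to an intercept and a slope, which also shows why the widely different raw coefficients still give similar lines: rearranging 0 = theta_0 + theta_1 * test1 + theta_2 * test2 gives test2 = -theta_0/theta_2 - (theta_1/theta_2) * test1, so only the ratios of the coefficients matter.

# Hypothetical helper: reduce (bias, b1, b2) to the line test2 = intercept + slope * test1. 
line.params = function(b) c(intercept = -b[[1]]/b[[3]], slope = -b[[2]]/b[[3]]) 
line.params(theta)                             # perceptron: intercept ~ -0.44, slope ~ -0.92 
line.params(c(1.718449, 4.012903, 3.743903))   # glm():      intercept ~ -0.46, slope ~ -1.07 
p = line.params(theta) 
abline(a = p["intercept"], b = p["slope"], lwd = 3, col = rgb(.7, 0, .2, .5)) # same line as lines() above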


The perceptron weights are computed so that theta^T X > 0 is classified as positive and theta^T X < 0 is classified as negative. This means the equation theta^T X = 0 is the perceptron's decision boundary.

The same logic applies to logistic regression, except that the rule is now sigmoid(theta^T X) > 0.5.
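
In other words, the two rules share the same boundary: sigmoid(z) = 1/(1 + exp(-z)) is strictly increasing and equals 0.5 exactly at z = 0, so thresholding sigmoid(theta^T X) at 0.5 is equivalent to thresholding theta^T X at 0. A minimal sketch of this check (the sigmoid helper is illustrative, not from the original answer):

sigmoid = function(z) 1/(1 + exp(-z))   # illustrative helper, not from the answer 
sigmoid(0)                              # 0.5: the threshold sits exactly at theta^T X = 0 
z = seq(-5, 5, by = 0.1) 
all((sigmoid(z) > 0.5) == (z > 0))      # TRUE: the two decision rules always agree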


The usual threshold for the sigmoid function is 0.5, which corresponds to the zero crossing of theta^T X. – Toni


You are right. I forgot that the sigmoid goes from 0 to 1. – Alex


I think my question was very specific, and I am leaving the answer up instead of deleting the whole post, in case someone else can benefit from it (in the meantime, I have already seen similar questions). – Toni