2011-05-19 40 views
1

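Are there any worked examples that walk through principal component analysis on an actual dataset? The articles I am reading discuss only the theory. I am looking for something that shows how to apply PCA, how to interpret the results, and how to transform the original dataset into the new one. Any suggestions? A worked example of principal component analysis?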

Answers

3

If you know Python, here is a short hands-on example:

import numpy as np
import scipy.linalg
import matplotlib.pyplot as plt

# Generate correlated data from uncorrelated data.
# Each column of X is a 3-dimensional feature vector.
Z = np.random.randn(3, 1000)
C = np.random.randn(3, 3)
X = np.dot(C, Z)
# PCA assumes zero-mean features, so subtract each feature's (row's) mean.
X = X - X.mean(axis=1, keepdims=True)

# Visualize the correlation among the features.
plt.scatter(X[0,:], X[1,:])
plt.scatter(X[0,:], X[2,:])
plt.scatter(X[1,:], X[2,:])
plt.show()

# Perform PCA. It can be shown that the principal components of the
# centered matrix X are equivalent to the left singular vectors of X, which
# are equivalent to the eigenvectors of X X^T (up to indeterminacy in sign).
# svd sorts components by decreasing singular value; eigh sorts by
# increasing eigenvalue, so the columns of Q appear in reverse order.
U, S, Vh = scipy.linalg.svd(X, full_matrices=False)
W, Q = scipy.linalg.eigh(np.dot(X, X.T))
print(U)
print(Q)

# Project the original features onto the eigenspace.
Y = np.dot(U.T, X)

# Visualize the absence of correlation among the projected features.
plt.scatter(Y[0,:], Y[1,:])
plt.scatter(Y[1,:], Y[2,:])
plt.scatter(Y[0,:], Y[2,:])
plt.show()
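
To interpret the result and build the reduced dataset, one possible follow-up is sketched below. It is not part of the example above: the seed, the choice k = 2, and names such as `explained_ratio`, `Y_k` and `X_approx` are assumptions made for illustration. The squared singular values give the variance along each principal component, and keeping only the first k columns of U projects the data into k dimensions.

```python
import numpy as np

# Same kind of synthetic data as above; the seed is an arbitrary choice.
rng = np.random.default_rng(0)
Z = rng.standard_normal((3, 1000))
C = rng.standard_normal((3, 3))
X = C @ Z
Xc = X - X.mean(axis=1, keepdims=True)  # center each feature (row)

# Columns of U are the principal directions, sorted by decreasing variance.
U, S, Vh = np.linalg.svd(Xc, full_matrices=False)

# Variance along each component, and its share of the total variance.
n = Xc.shape[1]
variances = S**2 / (n - 1)
explained_ratio = variances / variances.sum()
print("explained variance ratio:", explained_ratio)

# Keep the first k components: this is the new, lower-dimensional dataset.
k = 2  # illustrative choice
Y_k = U[:, :k].T @ Xc          # shape (k, 1000)

# Map back to the original space to see how much information was lost.
X_approx = U[:, :k] @ Y_k
rel_err = np.linalg.norm(Xc - X_approx)**2 / np.linalg.norm(Xc)**2
print("relative reconstruction error:", rel_err)
```

The relative reconstruction error equals the share of variance in the discarded components, so a common way to choose k is to keep enough components to reach a target cumulative share of the variance.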
0

Since you asked for a hands-on example, you could try an interactive demo.