Number of components in PCA is limited by the number of samples

I am using sklearn to do PCA and testing it on some dummy data. It works fine when I have more samples than the number of components I want to keep:

from sklearn.decomposition import PCA 
import numpy as np  

features_training = np.random.rand(10,30) 
components = 8 
pca = PCA(n_components=int(components)) 
X_pca = pca.fit_transform(features_training)

From the code above I get a 10 x 8 matrix.

X_pca.shape 
(10, 8)

But for data of the same shape, if I try to keep 15 components:

features_training = np.random.rand(10,30) 
components = 15 
pca = PCA(n_components=int(components)) 
X_pca = pca.fit_transform(features_training)

I do not get a 10 x 15 matrix but a 10 x 10 one.

X_pca.shape 
(10, 10)

So it seems that the number of components is limited not only by the number of features but also by the number of samples. Why is that?

Source

2017-02-04 Luis Ramon Ramirez Rodriguez

I can't tell you how PCA works internally. However, the scikit-learn documentation for PCA mentions that the actual n_components = min(n_samples, specified n_components).
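As a rough illustration of why the cap exists (a sketch, not taken from the documentation quoted above): PCA centers the data before decomposing it, and a centered matrix built from n_samples rows has rank at most n_samples - 1, so there are no more directions with non-zero variance to extract. The seed and variable names below are assumptions chosen for the example. Recent scikit-learn releases appear to raise a ValueError instead of silently truncating when n_components exceeds min(n_samples, n_features), so that call is wrapped in try/except.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)      # fixed seed, chosen only for reproducibility
X = rng.random((10, 30))            # 10 samples, 30 features, as in the question

# PCA subtracts the per-feature mean first. Centering removes one degree of
# freedom, so the centered matrix has rank at most n_samples - 1.
X_centered = X - X.mean(axis=0)
rank = np.linalg.matrix_rank(X_centered)

# With n_components left unset, PCA keeps min(n_samples, n_features) components.
pca = PCA().fit(X)
n_kept = pca.n_components_

# The last of those components carries essentially no variance.
last_ratio = pca.explained_variance_ratio_[-1]

# Asking for more components than min(n_samples, n_features): older releases
# truncated silently (as in the question), newer ones may raise ValueError.
try:
    shape = PCA(n_components=15).fit_transform(X).shape
except ValueError:
    shape = None

print(rank, n_kept, last_ratio, shape)
```

In other words, with 10 samples only 9 components can explain any variance at all, whatever the number of features.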

Source

2017-02-04 06:17:49
