Principal component analysis on a collection of DataFrames in Python using numpy or sklearn

I have a "collection of DataFrames", df, with the data shown below. I am trying to run principal component analysis (PCA) on this collection using sklearn, but I keep getting a TypeError.

from sklearn.decomposition import PCA 
df # dataframe collection 
pca = PCA(n_components=5) 
pca.fit(X)

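How do I convert the collection of DataFrames into an array or matrix, keeping one series per entry? I think that once it is converted to an array, I will be able to run PCA.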

Data:

{'USSP2 CMPN Curncy': 
0  0.297453 
1  0.320505 
2  0.345978 
3  0.427871 
Name: (USSP2 CMPN Curncy, PX_LAST), Length: 1747, dtype: float64, 
'MARGDEBT Index': 
0  0.095478 
1  0.167469 
2  0.186317 
3  0.203729 
Name: (MARGDEBT Index, PX_LAST), Length: 79, dtype: float64, 
'SL% SMT% Index': 
0  0.163636 
1  0.000000 
2  0.000000 
3  0.363636 
Name: (SL% SMT% Index, PX_LAST), dtype: float64, 
'FFSRAIWS Index': 
0  0.157234 
1  0.278174 
2  0.530603 
3  0.526519 
Name: (FFSRAIWS Index, PX_LAST), dtype: float64, 
'USPHNSA Index': 
0  0.107330 
1  0.213351 
2  0.544503 
3  0.460733 
Name: (USPHNSA Index, PX_LAST), Length: 79, dtype: float64]

Can anyone help with running PCA on a collection of DataFrames? Thanks!

Source

2017-09-13 Arvinth Kumar

Your collection of DataFrames is a dictionary (dict) of DataFrame objects.

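To perform the analysis you need a single set of data to work on, so the first step is to convert the data into a single DataFrame. pandas natively supports concatenating from a dict of DataFrames, for example: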

import pandas as pd 

df = { 
    'Currency1': pd.DataFrame([[0.297453,0.5]]), 
    'Currency2': pd.DataFrame([[0.297453,0.5]]) 
}  

X = pd.concat(df)
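The printout in the question looks like a dict of Series (each entry has a Name and a dtype) with different lengths, so the direction of the concatenation matters. The following is only a sketch with a small hypothetical dict (the keys "A" and "B" and the truncated values are placeholders, not the real data). It compares the default `pd.concat(data)` with `pd.concat(data, axis=1)`, assuming each entry is meant to be one feature column:

```python
import pandas as pd

# Hypothetical stand-in for the dict in the question: Series of unequal length.
data = {
    "A": pd.Series([0.297453, 0.320505, 0.345978, 0.427871]),
    "B": pd.Series([0.095478, 0.167469, 0.186317]),
}

# axis=0 (default): entries are stacked end to end; the dict keys become
# the outer level of a MultiIndex.
stacked = pd.concat(data)

# axis=1: one column per dict key, aligned on the row index; shorter
# series are padded with NaN.
wide = pd.concat(data, axis=1)

print(stacked.shape)  # (7,)
print(wide.shape)     # (4, 2)
print(wide)
```

If each key is a separate variable, the `axis=1` layout is probably the one PCA should see, since it puts observations in rows and variables in columns.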

Now you can run PCA on the values from the DataFrame, for example:

from sklearn.decomposition import PCA

pca = PCA(n_components=5)  # must not exceed min(n_samples, n_features)
pca.fit(X.values)
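As an end-to-end illustration, here is a minimal sketch on synthetic data (random numbers, not the question's values; the keys "A", "B" and "C" are placeholders). It assumes each dict entry is one feature, and that rows with a missing value can simply be dropped. This matters because sklearn's PCA rejects input containing NaN, and series of unequal length produce NaN when aligned side by side.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic placeholder data: three series, one of them shorter.
data = {
    "A": pd.Series(rng.random(50)),
    "B": pd.Series(rng.random(50)),
    "C": pd.Series(rng.random(40)),
}

# One column per key; drop rows where any series has no value.
X = pd.concat(data, axis=1).dropna()

# n_components must be <= min(n_samples, n_features), here min(40, 3).
pca = PCA(n_components=2)
scores = pca.fit_transform(X.values)

print(X.shape)       # (40, 3)
print(scores.shape)  # (40, 2)
print(pca.explained_variance_ratio_)
```

Dropping rows is only one option. Depending on the data, resampling the series to a common index or filling gaps may be more appropriate.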

Source

2017-09-13 21:34:50 mfitzp
