2017-02-28

How do I change the format of the data stored in a DataFrame?

I have the following data:

DF1

0  (AG, AD, AE) 
1  (AG, AM, AF) 
dtype: object 

DF2

0 [99.0, 45.0, 99.0, 92.0, 140.0, 53.0, 185.0, 8... 
1 [78.0, 52.0, 74.0, 29.0, 30.0, 57.0, 48.0, 39.... 

DF3

0 [19.0, 22.0, 13.0, 24.0, 70.0, 50.0, 185.0, 8... 
1 [18.0, 33.0, 74.0, 29.0, 30.0, 77.0, 48.0, 39.... 

I want to save these Series as a DataFrame. If I do df = pd.DataFrame({"TYPE-1":df1,"TYPE-2":df2,"TYPE-3":df3}), then I get this:

TYPE-1  TYPE-2       TYPE-3 
(AG, AD, AE) [99.0, 45.0, 99.0, 92.0,...] [78.0, 52.0, 74.0, 29.0, ...] 
(AG, AM, AF) [78.0, 52.0, 74.0, 29.0,...] [18.0, 33.0, 74.0, 29.0,...] 

How can I change the format to this one?

TYPE-1  TYPE-2   TYPE-3 
(AG, AD, AE) 99.0   78.0 
(AG, AD, AE) 45.0   52.0 
... 

Answer


You need numpy.repeat to build the repeated first column, and chain.from_iterable to flatten the list columns:

import numpy as np
import pandas as pd
from itertools import chain

# sample data borrowed from another answer
df1 = pd.DataFrame(dict(tups = [('A', 'B'), ('C', 'D')]))
df2 = pd.DataFrame(dict(lsts=[[1, 2, 3, 4], [5, 6, 7, 8]]))
df3 = pd.DataFrame(dict(lsts=[[9, 10, 11, 12], [14, 15, 6, 4]]))

# repeat each tuple once per element of its list, then flatten the lists;
# this assumes df2 and df3 hold lists of equal length in each row
df = pd.DataFrame({
     "a": np.repeat(df1.tups.values, df2.lsts.str.len()),
     "b": list(chain.from_iterable(df2.lsts)),
     "c": list(chain.from_iterable(df3.lsts))})

print(df)

     a b c 
0 (A, B) 1 9 
1 (A, B) 2 10 
2 (A, B) 3 11 
3 (A, B) 4 12 
4 (C, D) 5 14 
5 (C, D) 6 15 
6 (C, D) 7 6 
7 (C, D) 8 4 
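
The same idea can be wrapped in a small helper and applied to Series shaped like those in the question. This is only a sketch: the name flatten_aligned and its parameters are hypothetical, the sample values are a shortened version of the question's data (two numbers per row), and it assumes every list column has the same length in each row, which it checks before flattening.

```python
from itertools import chain

import numpy as np
import pandas as pd


def flatten_aligned(labels, *list_columns, names=None):
    """Repeat each label once per list element and flatten the list columns.

    Hypothetical helper for illustration; `labels` and each entry of
    `list_columns` are expected to be pandas Series of equal length.
    """
    # Row-wise list lengths, taken from the first list column
    lengths = [len(row) for row in list_columns[0]]

    # Every other list column must match, otherwise rows would misalign
    for col in list_columns[1:]:
        if [len(row) for row in col] != lengths:
            raise ValueError("list columns must have matching row lengths")

    if names is None:
        names = ["TYPE-%d" % (i + 1) for i in range(len(list_columns) + 1)]

    # np.repeat on the object array keeps each tuple intact as one cell
    data = {names[0]: np.repeat(labels.to_numpy(), lengths)}
    for name, col in zip(names[1:], list_columns):
        data[name] = list(chain.from_iterable(col))
    return pd.DataFrame(data)


# Shortened, illustrative stand-ins for the Series in the question
s1 = pd.Series([("AG", "AD", "AE"), ("AG", "AM", "AF")])
s2 = pd.Series([[99.0, 45.0], [78.0, 52.0]])
s3 = pd.Series([[19.0, 22.0], [18.0, 33.0]])

out = flatten_aligned(s1, s2, s3)
print(out)
```

The length check is the main design choice here: without it, a row whose lists differ in length would make the columns unequal and the DataFrame constructor would fail with a less helpful message.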
Ah, oops, I missed the detail about the first column; this is better. – miradulo

Yes, unfortunately :( – jezrael
