2016-05-25 58 views

Pandas cannot create a DataFrame from a NumPy array of Timestamps

I have a NumPy array of pandas Timestamps:

array([[Timestamp('2016-05-02 15:50:00+0000', tz='UTC', offset='5T'), 
     Timestamp('2016-05-02 15:50:00+0000', tz='UTC', offset='5T'), 
     Timestamp('2016-05-02 15:50:00+0000', tz='UTC', offset='5T')], 
     [Timestamp('2016-05-02 17:10:00+0000', tz='UTC', offset='5T'), 
     Timestamp('2016-05-02 17:10:00+0000', tz='UTC', offset='5T'), 
     Timestamp('2016-05-02 17:10:00+0000', tz='UTC', offset='5T')], 
     [Timestamp('2016-05-02 20:25:00+0000', tz='UTC', offset='5T'), 
     Timestamp('2016-05-02 20:25:00+0000', tz='UTC', offset='5T'), 
     Timestamp('2016-05-02 20:25:00+0000', tz='UTC', offset='5T')]], dtype=object) 

I cannot create a DataFrame from this array; attempting to do so raises the following error:

AssertionError: Number of Block dimensions (1) must equal number of axes (2) 

As you can see, the array is clearly two-dimensional, and I verified this with ndim.

Why can't I create the DataFrame?
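The dimensionality check mentioned in the question can be sketched roughly as follows. This is a minimal reconstruction, not the asker's actual code: the `offset='5T'` argument is left out because newer pandas releases may not accept it, and the helper name `stamps` is purely illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the array from the question:
# three rows, each repeating one tz-aware Timestamp three times.
stamps = ['2016-05-02 15:50:00', '2016-05-02 17:10:00', '2016-05-02 20:25:00']

a = np.empty((3, 3), dtype=object)
for i, s in enumerate(stamps):
    for j in range(3):
        a[i, j] = pd.Timestamp(s, tz='UTC')

# Confirm the array really is two-dimensional and holds Python objects.
print(a.ndim)   # 2
print(a.shape)  # (3, 3)
print(a.dtype)  # object
```

Whether `pd.DataFrame(a)` then fails with the AssertionError above likely depends on the pandas version in use, so the sketch stops at the check itself.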

Answers


I think you can use a list comprehension:

import pandas as pd 
import numpy as np 

a =np.array([[pd.Timestamp('2016-05-02 15:50:00+0000', tz='UTC', offset='5T'), 
     pd.Timestamp('2016-05-02 15:50:00+0000', tz='UTC', offset='5T'), 
     pd.Timestamp('2016-05-02 15:50:00+0000', tz='UTC', offset='5T')], 
     [pd.Timestamp('2016-05-02 17:10:00+0000', tz='UTC', offset='5T'), 
     pd.Timestamp('2016-05-02 17:10:00+0000', tz='UTC', offset='5T'), 
     pd.Timestamp('2016-05-02 17:10:00+0000', tz='UTC', offset='5T')], 
     [pd.Timestamp('2016-05-02 20:25:00+0000', tz='UTC', offset='5T'), 
     pd.Timestamp('2016-05-02 20:25:00+0000', tz='UTC', offset='5T'), 
     pd.Timestamp('2016-05-02 20:25:00+0000', tz='UTC', offset='5T')]], dtype=object) 

df = pd.DataFrame([x for x in a], columns=['a', 'b', 'c'])
print(df)
                          a                         b  \
0 2016-05-02 15:50:00+00:00 2016-05-02 15:50:00+00:00
1 2016-05-02 17:10:00+00:00 2016-05-02 17:10:00+00:00
2 2016-05-02 20:25:00+00:00 2016-05-02 20:25:00+00:00

                          c
0 2016-05-02 15:50:00+00:00
1 2016-05-02 17:10:00+00:00
2 2016-05-02 20:25:00+00:00

Another solution is DataFrame.from_records:

print(pd.DataFrame.from_records(a, columns=['a', 'b', 'c']))
                          a                         b  \
0 2016-05-02 15:50:00+00:00 2016-05-02 15:50:00+00:00
1 2016-05-02 17:10:00+00:00 2016-05-02 17:10:00+00:00
2 2016-05-02 20:25:00+00:00 2016-05-02 20:25:00+00:00

                          c
0 2016-05-02 15:50:00+00:00
1 2016-05-02 17:10:00+00:00
2 2016-05-02 20:25:00+00:00

See also: alternate constructors of DataFrame.


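This certainly answers my question, so thank you. I asked because the original problem I ran into was trying to transpose a DataFrame of Timestamps. Even when the DataFrame is constructed with 'from_records', transposing it throws the same 'AssertionError' as before. For now, I transpose the numpy array before building the DataFrame.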
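The workaround described in this comment, transposing the NumPy array first and only then building the DataFrame, might look like the sketch below. It is an illustration under assumptions rather than the commenter's code: the names `a_t` and `df_t` are made up, and `offset='5T'` is omitted because newer pandas releases may not accept it.

```python
import numpy as np
import pandas as pd

# Rebuild a 3x3 object array of tz-aware Timestamps (one value per row).
stamps = ['2016-05-02 15:50:00', '2016-05-02 17:10:00', '2016-05-02 20:25:00']
a = np.empty((3, 3), dtype=object)
for i, s in enumerate(stamps):
    for j in range(3):
        a[i, j] = pd.Timestamp(s, tz='UTC')

# Transpose at the NumPy level, so DataFrame.T is never called.
a_t = a.T

# Build the frame row by row, as in the list-comprehension answer.
df_t = pd.DataFrame([row for row in a_t], columns=['a', 'b', 'c'])
print(df_t)
```

After the transpose, each column of `df_t` holds a single repeated timestamp, whereas in the original orientation each row did.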