2015-10-09 69 views

Subset a pandas DataFrame to everything up to a date

I have a DataFrame with the following data:

  ACCOCI_ARQ ASSETSAVG_ART ASSETSC_ARQ ASSETSNC_ARQ ASSETS_ARQ 
Date                   
2004-02-10 -31000000  6647000000 6029000000  942000000 6971000000 
2004-03-27   NaN   NaN   NaN   NaN   NaN 
2004-05-06 -10000000  6740500000 5784000000  951000000 6735000000 
2004-06-26   NaN   NaN   NaN   NaN   NaN 
2004-08-05 -18000000  6936000000 6286000000  937000000 7223000000 

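I have a date as a pandas Timestamp object. This date may or may not be present in the DataFrame. How do I subset the DataFrame to get everything up to that date? (And how would I do the same, but excluding that date?)

I tried various approaches such as .ix, .iloc, etc., but couldn't get any of them to work.

Answer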

You can use boolean indexing with .loc to take everything up to a specific timestamp (excluding the timestamp itself):

df.loc[df.index < ts] #ts is the timestamp 

If you want to include the timestamp, you can do:

df.loc[df.index <= ts] 

Demo:

In [14]: df 
Out[14]: 
     Date ACCOCI_ARQ ASSETSAVG_ART ASSETSC_ARQ ASSETSNC_ARQ ASSETS_ARQ 
1 2004-02-10 -31000000  6647000000 6029000000  942000000 6971000000 
2 2004-03-27   NaN   NaN   NaN   NaN   NaN 
3 2004-05-06 -10000000  6740500000 5784000000  951000000 6735000000 
4 2004-06-26   NaN   NaN   NaN   NaN   NaN 
5 2004-08-05 -18000000  6936000000 6286000000  937000000 7223000000 

In [15]: ts = pd.to_datetime('2004-05-06') 

In [19]: df = df.set_index('Date') 

In [20]: df 
Out[20]: 
      ACCOCI_ARQ ASSETSAVG_ART ASSETSC_ARQ ASSETSNC_ARQ ASSETS_ARQ 
Date 
2004-02-10 -31000000  6647000000 6029000000  942000000 6971000000 
2004-03-27   NaN   NaN   NaN   NaN   NaN 
2004-05-06 -10000000  6740500000 5784000000  951000000 6735000000 
2004-06-26   NaN   NaN   NaN   NaN   NaN 
2004-08-05 -18000000  6936000000 6286000000  937000000 7223000000 

In [21]: df.loc[df.index < ts] 
Out[21]: 
      ACCOCI_ARQ ASSETSAVG_ART ASSETSC_ARQ ASSETSNC_ARQ ASSETS_ARQ 
Date 
2004-02-10 -31000000  6647000000 6029000000  942000000 6971000000 
2004-03-27   NaN   NaN   NaN   NaN   NaN 

In [22]: df.loc[df.index <= ts] 
Out[22]: 
      ACCOCI_ARQ ASSETSAVG_ART ASSETSC_ARQ ASSETSNC_ARQ ASSETS_ARQ 
Date 
2004-02-10 -31000000  6647000000 6029000000  942000000 6971000000 
2004-03-27   NaN   NaN   NaN   NaN   NaN 
2004-05-06 -10000000  6740500000 5784000000  951000000 6735000000 
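
The question also asks about a date that may not be in the DataFrame at all. The following is a minimal, self-contained sketch of that case; the frame reuses only two of the question's columns, and the names ts_missing, ts_present, and sliced are hypothetical. It also shows label slicing with .loc[:ts] as an alternative, assuming the DatetimeIndex is sorted (that form includes the end label).

```python
import numpy as np
import pandas as pd

# Hypothetical frame reusing two columns from the question.
df = pd.DataFrame(
    {
        "ACCOCI_ARQ": [-31000000, np.nan, -10000000, np.nan, -18000000],
        "ASSETS_ARQ": [6971000000, np.nan, 6735000000, np.nan, 7223000000],
    },
    index=pd.to_datetime(
        ["2004-02-10", "2004-03-27", "2004-05-06", "2004-06-26", "2004-08-05"]
    ),
)
df.index.name = "Date"

ts_missing = pd.Timestamp("2004-04-01")  # not a label in the index
ts_present = pd.Timestamp("2004-05-06")  # a label in the index

# The boolean mask compares every index value with the timestamp,
# so it does not matter whether the timestamp itself is present.
before_missing = df.loc[df.index < ts_missing]
before_present = df.loc[df.index < ts_present]   # excludes 2004-05-06
upto_present = df.loc[df.index <= ts_present]    # includes 2004-05-06

# Alternative: label slicing on a sorted DatetimeIndex.
# The end label is inclusive and need not exist in the index.
sliced = df.sort_index().loc[:ts_present]
```

The boolean mask works regardless of index order, whereas the slice form relies on the index being sorted, which is why the sketch calls sort_index() first.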

When I try `df['Date']`, I get `KeyError: 'Date'`. Note that it's a time series and 'Date' is not a column. – user1234440

'Date' sits off to the side, and the whole df is based on it – user1234440

The index, thanks a lot, it works now! – user1234440
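
The KeyError mentioned in the comments is consistent with 'Date' being the index rather than a column. A small sketch of the difference, using a hypothetical three-row frame (raw, indexed, and the other names are illustrative only):

```python
import pandas as pd

# Hypothetical frame where Date is still an ordinary column.
raw = pd.DataFrame(
    {
        "Date": ["2004-02-10", "2004-03-27", "2004-05-06"],
        "ASSETS_ARQ": [6971000000.0, None, 6735000000.0],
    }
)
raw["Date"] = pd.to_datetime(raw["Date"])
ts = pd.Timestamp("2004-05-06")

# While Date is a column, filter on the column itself.
by_column = raw.loc[raw["Date"] < ts]

# After set_index, Date is no longer a column, so indexed['Date']
# would raise KeyError; the dates now live in indexed.index.
indexed = raw.set_index("Date")
has_date_column = "Date" in indexed.columns
index_name = indexed.index.name
by_index = indexed.loc[indexed.index < ts]
```

Checking df.index.name and df.columns first is a quick way to tell which of the two forms applies.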