pandas index.asof with a MultiIndex

I have time series data for each entity:

id event_date value 
1 2013-12-21 3.82 
1 2013-12-22 2.47 
1 2013-12-25 2.13 
1 2014-01-03 3.92 
1 2014-01-04 2.48 
2 2014-10-16 3.96 
2 2014-10-17 3.61 
2 2014-10-29 2.59 
2 2014-11-05 3.64 
2 2014-11-15 2.85

I have put it into a DataFrame with a MultiIndex:

               value
id event_date
1  2013-12-21   3.82
   2013-12-22   2.47
   2013-12-25   2.13
   2014-01-03   3.92
   2014-01-04   2.48
2  2014-10-16   3.96
   2014-10-17   3.61
   2014-10-29   2.59
   2014-11-05   3.64
   2014-11-15   2.85

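I want to find the most recent date in each id's series before an arbitrary cutoff (say, before 2014-10-31 or 2014-09-30). Index.asof or Series.asof seems to be what I want, but I can't figure out how to use it with a MultiIndex. For a date of '2014-10-30' I would like output like this: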

id event_date 
1 2014-01-04 00:00:00 
2 2014-10-29 00:00:00

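I can loop over the first index level, but it seems like there should be a better, more pandonic way (the full data set is very large) and I'm just missing it.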

In [10]: for idx in df.index.levels[0]: 
    ....:  print idx, df.loc[idx].index.asof('2014-10-30') 
    ....: 
1 2014-01-04 00:00:00 
2 2014-10-29 00:00:00

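There is no reason the data has to be in this MultiIndex structure; it just seemed to make sense given that I have a time series for each id. The data is sorted by time and has no duplicates.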

Versions: pandas 0.15.0, numpy 1.9.0

2014-11-21 Patrick Russell

It looks to me like @gjreda's answer is just missing your cutoff filter, so, assuming event_date and id are not in the index:

cutoff = '2014-10-30' 
df[df['event_date'] <= cutoff].groupby(['id'])['event_date'].last()

This gives the same output as before, but with an arbitrary cutoff:

id 
1 2014-01-04 
2 2014-10-29 
Name: event_date, dtype: datetime64[ns]
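For anyone who wants to try the filter-then-group approach end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question with id and event_date as ordinary columns; the helper name latest_before is hypothetical, and it uses max() instead of last() so the result does not depend on the rows already being sorted.

```python
import pandas as pd

# Sample data from the question, with id and event_date as ordinary columns.
df = pd.DataFrame({
    "id": [1] * 5 + [2] * 5,
    "event_date": pd.to_datetime([
        "2013-12-21", "2013-12-22", "2013-12-25", "2014-01-03", "2014-01-04",
        "2014-10-16", "2014-10-17", "2014-10-29", "2014-11-05", "2014-11-15",
    ]),
    "value": [3.82, 2.47, 2.13, 3.92, 2.48, 3.96, 3.61, 2.59, 3.64, 2.85],
})


def latest_before(frame, cutoff):
    """Return the latest event_date per id that is on or before cutoff.

    Hypothetical helper name, used only for this illustration.
    """
    cutoff = pd.Timestamp(cutoff)
    kept = frame[frame["event_date"] <= cutoff]
    # max() does not rely on row order, unlike last().
    return kept.groupby("id")["event_date"].max()


result = latest_before(df, "2014-10-30")
print(result)
```

Note that an id with no rows on or before the cutoff simply drops out of the result, so with a cutoff of '2014-09-30' only id 1 would be returned.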

If you still want to keep those columns in the index, you can do this:

df[df.index.get_level_values(1) <= cutoff].groupby(level=['id']).apply(lambda x: x.index.get_level_values(1).max())

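By the way, inside a groupby, index.levels[1] of each group still seems to hold the levels of the whole index rather than only that group's dates, so an asof version does not work as expected:

df[df.index.get_level_values(1) <= cutoff].groupby(level=[0]).apply(lambda x: x.index.levels[1].asof(cutoff))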

id 
1 2014-10-29 
2 2014-10-29 
dtype: datetime64[ns]

It appears to return the last matching date across all groups, rather than one per group.
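The behaviour above can be reproduced on the sample data. The sketch below is an illustration, not a definitive fix: it shows that slicing a MultiIndex leaves its levels unpruned, and that calling asof on get_level_values (which returns only the dates actually present in the group) gives a per-group result. The helper name asof_per_group is hypothetical, and it assumes the dates within each id are sorted, as stated in the question.

```python
import pandas as pd

dates = pd.to_datetime([
    "2013-12-21", "2013-12-22", "2013-12-25", "2014-01-03", "2014-01-04",
    "2014-10-16", "2014-10-17", "2014-10-29", "2014-11-05", "2014-11-15",
])
index = pd.MultiIndex.from_arrays(
    [[1] * 5 + [2] * 5, dates], names=["id", "event_date"]
)
df = pd.DataFrame(
    {"value": [3.82, 2.47, 2.13, 3.92, 2.48, 3.96, 3.61, 2.59, 3.64, 2.85]},
    index=index,
)

cutoff = pd.Timestamp("2014-10-30")

# Selecting a single id keeps the levels of the full index.
group_1 = df.loc[[1]]
print(len(group_1.index.levels[1]))            # dates stored in the level
print(len(group_1.index.get_level_values(1)))  # dates present in the group


def asof_per_group(frame, cutoff):
    """Apply Index.asof to each id's own dates (hypothetical helper)."""
    return frame.groupby(level="id").apply(
        # get_level_values yields only this group's dates, so asof is
        # evaluated against the group rather than the whole index.
        lambda g: g.index.get_level_values("event_date").asof(cutoff)
    )


result = asof_per_group(df, cutoff)
print(result)
```

Unlike the boolean-filter approach, this keeps every id in the result; an id with no date on or before the cutoff should come back as a missing value instead of being dropped.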

2014-11-21 20:54:33 Primer

If there is no reason for it to be a MultiIndex, you could do something like this:

In [10]: df.reset_index(inplace=True) 
In [11]: df.groupby('id')['event_date'].max() 
Out[11]: 
id 
1  2014-01-04 
2  2014-11-15 
Name: event_date, dtype: object

2014-11-21 19:01:07

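Thanks. My question wasn't entirely clear. I'm trying to find the most recent date before an arbitrary cutoff, for example the most recent date as of the end of October. I've edited the question to clarify. – 2014-11-21 19:43:24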

Oops, I missed that, but it looks like @Primer's answer helps you. – 2014-11-22 19:56:50
