2014-01-21 38 views

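Selecting observations within a specific time range from a datetime64[ns] column

I have a pandas DataFrame (dfnew) in which one column (timestamp) is of type datetime64[ns]. I want to count how many observations fall within a specific time-of-day range, say from 10:00:00 to 12:00:00.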

dfnew['timestamp'] = dfnew['timestamp'].astype('datetime64[ns]')
dfnew['timestamp']
0 2013-12-19 09:03:21.223000 
1 2013-12-19 11:34:23.037000 
2 2013-12-19 11:34:23.050000 
3 2013-12-19 11:34:23.067000 
4 2013-12-19 11:34:23.067000 
5 2013-12-19 11:34:23.067000 
6 2013-12-19 11:34:23.067000 
7 2013-12-19 11:34:23.067000 
8 2013-12-19 11:34:23.067000 
9 2013-12-19 11:34:23.080000 
10 2013-12-19 11:34:23.080000 
11 2013-12-19 11:34:23.080000 
12 2013-12-19 11:34:23.080000 
13 2013-12-19 11:34:23.080000 
14 2013-12-19 11:34:23.080000 
15 2013-12-19 11:34:23.097000 
16 2013-12-19 11:34:23.097000 
17 2013-12-19 11:34:23.097000 
18 2013-12-19 11:34:23.097000 
19 2013-12-19 11:34:23.097000 
Name: timestamp 

    dfnew['Time']=dfnew['timestamp'].map(Timestamp.time) 
    t1 = datetime.time(10, 0, 0) 
    t2 = datetime.time(12, 0, 0) 
    print len(dfnew[t1<dfnew["Time"]<t2]) 

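This raises an error: TypeError: can't compare datetime.time to Series. I am new to pandas DataFrames, and I suspect I am making a very silly mistake here. Any help is appreciated.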

Answer

You can use the DatetimeIndex method indexer_between_time. The trick here is to pass the Series/column to the DatetimeIndex constructor so that the method becomes available:

from datetime import time 

# s is your datetime64 column 

In [11]: pd.DatetimeIndex(s).indexer_between_time(time(10), time(12)) 
Out[11]: 
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]) 

This returns the integer positions of the times between 10:00 and 12:00 (inclusive*), so use iloc to filter:

In [12]: s.iloc[pd.DatetimeIndex(s).indexer_between_time(time(10), time(12))] 
Out[12]: 
1 2013-12-19 11:34:23.037000 
2 2013-12-19 11:34:23.050000 
3 2013-12-19 11:34:23.067000 
4 2013-12-19 11:34:23.067000 
5 2013-12-19 11:34:23.067000 
6 2013-12-19 11:34:23.067000 
7 2013-12-19 11:34:23.067000 
8 2013-12-19 11:34:23.067000 
9 2013-12-19 11:34:23.080000 
10 2013-12-19 11:34:23.080000 
11 2013-12-19 11:34:23.080000 
12 2013-12-19 11:34:23.080000 
13 2013-12-19 11:34:23.080000 
14 2013-12-19 11:34:23.080000 
15 2013-12-19 11:34:23.097000 
16 2013-12-19 11:34:23.097000 
17 2013-12-19 11:34:23.097000 
18 2013-12-19 11:34:23.097000 
19 2013-12-19 11:34:23.097000 
Name: timestamp, dtype: datetime64[ns] 
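
To answer the original "how many observations" question end to end, the two steps above can be combined into a small script. This is only a sketch: the four sample timestamps and the names positions, selected and count are made up for illustration, and s stands in for dfnew['timestamp'].

```python
from datetime import time

import pandas as pd

# Hypothetical sample data standing in for dfnew['timestamp'].
s = pd.Series(
    pd.to_datetime(
        [
            "2013-12-19 09:03:21.223",
            "2013-12-19 11:34:23.037",
            "2013-12-19 11:34:23.050",
            "2013-12-19 13:15:00.000",
        ]
    ),
    name="timestamp",
)

# Integer positions of the rows whose time of day lies between 10:00 and 12:00.
positions = pd.DatetimeIndex(s).indexer_between_time(time(10), time(12))

# Positional selection with iloc, then count the matching observations.
selected = s.iloc[positions]
count = len(selected)
print(count)
```

Because indexer_between_time returns positions rather than index labels, iloc (not loc) is the matching selector; the same positions could also be passed to dfnew.iloc to keep the other columns.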

* include_start and include_end are optional boolean arguments of indexer_between_time.
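
As a rough illustration of the footnote, the sketch below compares the default (both ends included) with include_end=False. The three timestamps are hypothetical and chosen so that two of them sit exactly on the boundaries; the names closed and half_open are illustrative only. It assumes a pandas version in which indexer_between_time still accepts these keyword arguments.

```python
from datetime import time

import pandas as pd

# Hypothetical timestamps; the first and last fall exactly on the boundaries.
s = pd.Series(
    pd.to_datetime(
        [
            "2013-12-19 10:00:00",
            "2013-12-19 11:00:00",
            "2013-12-19 12:00:00",
        ]
    )
)
idx = pd.DatetimeIndex(s)

# Default behaviour: both 10:00:00 and 12:00:00 are included.
closed = idx.indexer_between_time(time(10), time(12))

# Exclude the end boundary, so 12:00:00 is dropped.
half_open = idx.indexer_between_time(time(10), time(12), include_end=False)

print(list(closed), list(half_open))
```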

Thanks, that works well :) – sau