2015-10-26

Pandas - Add column based on multi-part logic check of datetimes

I have a DataFrame that looks like this:

Num   First_Date  Last_Date 
20008526 7/3/2013 0:00 7/18/2013 0:00 
20008526 7/3/2013 0:00 7/18/2013 0:00 
20008526 7/3/2013 0:00 7/18/2013 0:00 
20008526 7/3/2013 0:00 7/18/2013 0:00 
20008526 7/3/2013 0:00 7/18/2013 0:00 
20008526 7/3/2013 0:00 7/18/2013 0:00 
20008526 7/3/2013 0:00 7/18/2013 0:00 
20008526 7/3/2013 0:00 7/18/2013 0:00 
20008526 7/3/2013 0:00 7/18/2013 0:00 
20008526 7/3/2013 0:00 7/18/2013 0:00 
20008526 7/3/2013 0:00 7/18/2013 0:00 
20008526 7/3/2013 0:00 7/18/2013 0:00 
20008526 7/3/2013 0:00 7/18/2013 0:00 
20008534 3/25/2014 0:00 5/5/2014 0:00 
20008534 3/25/2014 0:00 5/5/2014 0:00 
20008534 3/25/2014 0:00 5/5/2014 0:00 
20008534 3/25/2014 0:00 5/5/2014 0:00 
20008534 3/25/2014 0:00 5/5/2014 0:00 
20008534 3/25/2014 0:00 5/5/2014 0:00 
20008534 3/25/2014 0:00 5/5/2014 0:00 
20008534 3/25/2014 0:00 5/5/2014 0:00 
20008534 3/25/2014 0:00 5/5/2014 0:00 
20008534 3/25/2014 0:00 5/5/2014 0:00 
20008534 3/25/2014 0:00 5/5/2014 0:00 
20008534 3/25/2014 0:00 5/5/2014 0:00 
20008534 3/25/2014 0:00 5/5/2014 0:00 
20008636 7/15/2015 0:00 8/18/2015 0:00 
20008636 7/15/2015 0:00 8/18/2015 0:00 
20008636 7/15/2015 0:00 8/18/2015 0:00 

Basically, I want to check whether both dates fall within a period that I specify.

import datetime
period_beg = datetime.datetime(2015, 7, 1, 0, 0)
period_end = datetime.datetime(2015, 9, 30, 0, 0)

This is where I was heading, but it looks crazy and convoluted... oh, and it doesn't work! LOL.

df['TimeCheck'] = df[(df['First_Date'] >= period_beg) and (df['Last_Date'] <= period_end)] 

Here is what I expect to get:

Num   First_Date  Last_Date  TimeCheck 
20008526 7/3/2013 0:00 7/18/2013 0:00 TRUE 
20008526 7/3/2013 0:00 7/18/2013 0:00 TRUE 
20008526 7/3/2013 0:00 7/18/2013 0:00 TRUE 
20008526 7/3/2013 0:00 7/18/2013 0:00 TRUE 
20008526 7/3/2013 0:00 7/18/2013 0:00 TRUE 
20008526 7/3/2013 0:00 7/18/2013 0:00 TRUE 
20008526 7/3/2013 0:00 7/18/2013 0:00 TRUE 
20008526 7/3/2013 0:00 7/18/2013 0:00 TRUE 
20008526 7/3/2013 0:00 7/18/2013 0:00 TRUE 
20008526 7/3/2013 0:00 7/18/2013 0:00 TRUE 
20008526 7/3/2013 0:00 7/18/2013 0:00 TRUE 
20008526 7/3/2013 0:00 7/18/2013 0:00 TRUE 
20008526 7/3/2013 0:00 7/18/2013 0:00 TRUE 
20008534 3/25/2014 0:00 5/5/2014 0:00 FALSE 
20008534 3/25/2014 0:00 5/5/2014 0:00 FALSE 
20008534 3/25/2014 0:00 5/5/2014 0:00 FALSE 
20008534 3/25/2014 0:00 5/5/2014 0:00 FALSE 
20008534 3/25/2014 0:00 5/5/2014 0:00 FALSE 
20008534 3/25/2014 0:00 5/5/2014 0:00 FALSE 
20008534 3/25/2014 0:00 5/5/2014 0:00 FALSE 
20008534 3/25/2014 0:00 5/5/2014 0:00 FALSE 
20008534 3/25/2014 0:00 5/5/2014 0:00 FALSE 
20008534 3/25/2014 0:00 5/5/2014 0:00 FALSE 
20008534 3/25/2014 0:00 5/5/2014 0:00 FALSE 
20008534 3/25/2014 0:00 5/5/2014 0:00 FALSE 
20008534 3/25/2014 0:00 5/5/2014 0:00 FALSE 
20008636 7/15/2015 0:00 8/18/2015 0:00 TRUE 
20008636 7/15/2015 0:00 8/18/2015 0:00 TRUE 
20008636 7/15/2015 0:00 8/18/2015 0:00 TRUE 

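Any ideas on how to do this? Also, should I do it this way (I don't know the term for it), or should I iterate over the rows and add the values one by one?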

Answer

You need to use the bitwise & operator instead of and:

In [7]:
import datetime as dt
period_beg = dt.datetime(2015, 7, 1, 0, 0)
period_end = dt.datetime(2015, 9, 30, 0, 0)
df['TimeCheck'] = (df['First_Date'] >= period_beg) & (df['Last_Date'] <= period_end)
df

Out[7]: 
     Num First_Date Last_Date TimeCheck 
0 20008526 2013-07-03 2013-07-18  False 
1 20008526 2013-07-03 2013-07-18  False 
2 20008526 2013-07-03 2013-07-18  False 
3 20008526 2013-07-03 2013-07-18  False 
4 20008526 2013-07-03 2013-07-18  False 
5 20008526 2013-07-03 2013-07-18  False 
6 20008526 2013-07-03 2013-07-18  False 
7 20008526 2013-07-03 2013-07-18  False 
8 20008526 2013-07-03 2013-07-18  False 
9 20008526 2013-07-03 2013-07-18  False 
10 20008526 2013-07-03 2013-07-18  False 
11 20008526 2013-07-03 2013-07-18  False 
12 20008526 2013-07-03 2013-07-18  False 
13 20008534 2014-03-25 2014-05-05  False 
14 20008534 2014-03-25 2014-05-05  False 
15 20008534 2014-03-25 2014-05-05  False 
16 20008534 2014-03-25 2014-05-05  False 
17 20008534 2014-03-25 2014-05-05  False 
18 20008534 2014-03-25 2014-05-05  False 
19 20008534 2014-03-25 2014-05-05  False 
20 20008534 2014-03-25 2014-05-05  False 
21 20008534 2014-03-25 2014-05-05  False 
22 20008534 2014-03-25 2014-05-05  False 
23 20008534 2014-03-25 2014-05-05  False 
24 20008534 2014-03-25 2014-05-05  False 
25 20008534 2014-03-25 2014-05-05  False 
26 20008636 2015-07-15 2015-08-18  True 
27 20008636 2015-07-15 2015-08-18  True 
28 20008636 2015-07-15 2015-08-18  True 

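This is because you are comparing arrays rather than scalar values: and needs a single truth value for each operand, and an array of booleans does not have one.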
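To make the difference concrete, here is a minimal sketch on a three-row frame. The data is made up for illustration (one row per Num from the question), and the format string passed to pd.to_datetime is an assumption based on how the dates are written in the question. It shows and raising, and & producing one True/False per row.

```python
import datetime as dt
import pandas as pd

# Toy data, invented for illustration; column names follow the question.
df = pd.DataFrame({
    "Num": [20008526, 20008534, 20008636],
    "First_Date": ["7/3/2013 0:00", "3/25/2014 0:00", "7/15/2015 0:00"],
    "Last_Date": ["7/18/2013 0:00", "5/5/2014 0:00", "8/18/2015 0:00"],
})

# Assumed format: month/day/year hour:minute, as in the question's sample.
for col in ("First_Date", "Last_Date"):
    df[col] = pd.to_datetime(df[col], format="%m/%d/%Y %H:%M")

period_beg = dt.datetime(2015, 7, 1)
period_end = dt.datetime(2015, 9, 30)

starts_in = df["First_Date"] >= period_beg   # boolean Series
ends_in = df["Last_Date"] <= period_end      # boolean Series

# `and` asks each Series for a single truth value, which raises ValueError.
try:
    starts_in and ends_in
    and_failed = False
except ValueError:
    and_failed = True

# `&` combines the two Series element by element.
df["TimeCheck"] = starts_in & ends_in

print(and_failed)
print(df["TimeCheck"].tolist())
```

The parentheses in the one-line form of the answer matter: & binds more tightly than >= and <=, so each comparison has to be wrapped before the two are combined.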

Also, what you were trying to do was to index the df with a boolean mask built from the condition:

df['TimeCheck'] = df[(df['First_Date'] >= period_beg) and (df['Last_Date'] <= period_end)] 

which raises a ValueError because of and:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

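Even if you change and to & in the line above, it will only assign values on the rows where the condition is True and leave NaN everywhere else: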

In [10]: 
df['TimeCheck'] = df[(df['First_Date'] >= period_beg) & (df['Last_Date'] <= period_end)] 
df 

Out[10]: 
     Num First_Date Last_Date TimeCheck 
0 20008526 2013-07-03 2013-07-18   NaN 
1 20008526 2013-07-03 2013-07-18   NaN 
2 20008526 2013-07-03 2013-07-18   NaN 
3 20008526 2013-07-03 2013-07-18   NaN 
4 20008526 2013-07-03 2013-07-18   NaN 
5 20008526 2013-07-03 2013-07-18   NaN 
6 20008526 2013-07-03 2013-07-18   NaN 
7 20008526 2013-07-03 2013-07-18   NaN 
8 20008526 2013-07-03 2013-07-18   NaN 
9 20008526 2013-07-03 2013-07-18   NaN 
10 20008526 2013-07-03 2013-07-18   NaN 
11 20008526 2013-07-03 2013-07-18   NaN 
12 20008526 2013-07-03 2013-07-18   NaN 
13 20008534 2014-03-25 2014-05-05   NaN 
14 20008534 2014-03-25 2014-05-05   NaN 
15 20008534 2014-03-25 2014-05-05   NaN 
16 20008534 2014-03-25 2014-05-05   NaN 
17 20008534 2014-03-25 2014-05-05   NaN 
18 20008534 2014-03-25 2014-05-05   NaN 
19 20008534 2014-03-25 2014-05-05   NaN 
20 20008534 2014-03-25 2014-05-05   NaN 
21 20008534 2014-03-25 2014-05-05   NaN 
22 20008534 2014-03-25 2014-05-05   NaN 
23 20008534 2014-03-25 2014-05-05   NaN 
24 20008534 2014-03-25 2014-05-05   NaN 
25 20008534 2014-03-25 2014-05-05   NaN 
26 20008636 2015-07-15 2015-08-18 2.00086e+07 
27 20008636 2015-07-15 2015-08-18 2.00086e+07 
28 20008636 2015-07-15 2015-08-18 2.00086e+07 

which is not what you want.
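The distinction here is between the mask and the frame indexed by the mask. A small sketch, again on made-up data with the question's column names: indexing with the mask filters rows and returns a smaller DataFrame, whereas the mask itself is already the True/False column being asked for.

```python
import datetime as dt
import pandas as pd

# Toy data, invented for illustration; column names follow the question.
df = pd.DataFrame({
    "Num": [20008526, 20008534, 20008636],
    "First_Date": pd.to_datetime(["2013-07-03", "2014-03-25", "2015-07-15"]),
    "Last_Date": pd.to_datetime(["2013-07-18", "2014-05-05", "2015-08-18"]),
})
period_beg = dt.datetime(2015, 7, 1)
period_end = dt.datetime(2015, 9, 30)

mask = (df["First_Date"] >= period_beg) & (df["Last_Date"] <= period_end)

# Indexing with the mask keeps only the matching rows (all columns).
matching_rows = df[mask]
print(type(matching_rows).__name__, matching_rows.shape)

# Assigning the mask itself gives one boolean per row.
df["TimeCheck"] = mask
print(df["TimeCheck"].tolist())
```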

Also, only the last 3 rows satisfy your condition. I'm not sure why you expect the rows with a Last_Date of 7/18/2013 0:00 to come out True as well.
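On the question of looping over the rows: whole-column comparisons like the one above are the usual pandas approach, and a loop is not needed. As an alternative spelling, not taken from the answer, Series.between does an inclusive range check on a whole column. Note that it is slightly stricter than the answer's condition, since it checks both bounds for each date; the two agree whenever First_Date is not later than Last_Date. The sketch below uses made-up data and includes a row-by-row version only to show that it gives the same result.

```python
import datetime as dt
import pandas as pd

# Toy data, invented for illustration; column names follow the question.
df = pd.DataFrame({
    "Num": [20008526, 20008534, 20008636],
    "First_Date": pd.to_datetime(["2013-07-03", "2014-03-25", "2015-07-15"]),
    "Last_Date": pd.to_datetime(["2013-07-18", "2014-05-05", "2015-08-18"]),
})
period_beg = dt.datetime(2015, 7, 1)
period_end = dt.datetime(2015, 9, 30)

# Vectorized: each between() call checks a whole column at once
# (both endpoints are included by default).
vectorized = (
    df["First_Date"].between(period_beg, period_end)
    & df["Last_Date"].between(period_beg, period_end)
)

# Row-by-row equivalent, shown only for comparison.
looped = [
    period_beg <= row.First_Date <= period_end
    and period_beg <= row.Last_Date <= period_end
    for row in df.itertuples(index=False)
]

df["TimeCheck"] = vectorized
print(vectorized.tolist() == looped)
```

Inside the loop, and is fine because each row.First_Date is a single timestamp, so every comparison yields one plain boolean.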