Split a dataframe by time column - pandas
I want to select the right part of the dataset, as illustrated by the following example.
Input df:
id_B,ts_B,value
id1,2017-04-27 01:35:30,0
id1,2017-04-27 01:35:40,0
id1,2017-04-27 01:35:50,1
id1,2017-04-27 01:36:00,4
id1,2017-04-27 01:36:10,5
id1,2017-04-27 01:36:20,100
id1,2017-04-27 01:36:30,155
id1,2017-04-27 01:36:40,235
id1,2017-04-27 01:36:50,0
id1,2017-04-27 01:36:60,0
id1,2017-04-27 01:37:00,2353
id1,2017-04-27 01:37:10,221
id1,2017-04-27 01:37:20,2432
id1,2017-04-27 01:37:30,2654
id1,2017-04-27 01:37:40,12
id1,2017-04-27 01:37:50,5
id1,2017-04-27 01:38:00,5
id1,2017-04-27 01:38:10,23
id1,2017-04-27 01:38:20,5
id1,2017-04-27 01:38:30,2
id1,2017-04-27 01:38:40,2
id1,2017-04-27 01:38:50,1
id1,2017-04-27 01:39:00,0
id1,2017-04-27 01:39:10,0
id1,2017-04-27 01:39:20,0
id1,2017-04-27 01:39:30,0
id1,2017-04-27 01:39:40,0
id1,2017-04-27 01:39:50,0
id1,2017-04-27 01:40:00,0
id1,2017-04-27 01:40:10,1
id1,2017-04-27 01:40:20,5
id1,2017-04-27 01:40:30,221
id1,2017-04-27 01:40:40,2432
id1,2017-04-27 01:40:50,2654
id1,2017-04-27 01:40:60,12
id1,2017-04-27 01:41:00,5
id1,2017-04-27 01:41:10,5
id1,2017-04-27 01:41:20,23
id1,2017-04-27 01:41:30,5
id1,2017-04-27 01:41:40,2
id1,2017-04-27 01:41:50,1
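Consider the following: segment_number = 1
duration = 3 minutes
I want to select the first segment of the dataframe, starting at the first row where df.value is nonzero and ending at the last row that still falls within the 3-minute duration.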
Output:
id1,2017-04-27 01:35:50,1
id1,2017-04-27 01:36:00,4
id1,2017-04-27 01:36:10,5
id1,2017-04-27 01:36:20,100
id1,2017-04-27 01:36:30,155
id1,2017-04-27 01:36:40,235
id1,2017-04-27 01:36:50,0
id1,2017-04-27 01:36:60,0
id1,2017-04-27 01:37:00,2353
id1,2017-04-27 01:37:10,221
id1,2017-04-27 01:37:20,2432
id1,2017-04-27 01:37:30,2654
id1,2017-04-27 01:37:40,12
id1,2017-04-27 01:37:50,5
id1,2017-04-27 01:38:00,5
id1,2017-04-27 01:38:10,23
id1,2017-04-27 01:38:20,5
id1,2017-04-27 01:38:30,2
id1,2017-04-27 01:38:40,2
id1,2017-04-27 01:38:50,1
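Consider the following: segment_number = 2
duration = 1.40 minutes
I want to select the second segment of the dataframe, starting at the first row after the previous segment where df.value is nonzero and ending at the last row that still falls within the 1.40-minute duration.
Output: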
id1,2017-04-27 01:40:10,1
id1,2017-04-27 01:40:20,5
id1,2017-04-27 01:40:30,221
id1,2017-04-27 01:40:40,2432
id1,2017-04-27 01:40:50,2654
id1,2017-04-27 01:40:60,12
id1,2017-04-27 01:41:00,5
id1,2017-04-27 01:41:10,5
id1,2017-04-27 01:41:20,23
id1,2017-04-27 01:41:30,5
id1,2017-04-27 01:41:40,2
id1,2017-04-27 01:41:50,1
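So far, I have indexed the df by ts_B using `pd.to_datetime` and `set_index`, and I use a variable `last_end_point` to keep track of the index where the previous segment ended.
But I am not getting the correct output.
Any help would be greatly appreciated.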
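One thing that may get in the way of the `pd.to_datetime` step: the sample contains timestamps such as `01:36:60` and `01:40:60`, which are not valid clock times and may not parse directly. Below is a minimal sketch of one way to handle them, assuming the intent is to roll a seconds field of 60 over into the next minute. The helper name `parse_ts` is hypothetical, and the rollover rule is an assumption, not something stated in the question.

```python
import pandas as pd

def parse_ts(ts: pd.Series) -> pd.Series:
    """Parse 'YYYY-MM-DD HH:MM:SS' strings, tolerating a seconds field of 60."""
    # Parse only up to the minute, then add the seconds as an offset, so that
    # '01:36:60' becomes 01:37:00 instead of failing to parse.
    minute_part = pd.to_datetime(ts.str.slice(0, 16), format="%Y-%m-%d %H:%M")
    seconds = pd.to_timedelta(ts.str.slice(17).astype(int), unit="s")
    return minute_part + seconds

# Three consecutive rows copied from the sample input.
raw = pd.Series([
    "2017-04-27 01:36:50",
    "2017-04-27 01:36:60",
    "2017-04-27 01:37:00",
])
parsed = parse_ts(raw)
print(parsed)
```

Note that under this assumption the `01:36:60` row ends up with the same timestamp as the `01:37:00` row that follows it, so the time index is no longer unique. Selecting rows by position or by a boolean mask is then safer than label-based lookups on the index.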
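So, you want to split your df by decreasing time intervals?
Yes, sort of. More specifically, I want to split it by duration and starting point: the first segment starts from the beginning, and the second starts from the index of the last row of the previous segment.
Sorry, I meant the last row of the previous segment + 1. But a segment should never start with df.value = 0; it should always start at the first nonzero value.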
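The rule described in the question and comments (start at the first nonzero value at or after the row following the previous segment, then keep every row within the duration) could be sketched roughly as follows. This is only an illustration under some assumptions: `ts_B` is already a datetime column sorted in ascending order, the end of the window is inclusive, and the function name `select_segment` and its parameters are hypothetical. The demo frame is a shortened, made-up sample with 10-second spacing, not the data from the question.

```python
import pandas as pd

def select_segment(df, duration, start_pos=0, ts_col="ts_B", value_col="value"):
    """Return (segment, next_start_pos).

    The segment begins at the first row at or after positional index
    `start_pos` whose value is nonzero, and keeps every row whose timestamp
    lies within `duration` of that first row (end inclusive).
    """
    nonzero = df[value_col].iloc[start_pos:].to_numpy() != 0
    if not nonzero.any():
        # No nonzero value left: return an empty frame and signal the end.
        return df.iloc[0:0], len(df)
    first = start_pos + int(nonzero.argmax())  # position of first nonzero row
    t0 = df[ts_col].iloc[first]
    in_window = (df[ts_col].iloc[first:] <= t0 + duration).to_numpy()
    # Rows are assumed sorted by time, so the window is a contiguous run.
    n_rows = int(in_window.sum())
    return df.iloc[first:first + n_rows], first + n_rows

# Made-up demo data: 12 rows, 10 seconds apart.
df = pd.DataFrame({
    "id_B": "id1",
    "ts_B": pd.date_range("2017-04-27 01:35:30", periods=12,
                          freq=pd.Timedelta(seconds=10)),
    "value": [0, 0, 1, 4, 5, 0, 0, 0, 0, 2, 3, 0],
})

# Segment 1: starts at the first nonzero value, covers 30 seconds.
seg1, last_end_point = select_segment(df, pd.Timedelta(seconds=30))
# Segment 2: resumes after segment 1, skips leading zeros, covers 10 seconds.
seg2, last_end_point2 = select_segment(df, pd.Timedelta(seconds=10),
                                       start_pos=last_end_point)
print(seg1)
print(seg2)
```

For the durations in the question, `pd.Timedelta(minutes=3)` would be passed for the first segment. If "1.40 minutes" is read as 1 minute 40 seconds, which is a guess based on the expected output running from 01:40:10 to 01:41:50, the second call would use `pd.Timedelta(minutes=1, seconds=40)`. Working with positions rather than index labels keeps the bookkeeping for `last_end_point` simple even when timestamps repeat.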