2015-10-27 150 views
6

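Pandas TimeSeries resample produces NaN

I am resampling a pandas TimeSeries. The time series consists of binary values (it is a categorical variable) and has no missing values, but NaNs appear after resampling. How is this possible?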

I can't post any example data here because it is sensitive information, but I create and resample the series as follows:

series = pd.Series(data, ts) 
series_rs = series.resample('60T', how='mean') 

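If you are upsampling, the default is to introduce NaN values. Beyond that, it is hard to comment further without representative sample code. – EdChum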

Answer

6

Upsampling converts the series to regular time intervals, so any interval that contains no samples gets NaN.
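One way to confirm this on your own data is to count the samples that fall into each bin: every bin with a count of zero shows up as NaN in the mean. A minimal sketch, assuming a recent pandas version (where `resample` returns an object you call `.mean()` or `.count()` on) and made-up binary data with irregular timestamps:

```python
import pandas as pd

# Hypothetical binary series with irregular timestamps (illustrative data only)
ts = pd.to_datetime([
    '2015-01-01 00:10', '2015-01-01 00:50',
    '2015-01-01 03:05', '2015-01-01 03:30',
])
series = pd.Series([1, 0, 1, 1], index=ts)

resampler = series.resample('60min')
means = resampler.mean()    # NaN for hours that contain no samples
counts = resampler.count()  # number of samples per hour

# Hours with no samples are the ones whose mean is NaN
empty_bins = counts[counts == 0].index
print(means)
print(list(empty_bins))
```

If `empty_bins` is not empty, the NaNs come from gaps in the timestamps rather than from missing values in the data itself.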

You can fill the missing values backward with fill_method='bfill', or forward with fill_method='ffill' or fill_method='pad'.

import pandas as pd 

ts = pd.date_range('1/1/2015', periods=10, freq='100T') 
data = range(10) 
series = pd.Series(data, ts) 
print(series)
#2015-01-01 00:00:00 0 
#2015-01-01 01:40:00 1 
#2015-01-01 03:20:00 2 
#2015-01-01 05:00:00 3 
#2015-01-01 06:40:00 4 
#2015-01-01 08:20:00 5 
#2015-01-01 10:00:00 6 
#2015-01-01 11:40:00 7 
#2015-01-01 13:20:00 8 
#2015-01-01 15:00:00 9 
#Freq: 100T, dtype: int64 
# how= and fill_method= are keywords of the older (pre-0.18) pandas resample API
series_rs = series.resample('60T', how='mean')
print(series_rs)
#2015-01-01 00:00:00  0 
#2015-01-01 01:00:00  1 
#2015-01-01 02:00:00 NaN 
#2015-01-01 03:00:00  2 
#2015-01-01 04:00:00 NaN 
#2015-01-01 05:00:00  3 
#2015-01-01 06:00:00  4 
#2015-01-01 07:00:00 NaN 
#2015-01-01 08:00:00  5 
#2015-01-01 09:00:00 NaN 
#2015-01-01 10:00:00  6 
#2015-01-01 11:00:00  7 
#2015-01-01 12:00:00 NaN 
#2015-01-01 13:00:00  8 
#2015-01-01 14:00:00 NaN 
#2015-01-01 15:00:00  9 
#Freq: 60T, dtype: float64 
series_rs = series.resample('60T', how='mean', fill_method='bfill')
print(series_rs)
#2015-01-01 00:00:00 0 
#2015-01-01 01:00:00 1 
#2015-01-01 02:00:00 2 
#2015-01-01 03:00:00 2 
#2015-01-01 04:00:00 3 
#2015-01-01 05:00:00 3 
#2015-01-01 06:00:00 4 
#2015-01-01 07:00:00 5 
#2015-01-01 08:00:00 5 
#2015-01-01 09:00:00 6 
#2015-01-01 10:00:00 6 
#2015-01-01 11:00:00 7 
#2015-01-01 12:00:00 8 
#2015-01-01 13:00:00 8 
#2015-01-01 14:00:00 9 
#2015-01-01 15:00:00 9 
#Freq: 60T, dtype: float64 
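On newer pandas versions the same steps are written as chained method calls, because the `how=` and `fill_method=` keywords were deprecated in pandas 0.18 and later removed. A sketch of what should be the equivalent, assuming a recent pandas release and using the `'min'` frequency alias in place of `'T'`:

```python
import pandas as pd

ts = pd.date_range('2015-01-01', periods=10, freq='100min')
series = pd.Series(range(10), index=ts)

# Counterpart of resample('60T', how='mean'): hours without samples stay NaN
series_mean = series.resample('60min').mean()

# Counterpart of fill_method='bfill' / 'ffill': fill the gaps after aggregating
series_bfill = series_mean.bfill()
series_ffill = series_mean.ffill()

print(series_mean.isna().sum())       # 6 empty hourly bins
print(series_bfill.head(4).tolist())  # [0.0, 1.0, 2.0, 2.0]
print(series_ffill.head(4).tolist())  # [0.0, 1.0, 1.0, 2.0]
```

Backward fill takes the next available value and forward fill the previous one, which is why the two results differ at 02:00.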

Thanks. That solved it. –


Great. You can upvote or accept it - [info](http://stackoverflow.com/tour) – jezrael


What do the different fill methods do? The pandas documentation on them is rather limited. ffill and bfill are self-explanatory, but what about pad? –
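
On the pad question: as far as I know, 'pad' is simply another name for 'ffill' in pandas (and 'backfill' another name for 'bfill'). A small sketch that checks this with `reindex`, which accepts these method names; the data is made up for illustration:

```python
import pandas as pd

ts = pd.date_range('2015-01-01', periods=3, freq='100min')
series = pd.Series([0, 1, 2], index=ts)

# Target hourly grid: 00:00, 01:00, 02:00, 03:00
hourly = pd.date_range('2015-01-01', periods=4, freq='60min')

padded = series.reindex(hourly, method='pad')
ffilled = series.reindex(hourly, method='ffill')
backfilled = series.reindex(hourly, method='backfill')
bfilled = series.reindex(hourly, method='bfill')

print(padded.equals(ffilled))      # True: 'pad' behaves like 'ffill'
print(backfilled.equals(bfilled))  # True: 'backfill' behaves like 'bfill'
print(padded.tolist())             # [0, 0, 1, 1]
print(bfilled.tolist())            # [0, 1, 2, 2]
```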