2017-10-28 138 views

Unable to resample and then plot a Pandas DataFrame

I have been trying to plot some simple resampled data from a Pandas DataFrame. This is my initial code:

import pandas as pd 
import numpy as np 
from datetime import datetime, timedelta 

# Extra plotly bits 
import plotly 
import plotly.plotly as py 
import plotly.graph_objs as go 

date_today = datetime.now() 
days = pd.date_range(date_today, date_today + timedelta(56), freq='D') 

np.random.seed(seed=1111) 
data = np.random.randint(1, high=100, size=len(days)) 
df = pd.DataFrame({'date': days, 'value': data}) 

When I run print df, I get this:

      date value 
0 2017-10-28 17:13:23.867396  29 
1 2017-10-29 17:13:23.867396  56 
2 2017-10-30 17:13:23.867396  82 
3 2017-10-31 17:13:23.867396  13 
4 2017-11-01 17:13:23.867396  35 
5 2017-11-02 17:13:23.867396  53 
6 2017-11-03 17:13:23.867396  25 
7 2017-11-04 17:13:23.867396  23 
8 2017-11-05 17:13:23.867396  21 
9 2017-11-06 17:13:23.867396  12 
10 2017-11-07 17:13:23.867396  15 
... 
48 2017-12-15 17:13:23.867396  1 
49 2017-12-16 17:13:23.867396  88 
50 2017-12-17 17:13:23.867396  94 
51 2017-12-18 17:13:23.867396  48 
52 2017-12-19 17:13:23.867396  26 
53 2017-12-20 17:13:23.867396  65 
54 2017-12-21 17:13:23.867396  53 
55 2017-12-22 17:13:23.867396  54 
56 2017-12-23 17:13:23.867396  76 

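I can plot this easily (the red line in the example image below). The problem starts when I try to add a second data layer: a downsampled version of the value/date relationship, for example taking one point every 5 days and plotting those points.

To do that, I created a resampled copy of my DataFrame: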

df_sampled = df.set_index('date').resample('5D').mean() 

When I run print df_sampled, I get:

      value 
date 
2017-10-28 17:32:39.622881 43.0 
2017-11-02 17:32:39.622881 26.8 
2017-11-07 17:32:39.622881 26.6 
2017-11-12 17:32:39.622881 59.4 
2017-11-17 17:32:39.622881 66.8 
2017-11-22 17:32:39.622881 33.6 
2017-11-27 17:32:39.622881 27.8 
2017-12-02 17:32:39.622881 64.4 
2017-12-07 17:32:39.622881 43.2 
2017-12-12 17:32:39.622881 64.4 
2017-12-17 17:32:39.622881 57.2 
2017-12-22 17:32:39.622881 65.0 
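
As a side note, the 5-day means can be checked by hand. This is a minimal sketch, assuming the first ten values printed above and midnight timestamps (a simplification, since the original timestamps carry a time of day); it is not the code from the question.

```python
import pandas as pd

# First ten values from the printed DataFrame; midnight timestamps are an
# assumption made here so that the 5-day bins are unambiguous.
days = pd.date_range('2017-10-28', periods=10, freq='D')
values = [29, 56, 82, 13, 35, 53, 25, 23, 21, 12]
df = pd.DataFrame({'date': days, 'value': values})

# Each 5-day bin is reduced to the mean of the rows that fall into it.
df_sampled = df.set_index('date').resample('5D').mean()

# The same means computed by hand from consecutive groups of five values.
by_hand = [sum(values[0:5]) / 5, sum(values[5:10]) / 5]

print(df_sampled)
print(by_hand)
```

The two hand-computed means, 43.0 and 26.8, agree with the first two rows of the resampled output above.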

After this, I can no longer plot it; the date column seems to have been broken. With plotly:

x = df_sampled['date'], 
    y = df_sampled['value'], 

I get this error:

File "interpolation.py", line 36, in <module> 
    x = df_sampled['date'], 
... 
KeyError: 'date' 

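How can I fix this? Basically, I am trying to create the image below. The red line is my original data, and the blue line is the downsampled and smoothed version.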

[Image: example plot, original data in red and the downsampled, smoothed version in blue]

--- UPDATE ---

The answer provided below works, and I get the following result:

[Image: resulting plot]

Answer

date is not a column but the index, so you need:

x = df_sampled.index 
y = df_sampled['value'] 
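
A quick way to confirm this is to inspect the resampled frame. The sketch below assumes a small ten-row stand-in for the question's data (the dates and values are illustrative only):

```python
import pandas as pd

# Small stand-in for the question's DataFrame (illustrative data).
days = pd.date_range('2017-10-28', periods=10, freq='D')
df = pd.DataFrame({'date': days,
                   'value': [29, 56, 82, 13, 35, 53, 25, 23, 21, 12]})

df_sampled = df.set_index('date').resample('5D').mean()

# After set_index, 'date' is no longer among the columns...
print(list(df_sampled.columns))
# ...it is the name of the index instead, which is why
# df_sampled['date'] raises KeyError.
print(df_sampled.index.name)

# The index can be passed to the plot directly as the x values.
x = df_sampled.index
y = df_sampled['value']
```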

Or turn the index back into a column with reset_index:

df_sampled = df.set_index('date').resample('5D').mean().reset_index() 
#alternative 
#df_sampled = df.resample('5D', on='date').mean().reset_index() 

x = df_sampled['date'] 
y = df_sampled['value'] 
+1

This solution worked for me. Thanks! – symbolix