2017-02-04 74 views

How can I search a pandas DataFrame to populate another DataFrame?

I have data of the form:

President    Years Executive Orders 
George Washington 1789-1797  8 
John Adams   1797-1801  1 
Thomas Jefferson 1801-1809  4 
       ... 

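The years are in string format. I would like to create a new DataFrame that is resampled for every year, like the following, so that I can plot the executive orders over the years (I want to interpolate because the data is not given per year, for example between 1801 and 1809):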

Year Executive Orders 
1789   8 
1790   0 
1791   0 
... 

Basically, I want to look up each date of the second DataFrame in the first DataFrame and see how many orders there were. Any ideas?

Thanks

Answers

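import pandas as pd
from io import StringIO

data = '''\
President    Years Executive Orders
George Washington 1789-1797  8
John Adams   1797-1801  1
Thomas Jefferson 1801-1809  4
'''
# Splitting on whitespace yields four fields per row, so the year range
# lands under 'Executive' and the order count under 'Orders'.
df = pd.read_csv(StringIO(data), sep=r'\s+')

# Split the 'YYYY-YYYY' range into start and end years and parse them as dates.
df[['From', 'To']] = df['Executive'].str.split('-', expand=True)
df['From'] = pd.to_datetime(df['From'], format='%Y')
df['To'] = pd.to_datetime(df['To'], format='%Y')

# Index the order counts by the start date of each term.
df_orders = df[['Orders', 'From']].set_index('From')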

This results in the following DataFrame:

  Orders 
From    
1789-01-01  8 
1797-01-01  1 
1801-01-01  4 

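Because the index is now a date column, resample can be used to resample the data to a yearly frequency. See the docs for details on how to resample data.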

# 'YS' is the year-start alias; older pandas versions also accept 'AS'.
df_orders_resampled = df_orders.resample('YS').sum().fillna(0)

      Orders 
From    
1789-01-01  8.0 
1790-01-01  0.0 
1791-01-01  0.0 
1792-01-01  0.0 
1793-01-01  0.0 
1794-01-01  0.0 
1795-01-01  0.0 
1796-01-01  0.0 
1797-01-01  1.0 
1798-01-01  0.0 
1799-01-01  0.0 
1800-01-01  0.0 
1801-01-01  4.0 
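
Note that the resampled output above stops at 1801, the start of the last term in the sample. If a plain integer Year column like the one in the question is preferred, and the range should run through the last end year, reindexing over the full range of years is one option. This is a minimal sketch, not part of the original answer: the column names start, end and orders are hypothetical, and it assumes that years without an entry should be 0.

```python
import pandas as pd

# Hypothetical column names; the values are taken from the question's table.
terms = pd.DataFrame({
    "start": [1789, 1797, 1801],
    "end": [1797, 1801, 1809],
    "orders": [8, 1, 4],
})

# Index the order counts by the first year of each term.
by_start = terms.set_index("start")["orders"]

# Reindex over every year from the first start to the last end,
# filling years that have no entry with 0 (assumption).
years = range(terms["start"].min(), terms["end"].max() + 1)
per_year = by_start.reindex(years, fill_value=0).rename_axis("Year")

# Turn the Series into a two-column table: Year, Executive Orders.
result = per_year.rename("Executive Orders").reset_index()
print(result.head())
```

Because the index holds integers rather than timestamps, this avoids date parsing entirely, at the cost of losing the datetime-based resample machinery.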

df_orders_resampled.plot() 

[Image: the resulting plot]
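
The question also mentions interpolating across each term. One possible reading, sketched here, is to spread each president's total evenly over the years of the term instead of placing it all in the first year. This is an assumption about the intent and not something the answer above does; the column names are hypothetical, and treating the end year as exclusive is an illustrative choice.

```python
import pandas as pd

# Hypothetical column names; the values are taken from the question's table.
terms = pd.DataFrame({
    "start": [1789, 1797, 1801],
    "end": [1797, 1801, 1809],
    "orders": [8, 1, 4],
})

rows = []
for term in terms.itertuples(index=False):
    # End year is treated as exclusive so overlapping boundary years
    # (e.g. 1797) are counted only once (assumption).
    n_years = term.end - term.start
    for year in range(term.start, term.end):
        rows.append({"Year": year, "Executive Orders": term.orders / n_years})

# One row per year, with each term's total divided evenly across its years.
spread = pd.DataFrame(rows).set_index("Year")
print(spread.head())
```

Calling spread.plot() on the result would then draw a step-like series per term rather than isolated spikes at the start years.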
