2014-01-22 43 views
2

How can I fill the last row of each group in pandas? I have a DataFrame df in which the last row of each group (grouped by STK_ID) is NaN:

>>> print df
                   sales  opr_pft  net_pft
STK_ID RPT_Date
002138 20130331   2.0703   0.3373   0.2829
       20130630      NaN      NaN      NaN
       20130930   7.4993   1.2248   1.1630
       20140122      NaN      NaN      NaN
600004 20130331  11.8429   3.0816   2.1637
       20130630  24.6232   6.2152   4.5135
       20130930  37.9673   9.2088   6.6463
       20140122      NaN      NaN      NaN
600809 20130331  27.9517   9.9426   7.5182
       20130630  40.6460  13.9414   9.8572
       20130930  53.0501  16.8081  11.8605
       20140122      NaN      NaN      NaN

Now I want to fill only the last row of each group with the values from the row before it. The result should look like this:

                   sales  opr_pft  net_pft
STK_ID RPT_Date
002138 20130331   2.0703   0.3373   0.2829
       20130630      NaN      NaN      NaN  **(do not fill this row)**
       20130930   7.4993   1.2248   1.1630
       20140122   7.4993   1.2248   1.1630
600004 20130331  11.8429   3.0816   2.1637
       20130630  24.6232   6.2152   4.5135
       20130930  37.9673   9.2088   6.6463
       20140122  37.9673   9.2088   6.6463
600809 20130331  27.9517   9.9426   7.5182
       20130630  40.6460  13.9414   9.8572
       20130930  53.0501  16.8081  11.8605
       20140122  53.0501  16.8081  11.8605

I almost got there with df.groupby(level=0).apply(lambda grp: grp.fillna(method='ffill')), which produces:

                   sales  opr_pft  net_pft
STK_ID RPT_Date
002138 20130331   2.0703   0.3373   0.2829
       20130630   2.0703   0.3373   0.2829
       20130930   7.4993   1.2248   1.1630
       20140122   7.4993   1.2248   1.1630
600004 20130331  11.8429   3.0816   2.1637
       20130630  24.6232   6.2152   4.5135
       20130930  37.9673   9.2088   6.6463
       20140122  37.9673   9.2088   6.6463
600809 20130331  27.9517   9.9426   7.5182
       20130630  40.6460  13.9414   9.8572
       20130930  53.0501  16.8081  11.8605
       20140122  53.0501  16.8081  11.8605

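That is not what I want: it forward-fills every NaN row inside each group (note that row 20130630 of 002138 was filled too), not just the last one. So how do I fill only the last row of each group in pandas?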

Answer

5

You can pass a custom function to groupby's apply:

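def f(g):
    # Position of the last row within this group.
    last = len(g.values) - 1
    # Overwrite the last row with the row just before it.
    g.iloc[last, :] = g.iloc[last - 1, :]
    return g

print(df.groupby(level=0).apply(f))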

Output:

                   sales  opr_pft  net_pft
STK_ID RPT_Date
2138   20130331   2.0703   0.3373   0.2829
       20130630      NaN      NaN      NaN
       20130930   7.4993   1.2248   1.1630
       20140122   7.4993   1.2248   1.1630
600004 20130331  11.8429   3.0816   2.1637
       20130630  24.6232   6.2152   4.5135
       20130930  37.9673   9.2088   6.6463
       20140122  37.9673   9.2088   6.6463
600809 20130331  27.9517   9.9426   7.5182
       20130630  40.6460  13.9414   9.8572
       20130930  53.0501  16.8081  11.8605
       20140122  53.0501  16.8081  11.8605
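
For anyone who wants to reproduce this on a recent pandas, here is a minimal self-contained sketch of the same idea. The frame is rebuilt by hand from the question's printout, and the helper name copy_prev_into_last is made up for illustration. It assumes STK_ID is stored as strings, copies each group before modifying it, and passes group_keys=False so that the result keeps the original two-level index; none of that is in the answer above, so treat it as one possible adaptation rather than the answer's exact code.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question (STK_ID kept as strings).
index = pd.MultiIndex.from_product(
    [["002138", "600004", "600809"],
     ["20130331", "20130630", "20130930", "20140122"]],
    names=["STK_ID", "RPT_Date"],
)
nan = np.nan
df = pd.DataFrame(
    [
        [2.0703, 0.3373, 0.2829],
        [nan, nan, nan],
        [7.4993, 1.2248, 1.1630],
        [nan, nan, nan],
        [11.8429, 3.0816, 2.1637],
        [24.6232, 6.2152, 4.5135],
        [37.9673, 9.2088, 6.6463],
        [nan, nan, nan],
        [27.9517, 9.9426, 7.5182],
        [40.6460, 13.9414, 9.8572],
        [53.0501, 16.8081, 11.8605],
        [nan, nan, nan],
    ],
    index=index,
    columns=["sales", "opr_pft", "net_pft"],
)


def copy_prev_into_last(g):
    # Hypothetical helper: work on a copy so the input frame is not mutated.
    g = g.copy()
    # A single-row group has no previous row, so leave it unchanged.
    if len(g) > 1:
        g.iloc[-1, :] = g.iloc[-2, :].to_numpy()
    return g


# group_keys=False avoids adding the group key as an extra index level.
result = df.groupby(level=0, group_keys=False).apply(copy_prev_into_last)
print(result)
```

Note that this overwrites the last row unconditionally, whether or not it is NaN, and leaves NaN rows in the middle of a group untouched.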
+0

Thanks, it works. – bigbug

+0

Change 'g.iloc[last, :] = g.iloc[last-1, :]' to 'g.iloc[last-1:, :].fillna(method='ffill', inplace=True)' – bigbug
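
The variant in this comment forward-fills within the last two rows of each group instead of copying the previous row wholesale, so a last row that already holds values keeps them. A rough sketch of that idea follows. The small frame and the name ffill_last_row are invented for illustration, and the code uses .ffill() plus an explicit assignment rather than fillna(method='ffill', inplace=True), because recent pandas versions deprecate the method argument and an in-place fill on a slice may not write back to the group.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group B has an interior gap and a partly filled last row.
index = pd.MultiIndex.from_tuples(
    [("A", 1), ("A", 2), ("A", 3),
     ("B", 1), ("B", 2), ("B", 3), ("B", 4)],
    names=["key", "step"],
)
df = pd.DataFrame(
    {
        "x": [1.0, 2.0, np.nan, 3.0, np.nan, 4.0, np.nan],
        "y": [10.0, 20.0, np.nan, 30.0, np.nan, 40.0, 99.0],
    },
    index=index,
)


def ffill_last_row(g):
    # Hypothetical helper: copy first so the input frame is not mutated.
    g = g.copy()
    # Forward-fill only across the last two rows; earlier gaps are untouched.
    g.iloc[-2:] = g.iloc[-2:].ffill().to_numpy()
    return g


out = df.groupby(level=0, group_keys=False).apply(ffill_last_row)
print(out)
```

Here the last row of A is filled from the row before it, the interior gap at ("B", 2) stays NaN, and the existing 99.0 in the last row of B is preserved while its missing x value is filled.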