2015-05-21

Using the result of groupby().sum() to manipulate the original DataFrame

I have a DataFrame of transactions like this:

    branch      daqu  from    to       style  color  size  amount
5  huadong  shanghai  C30C  C30F  EEBW52301M     39   165       3
8  huadong  shanghai  C30F  C306  EEBW52301M     51   160       2
2  huadong  shanghai  C30G  C306  EEBW52301M     39   165      10
9  huadong  shanghai  C30G  C30C  EEBW52301M     51   170       1
1  huadong  shanghai  C30G  C30F  EEBW52301M     39   160       7
7  huadong  shanghai  C30J  C30D  EEBW52301M     39   170       2
6  huadong  shanghai  C30J  C30F  EEBW52301M     39   170       4
3  huadong  shanghai  C30K  C306  EEBW52301M     39   165       1
0  huadong  shanghai  C30K  C30F  EEBW52301M     39   160       7
4  huadong  shanghai  C30K  C30F  EEBW52301M     39   165       6

Each row means we need to send 'amount' units of the product with the given style/color/size from the 'from' store to the 'to' store.

Then I grouped by 'from' and 'to' so that I can see how many products will go into each box.

print(dh_final[['from', 'to', 'amount']].groupby(['from', 'to']).sum())

           amount
from to
C30C C30F       3
C30F C306       2
C30G C306      10
     C30C       1
     C30F       7
C30J C30D       2
     C30F       4
C30K C306       1
     C30F      13

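Finally, if a box going from one store to another contains fewer than 5 products, I want to cancel the transactions related to that box. That is, I have to remove those rows from the original DataFrame. If I did it manually, the result should look like this: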

    branch      daqu  from    to       style  color  size  amount
2  huadong  shanghai  C30G  C306  EEBW52301M     39   165      10
1  huadong  shanghai  C30G  C30F  EEBW52301M     39   160       7
0  huadong  shanghai  C30K  C30F  EEBW52301M     39   160       7
4  huadong  shanghai  C30K  C30F  EEBW52301M     39   165       6

Is there a simple way to do this? How can I use the result of groupby().sum() to manipulate the original DataFrame?

Answer

If I understand correctly, what you want is:

In [53]: 
df['sum'] = df.groupby(['from', 'to'])['amount'].transform('sum') 
df[df['sum'] > 5] 

Out[53]: 
    branch      daqu  from    to       style  color  size  amount  sum
2  huadong  shanghai  C30G  C306  EEBW52301M     39   165      10   10
1  huadong  shanghai  C30G  C30F  EEBW52301M     39   160       7    7
0  huadong  shanghai  C30K  C30F  EEBW52301M     39   160       7   13
4  huadong  shanghai  C30K  C30F  EEBW52301M     39   165       6   13

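So here I call transform on the groupby object, which returns a Series aligned with the original df. I add it as a 'sum' column, and then I can filter as usual.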
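To make the difference between sum() and transform('sum') concrete, here is a minimal, self-contained sketch. It rebuilds only the 'from', 'to' and 'amount' columns of the question's table (the other columns are left out for brevity), and the names per_box, per_row and kept are illustrative, not from the original post.

```python
import pandas as pd

# Same rows and index labels as the question's table, reduced to three columns.
df = pd.DataFrame(
    {
        "from": ["C30C", "C30F", "C30G", "C30G", "C30G",
                 "C30J", "C30J", "C30K", "C30K", "C30K"],
        "to": ["C30F", "C306", "C306", "C30C", "C30F",
               "C30D", "C30F", "C306", "C30F", "C30F"],
        "amount": [3, 2, 10, 1, 7, 2, 4, 1, 7, 6],
    },
    index=[5, 8, 2, 9, 1, 7, 6, 3, 0, 4],
)

grouped = df.groupby(["from", "to"])["amount"]

per_box = grouped.sum()             # one value per (from, to) pair
per_row = grouped.transform("sum")  # one value per original row

# per_row shares df's index, so it works directly as a boolean mask on df.
kept = df[per_row > 5]
print(kept)
```

Because per_row has one entry per row of df, no merge back onto the original frame is needed before filtering.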

Edit

Actually, I think you can do this as a one-liner:

In [67]: 
df[df.groupby(['from', 'to'])['amount'].transform('sum') > 5] 

Out[67]: 
    branch      daqu  from    to       style  color  size  amount
2  huadong  shanghai  C30G  C306  EEBW52301M     39   165      10
1  huadong  shanghai  C30G  C30F  EEBW52301M     39   160       7
0  huadong  shanghai  C30K  C30F  EEBW52301M     39   160       7
4  huadong  shanghai  C30K  C30F  EEBW52301M     39   165       6
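As a possible alternative (not part of the answer above), pandas also offers GroupBy.filter, which keeps or drops whole groups based on a predicate. The sketch below again uses only three columns of the question's data. Since the question says boxes with fewer than 5 products are cancelled, it keeps totals `>= 5`; on this data that gives the same rows as `> 5`, because no box totals exactly 5.

```python
import pandas as pd

# Same rows and index labels as the question's table, reduced to three columns.
df = pd.DataFrame(
    {
        "from": ["C30C", "C30F", "C30G", "C30G", "C30G",
                 "C30J", "C30J", "C30K", "C30K", "C30K"],
        "to": ["C30F", "C306", "C306", "C30C", "C30F",
               "C30D", "C30F", "C306", "C30F", "C30F"],
        "amount": [3, 2, 10, 1, 7, 2, 4, 1, 7, 6],
    },
    index=[5, 8, 2, 9, 1, 7, 6, 3, 0, 4],
)

# filter() calls the function once per (from, to) group and keeps every row
# of the groups for which it returns True.
kept = df.groupby(["from", "to"]).filter(lambda box: box["amount"].sum() >= 5)
print(kept)
```

The lambda runs once per group in Python, so on large frames the transform-based mask is likely to be faster; filter mainly reads closer to the wording of the requirement.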