2016-12-23 192 views

Split a pandas DataFrame on a condition

I'm new to pandas, so please forgive me if this sounds naive. I have two DataFrames, df1 and df2:

import pandas as pd

df1 = pd.DataFrame({'key1': ['K0', 'K1', 'K2', 'K3'],
                    'key2': ['K5', 'K4', 'K5', 'K4']})

df2 = pd.DataFrame({'key1': ['K0', 'K1', 'K2', 'K3', 'K9', 'K8', 'K7'],
                    'key2': ['K5', 'K6', 'K5', 'K4', 'K6', 'K4', 'K5'],
                    'A': ['1', '2', '3', '4', '5', '6', '7'],
                    'B': ['8', '9', '10', '11', '12', '13', '14']})

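I want to merge df2 into df1 like this:

final = df1.merge(df2, on=['key1', 'key2'], how='left')

and then get the rows of df2 that are left over as a separate DataFrame.

Any help would be appreciated. Thanks.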

Answer


IIUC, you need an outer join with the parameter indicator=True, and then split the result with boolean indexing:

final = df1.merge(df2, how='outer', indicator=True)
print(final)
  key1 key2    A    B      _merge
0   K0   K5    1    8        both
1   K1   K4  NaN  NaN   left_only
2   K2   K5    3   10        both
3   K3   K4    4   11        both
4   K1   K6    2    9  right_only
5   K9   K6    5   12  right_only
6   K8   K4    6   13  right_only
7   K7   K5    7   14  right_only

print(final[final._merge == 'right_only'])
  key1 key2  A   B      _merge
4   K1   K6  2   9  right_only
5   K9   K6  5  12  right_only
6   K8   K4  6  13  right_only
7   K7   K5  7  14  right_only

print(final[final._merge != 'right_only'])
  key1 key2    A    B     _merge
0   K0   K5    1    8       both
1   K1   K4  NaN  NaN  left_only
2   K2   K5    3   10       both
3   K3   K4    4   11       both

print(final[final._merge == 'right_only'].drop('_merge', axis=1))
  key1 key2  A   B
4   K1   K6  2   9
5   K9   K6  5  12
6   K8   K4  6  13
7   K7   K5  7  14

print(final[final._merge != 'right_only'].drop('_merge', axis=1))
  key1 key2    A    B
0   K0   K5    1    8
1   K1   K4  NaN  NaN
2   K2   K5    3   10
3   K3   K4    4   11
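
For convenience, the steps above can be collected into one self-contained sketch. The helper name split_on_merge and the variable names matched and remaining are hypothetical, chosen only for illustration; the sample frames are the ones from the question. The sketch assumes the key columns are passed explicitly and resets the index of both parts, which the snippets above do not do.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({'key1': ['K0', 'K1', 'K2', 'K3'],
                    'key2': ['K5', 'K4', 'K5', 'K4']})
df2 = pd.DataFrame({'key1': ['K0', 'K1', 'K2', 'K3', 'K9', 'K8', 'K7'],
                    'key2': ['K5', 'K6', 'K5', 'K4', 'K6', 'K4', 'K5'],
                    'A': ['1', '2', '3', '4', '5', '6', '7'],
                    'B': ['8', '9', '10', '11', '12', '13', '14']})


def split_on_merge(left, right, keys):
    """Hypothetical helper: outer-merge, then split on the _merge indicator.

    Returns (matched, remaining), where `matched` holds the rows a left
    merge would produce and `remaining` holds the rows found only in `right`.
    """
    merged = left.merge(right, on=keys, how='outer', indicator=True)
    # Boolean mask marking rows that exist only in the right frame.
    is_right_only = merged['_merge'] == 'right_only'
    matched = (merged[~is_right_only]
               .drop(columns='_merge')
               .reset_index(drop=True))
    remaining = (merged[is_right_only]
                 .drop(columns='_merge')
                 .reset_index(drop=True))
    return matched, remaining


final, rest = split_on_merge(df1, df2, ['key1', 'key2'])
print(final)
print(rest)
```

The row order of an outer merge may differ between pandas versions, so it is safer not to rely on the index values shown in the printed output above; sort by the key columns if a fixed order matters.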

If the output differs from what you need, could you edit the question and add the desired output? Thanks. – jezrael


This answers my question perfectly. Thank you very much; I had overlooked the importance of the indicator flag. –