Replacing values in a column based on values in two other columns using pandas

I tried to find a solution to my problem but came up short. Please let me know if it has already been answered elsewhere.

I have a DataFrame with 4 columns, like this:

'A'     'B'  'C'     'D'
cheese  5    grapes  7
grapes  7    cheese  8
steak   1    eggs    21
eggs    2    steak   1

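The entries in 'C' and 'D' must match the values in 'A' and 'B', but not row by row; for example, if 'cheese' has 5 in 'B', then 'cheese' cannot have 8 in 'D'. Where a pair does not match, its 'C' and 'D' values must be corrected to default values. In this case, 'cheese' should be corrected so that C: default and D: 0. The same goes for eggs (21 in 'D' but 2 in 'B'). Grapes and steak, however, are fine.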

So the output should look like this:

'A'     'B'  'C'      'D'
cheese  5    grapes   7
grapes  7    default  0
steak   1    default  0
eggs    2    steak    1

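My idea was to convert 'A' and 'B' into lists of unique values and then replace 'C' and 'D' based on the values in those lists. I tried every conditional df.replace() trick I could find on Stack Overflow, but none of them worked.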
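The lookup idea described above can be sketched roughly as follows. This is only an illustration, not something from the original post: it assumes the 'A'/'B' pairs are unique, and it assumes that a 'C' entry that never occurs in 'A' should also be reset. The names `expected`, `expected_d` and `mismatch` are made up for the example.

```python
import pandas as pd

# Input data from the question.
df = pd.DataFrame({
    'A': ['cheese', 'grapes', 'steak', 'eggs'],
    'B': [5, 7, 1, 2],
    'C': ['grapes', 'cheese', 'eggs', 'steak'],
    'D': [7, 8, 21, 1],
})

# Lookup from each 'A' entry to its 'B' value (requires unique 'A' entries).
expected = df.set_index('A')['B']

# For every row, the value that 'D' should hold given the row's 'C' entry.
# The result is NaN when the 'C' entry does not occur in 'A' at all.
expected_d = df['C'].map(expected)

# Assumption: a 'C' entry missing from 'A' counts as a mismatch too
# (NaN never compares equal, so .ne() returns True for those rows).
mismatch = expected_d.ne(df['D'])

# Reset the mismatching pairs to the default values.
df.loc[mismatch, 'C'] = 'default'
df.loc[mismatch, 'D'] = 0

print(df)
```

Because the comparison is done per row against a lookup, repeated entries in 'C' with different 'D' values are each checked on their own.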

Many thanks for any help you can provide.

Source

2017-05-13 crimins

Is it possible for column 'C' to have two rows with 'steak'? If so, how should the code behave?

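@ViníciusAguiar: Yes, column 'C' can have multiple rows with any entry. Grapes, steak, eggs, etc. can each appear several times in 'C', possibly with several different corresponding 'D' values. The data is unpredictably dirty. The 'A'/'B' pairs are unique. The code should find every 'C'/'D' pair that does not match an 'A'/'B' pair and correct it to default/0. – crimins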

Setup

import pandas as pd

# Input data from the question, before any correction.
df = pd.DataFrame({'A': {0: 'cheese', 1: 'grapes', 2: 'steak', 3: 'eggs'},
                   'B': {0: 5, 1: 7, 2: 1, 3: 2},
                   'C': {0: 'grapes', 1: 'cheese', 2: 'eggs', 3: 'steak'},
                   'D': {0: 7, 1: 8, 2: 21, 3: 1}})

df
Out[1262]:
        A  B       C   D
0  cheese  5  grapes   7
1  grapes  7  cheese   8
2   steak  1    eggs  21
3    eggs  2   steak   1

Solution

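# Find rows where df.C should be set to 'default'.
# A row keeps its C value if that value does not occur in df.A at all,
# or if its D equals the B value stored alongside that entry in df.A.
df.C = df.apply(lambda x: x.C if ((x.C not in df.A.tolist()) or (x.D == df.loc[df.A == x.C, 'B'].iloc[0])) else 'default', axis=1)
# Set df.D to 0 wherever df.C is now 'default'.
df.loc[df.C == 'default', 'D'] = 0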

df 
Out[1259]: 
        A  B        C  D
0  cheese  5   grapes  7
1  grapes  7  default  0
2   steak  1  default  0
3    eggs  2    steak  1
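For larger frames, a row-wise apply can be slow. A possible vectorised alternative is sketched below. It is not part of the answer above, and it behaves differently in one respect: following the asker's comment, it treats any 'C'/'D' pair that is not an existing 'A'/'B' pair as a mismatch, including 'C' entries that never occur in 'A'. The names `valid`, `merged` and `bad` are made up for the example.

```python
import pandas as pd

# Input data from the question.
df = pd.DataFrame({
    'A': ['cheese', 'grapes', 'steak', 'eggs'],
    'B': [5, 7, 1, 2],
    'C': ['grapes', 'cheese', 'eggs', 'steak'],
    'D': [7, 8, 21, 1],
})

# The valid pairs, taken from A/B and renamed so they line up with C/D.
valid = df[['A', 'B']].drop_duplicates().rename(columns={'A': 'C', 'B': 'D'})

# A left merge keeps every row of df in its original order. The indicator
# column '_merge' says whether the row's (C, D) pair was found in `valid`.
merged = df.merge(valid, on=['C', 'D'], how='left', indicator=True)

# Positional boolean mask of rows whose (C, D) pair has no A/B counterpart.
bad = (merged['_merge'] == 'left_only').to_numpy()

# Reset those rows to the default values.
df.loc[bad, 'C'] = 'default'
df.loc[bad, 'D'] = 0

print(df)
```

Since `valid` holds each pair only once, the merge does not duplicate rows, so the mask lines up with the rows of `df` by position.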

Source

2017-05-13 23:22:37 Allen
