Pandas: compute a logical OR based on multiple columns

Here is the DataFrame I am working with:

{'[email protected]': ['AAA', nan, nan], 
'[email protected]': [nan, 'BBB', nan], 
'[email protected]': [nan, nan, 'CCC'], 
'[email protected]':[1,nan,nan], 
'[email protected]':[nan,2,nan], 
'[email protected]':[nan,nan, 3] 
}

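I want to create a Recipe column that holds, for each row, the value that is not NaN. For example, the value for the first row would be AAA, for the second row BBB, and so on. The DF has other columns as well, but the Recipe column should consider only the 3 columns above.

Source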

2015-12-26 Felix

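So there is always exactly one non-NaN value in each row, and you want to use that value in the new column? – itzy

That is correct. Thanks – Felix

You could use 'df.max()' in this example, but you are probably looking for a more general solution. – itzy

You can use apply with axis=1 to call the any method on each row, provided each row has only one valid value and all the others are NaN (using @Stefan's example):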

In [197]: df 
Out[197]: 
    [email protected] [email protected] [email protected] other_col 
0  AAA  NaN  NaN   1 
1  NaN  BBB  NaN   2 
2  NaN  NaN  CCC   3 

In [199]: df['new'] = df[['[email protected]', '[email protected]', '[email protected]']].apply(lambda x: x.any(), axis=1) 

In [200]: df 
Out[200]: 
    [email protected] [email protected] [email protected] other_col new 
0  AAA  NaN  NaN   1 AAA 
1  NaN  BBB  NaN   2 BBB 
2  NaN  NaN  CCC   3 CCC
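
Whether any() hands back the underlying value rather than True/False can depend on the pandas version, so here is a minimal sketch of an alternative that does not rely on it. It back-fills across the selected columns and takes the first one. The column names recipe_a, recipe_b and recipe_c are hypothetical placeholders, since the real names are obscured in the post.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; the real ones are obscured in the post.
recipe_cols = ["recipe_a", "recipe_b", "recipe_c"]

df = pd.DataFrame({
    "recipe_a": ["AAA", np.nan, np.nan],
    "recipe_b": [np.nan, "BBB", np.nan],
    "recipe_c": [np.nan, np.nan, "CCC"],
    "other_col": [1, 2, 3],
})

# Back-fill along each row so the first selected column ends up holding
# that row's first non-NaN value, then take that column.
df["new"] = df[recipe_cols].bfill(axis=1).iloc[:, 0]

print(df["new"].tolist())
```

Only the three listed columns take part, so other_col is left alone and cannot leak into the result.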

Edit

This looks a bit like a hack, but I think it should work (calling min if the dtype is numeric, and any otherwise):

df['new'] = df[['[email protected]', '[email protected]', '[email protected]']].apply(lambda x: x.min() if x.dtype.kind in 'biufc' else x.any(), axis=1) 

In [551]: df 
Out[551]: 
    [email protected] [email protected] [email protected] [email protected] [email protected] \ 
0    1   NaN   NaN  AAA  NaN 
1   NaN    2   NaN  NaN  BBB 
2   NaN   NaN    3  NaN  NaN 

    [email protected] new 
0  NaN 1 
1  NaN 2 
2  CCC 3

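Note: dtype.kind is a one-character code for the kind of data; 'biufc' covers boolean, signed integer, unsigned integer, float and complex.

Source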
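
If the string columns and the numeric columns are meant to produce separate results, one option is to coalesce each group on its own instead of branching on the dtype. The following is only a sketch under that assumption: the helper first_valid and the column names name_a ... qty_c are hypothetical, and it uses Series.combine_first to keep the first non-NaN value from left to right.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Hypothetical column names; the real ones are obscured in the post.
df = pd.DataFrame({
    "name_a": ["AAA", np.nan, np.nan],
    "name_b": [np.nan, "BBB", np.nan],
    "name_c": [np.nan, np.nan, "CCC"],
    "qty_a": [1, np.nan, np.nan],
    "qty_b": [np.nan, 2, np.nan],
    "qty_c": [np.nan, np.nan, 3],
})


def first_valid(frame, cols):
    """Coalesce cols left to right, keeping the first non-NaN value per row."""
    return reduce(
        lambda left, right: left.combine_first(right),
        [frame[c] for c in cols],
    )


# Handle each group separately so strings and numbers never share a row Series.
df["name"] = first_valid(df, ["name_a", "name_b", "name_c"])
df["qty"] = first_valid(df, ["qty_a", "qty_b", "qty_c"])

print(df[["name", "qty"]])
```

Because each group is reduced column by column, the numeric result keeps a numeric dtype rather than being turned into object.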

2015-12-26 18:27:22

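Thank you, Anton. Works perfectly. – Felix

Anton, this solution works well with strings. It is unreliable for numeric values. What needs to change in your solution to extract the numeric values from the row? Thanks – Felix

Can you give a proper example? You could convert everything to strings with 'astype(str)', but that is not a good approach... –

A simple solution is: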

df = pd.DataFrame({'[email protected]': ['AAA', np.nan, np.nan], '[email protected]': [np.nan, 'BBB', np.nan], '[email protected]': [np.nan, np.nan, 'CCC'], 'other_col': [1, 2, 3]}) 

    [email protected] [email protected] [email protected] other_col 
0  AAA  NaN  NaN   1 
1  NaN  BBB  NaN   2 
2  NaN  NaN  CCC   3

Simply iterate over the rows and use .dropna to get rid of the missing values; you can then write a new DataFrame column like this:

for i, data in df.iterrows(): 
    df.loc[i, 'Recipe'] = data[['[email protected]', '[email protected]', '[email protected]']].dropna().values[0] 

    [email protected] [email protected] [email protected] other_col Recipe 
0  AAA  NaN  NaN   1 AAA 
1  NaN  BBB  NaN   2 BBB 
2  NaN  NaN  CCC   3 CCC
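
The loop above raises an IndexError on any row where all three columns are NaN, because .values[0] then indexes an empty array. A minimal sketch of the same dropna idea without the explicit loop, falling back to NaN for such rows, could look like this. The helper first_non_null and the column names are hypothetical, and the fourth row is added only to show the fallback.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; the real ones are obscured in the post.
recipe_cols = ["recipe_a", "recipe_b", "recipe_c"]

df = pd.DataFrame({
    "recipe_a": ["AAA", np.nan, np.nan, np.nan],
    "recipe_b": [np.nan, "BBB", np.nan, np.nan],
    "recipe_c": [np.nan, np.nan, "CCC", np.nan],
    "other_col": [1, 2, 3, 4],
})


def first_non_null(row):
    """Return the first non-NaN value in the row, or NaN if there is none."""
    valid = row.dropna()
    return valid.iloc[0] if not valid.empty else np.nan


# Adds Recipe as a new column; the original recipe columns stay in the frame.
df["Recipe"] = df[recipe_cols].apply(first_non_null, axis=1)

print(df)
```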

Source

2015-12-26 01:53:13 Stefan

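Thanks, Stefan, but this does not seem to work. I want to get the Recipe column as a column of the DF – Felix

What error do you get? It works for me with pandas 0.17.1. – Stefan

The DF has no Recipe column. The DataFrame still needs to contain 'Recipe @ 123', 'Recipe @ 234' and 'Recipe @ 456' – Felix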
