2016-07-02 20 views
1

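Python Pandas: How do I return the members of a group?

I am reshaping a DataFrame of related people. However, once I have identified the siblings, I can't find a way to write them into a specific column. An example follows: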

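import pandas as pd

cols = ['Name', 'Father', 'Brother']
# 'Brother' starts out as an empty string on every row; it is the column to fill in.
df = pd.DataFrame({'Brother': '',
                   'Father': ['Erick Moon', 'Ralph Docker', 'Erick Moon', 'Stewart Adborn'],
                   'Name': ['John Smith', 'Rodolph Ruppert', 'Mathew Common', 'Patrick French']},
                  columns=cols)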

df
              Name          Father Brother
0       John Smith      Erick Moon
1  Rodolph Ruppert    Ralph Docker
2    Mathew Common      Erick Moon
3   Patrick French  Stewart Adborn

I would like the result to look like this:

              Name          Father        Brother
0       John Smith      Erick Moon  Mathew Common
1  Rodolph Ruppert    Ralph Docker
2    Mathew Common      Erick Moon     John Smith
3   Patrick French  Stewart Adborn

I appreciate any help!

+1

Does the dataset include only males? Can there be more than two brothers? – ayhan

+0

This may be useful: http://pandas.pydata.org/pandas-docs/stable/reshaping.html –

+0

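No, I just made up a toy example. There are women too. Also, there can be more than two siblings. I do want to reshape. I tried groupby, but I couldn't manage to get the other siblings, because it would write them twice... – nicmano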

Answers

1

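Here is an idea you can try: first fill the Brother column with a list of all siblings, including the person themself, and then remove each person from their own list. The code can probably be optimized, but it gives you something to start from: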

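import pandas as pd

# Step 1: for each row, collect every name that shares its father (the person included).
all_children = df.groupby('Father')['Name'].apply(list)
df['Brother'] = df['Father'].map(all_children)

# Step 2: remove each person from their own list.
def deleteSelf(name, siblings):
    return [s for s in siblings if s != name]

# Assign the result back; the original call discarded the return value of apply.
df['Brother'] = [deleteSelf(n, s) for n, s in zip(df['Name'], df['Brother'])]
df

#               Name          Father          Brother
# 0       John Smith      Erick Moon  [Mathew Common]
# 1  Rodolph Ruppert    Ralph Docker               []
# 2    Mathew Common      Erick Moon     [John Smith]
# 3   Patrick French  Stewart Adborn               []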
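
The desired output in the question shows plain names and blanks rather than lists. A minimal sketch of one way to get that, assuming a comma-separated string is acceptable when there are several siblings. The extra row 'Anna Moon' is invented here purely to illustrate the more-than-two case, and the variable name `children` is just an illustration:

```python
import pandas as pd

# Hypothetical sample data: the question's four rows plus one invented
# sibling ('Anna Moon') to show a family with more than two children.
df = pd.DataFrame({
    'Name': ['John Smith', 'Rodolph Ruppert', 'Mathew Common',
             'Patrick French', 'Anna Moon'],
    'Father': ['Erick Moon', 'Ralph Docker', 'Erick Moon',
               'Stewart Adborn', 'Erick Moon'],
})

# Map each father to the list of all of his children.
children = df.groupby('Father')['Name'].apply(list)

# For every row, join the father's other children into one string;
# a person with no siblings gets an empty string.
df['Brother'] = [
    ', '.join(n for n in children[father] if n != name)
    for name, father in zip(df['Name'], df['Father'])
]
```

Note that this removes siblings by name, so two different people with the same name and the same father would hide each other.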
+0

It works great! I arrived at a solution that fits very well! – nicmano

0
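def same_father(me, data):
    # Rows whose father matches the father of the person named `me`
    # (assumes names are unique, since they serve as the index).
    hasdad = data.Father == data.at[me, 'Father']
    # Exclude the person themself.
    notme = data.index != me
    isbro = hasdad & notme
    return data.loc[isbro].index.tolist()

# Index by name so each person can be looked up directly.
df2 = df.set_index('Name')
# Build the sibling list for every name in the index.
df2['Brother'] = [same_father(name, df2) for name in df2.index]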

I think this should work (untested).
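
The function above builds a boolean mask over the whole frame once per person. As a possible alternative, here is a sketch that pairs people up with a self-merge on Father instead. It is not taken from either answer; the names `pairs` and `siblings` and the `_sibling` suffix are chosen here for illustration only:

```python
import pandas as pd

# The question's sample data, without the empty 'Brother' column.
df = pd.DataFrame({
    'Name': ['John Smith', 'Rodolph Ruppert', 'Mathew Common', 'Patrick French'],
    'Father': ['Erick Moon', 'Ralph Docker', 'Erick Moon', 'Stewart Adborn'],
})

# Pair every person with everyone who has the same father (self-join on Father).
pairs = df.merge(df, on='Father', suffixes=('', '_sibling'))

# Drop the pairs that match a person with themself.
pairs = pairs[pairs['Name'] != pairs['Name_sibling']]

# Collect the siblings per person; people without siblings are absent here.
siblings = pairs.groupby('Name')['Name_sibling'].apply(list)

# map() yields NaN for names missing from `siblings`, so use an empty list there.
df['Brother'] = [s if isinstance(s, list) else []
                 for s in df['Name'].map(siblings)]
```

Like the other approaches, this treats the name as the person's identity, so it assumes names are unique within a family.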