How to select multiple columns from different pandas DataFrames

I have 3 pandas DataFrames (similar to the ones shown below). I also have 2 lists: list ID_1 = ['sdf', 'sdfsdf', ...] and list ID_2 = ['kjdf', 'kldfjs', ...].

Table1: 
    ID_1 ID_2 Value 
0 PUFPaY9 NdYWqAJ 0.002 
1 Iu6AxdB qANhGcw 0.01 
2 auESFwW jUEUNdw 0.2345 
3 LWbYpca G3uZ_Rg 0.0835 
4 8fApIAM mVHrayg 0.0295 

Table2: 
    ID_1 weight1 weight2 .....weightN 
0 PUFPaY9  
1 Iu6AxdB  
2 auESFwW 
3 LWbYpca  

Table3: 
    ID_2 weight1 weight2 .....weightN 
0 PUFPaY9  
1 Iu6AxdB  
2 auESFwW  
3 LWbYpca

I want to compute a DataFrame as follows:

for each x ID_1 in list1: 
    for each y ID_2 in list2: 
     if x-y exist in Table1: 
      temp_row = (x[weights[i]].* y[weights[i]]) 
      # here i want one to one multiplication, x[weight1]*y[weight1] , x[weight2]*y[weight2] 
      temp_row.append(value[x-y] in Table1) 
      new_dataframe.append(temp_row) 

return new_dataframe
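
For reference, the pseudocode above could be written as runnable Python roughly like this. It is only a sketch: the sample weights, the names table1/table2/table3 and the helper multiply_weights_loop are made up for illustration, and it assumes every listed ID has a row of weights.

```python
import pandas as pd

# Hypothetical sample data; the weight values are invented for illustration.
table1 = pd.DataFrame({
    "ID_1": ["PUFPaY9", "Iu6AxdB", "auESFwW"],
    "ID_2": ["NdYWqAJ", "qANhGcw", "jUEUNdw"],
    "Value": [0.002, 0.01, 0.2345],
})
table2 = pd.DataFrame(
    {"weight1": [1, 2, 3], "weight2": [4, 5, 6]},
    index=pd.Index(["PUFPaY9", "Iu6AxdB", "auESFwW"], name="ID_1"),
)
table3 = pd.DataFrame(
    {"weight1": [10, 20, 30], "weight2": [40, 50, 60]},
    index=pd.Index(["NdYWqAJ", "qANhGcw", "jUEUNdw"], name="ID_2"),
)
list1 = ["PUFPaY9", "auESFwW"]
list2 = ["NdYWqAJ", "jUEUNdw", "qANhGcw"]


def multiply_weights_loop(table1, table2, table3, list1, list2):
    """Loop-based reference version (hypothetical helper name)."""
    # Value looked up by the (ID_1, ID_2) pair.
    pairs = table1.set_index(["ID_1", "ID_2"])["Value"]
    rows = []
    for x in list1:
        for y in list2:
            if (x, y) in pairs.index:
                # One-to-one product: weight1*weight1, weight2*weight2, ...
                row = (table2.loc[x] * table3.loc[y]).to_dict()
                row["value"] = pairs.loc[(x, y)]
                rows.append(row)
    return pd.DataFrame(rows)


new_dataframe = multiply_weights_loop(table1, table2, table3, list1, list2)
print(new_dataframe)
```

This is the slow baseline that a vectorized solution would have to reproduce.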

The desired new_dataframe should look like Table4:

Table4: 
     weight1 weight2 weight3 .....weightN value 
    0   
    1   
    2  
    3

What I can do at the moment is:

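new_df = df[(df.ID_1.isin(list1)) & (df.ID_2.isin(list2))]

With this I get all valid ID_1 and ID_2 combinations together with their values. But I don't know how to get the product of the weights from the two DataFrames (for each weight[i], without a loop).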
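
As a small self-contained illustration of that filter, here is a sketch using the rows of Table1; the contents of list1 and list2 are hypothetical.

```python
import pandas as pd

# Table1 from the question.
df = pd.DataFrame({
    "ID_1": ["PUFPaY9", "Iu6AxdB", "auESFwW", "LWbYpca", "8fApIAM"],
    "ID_2": ["NdYWqAJ", "qANhGcw", "jUEUNdw", "G3uZ_Rg", "mVHrayg"],
    "Value": [0.002, 0.01, 0.2345, 0.0835, 0.0295],
})

# Hypothetical ID lists, chosen only for this example.
list1 = ["PUFPaY9", "auESFwW", "LWbYpca"]
list2 = ["NdYWqAJ", "G3uZ_Rg", "mVHrayg"]

# Keep a row only if its ID_1 is in list1 AND its ID_2 is in list2.
new_df = df[df["ID_1"].isin(list1) & df["ID_2"].isin(list2)]
print(new_df)
```

Only rows where both IDs are listed survive, so a row whose ID_1 is in list1 but whose ID_2 is missing from list2 is dropped.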

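From here the task is fairly easy: I can iterate over new_df and, for each row in new_df, look up weight[i to n] for ID_1 in Table2 and weight[i to n] for ID_2 in Table3. Then I can append the one-to-one multiplication and the "value" from Table1 to a new FINAL_DF. But I don't want to loop. Can we solve this in a smarter way?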

Asked 2016-04-09 by impossible

The question has been updated. I am not sure whether there is an option that avoids loops. – impossible

Please check my answer – MaxU

Is this what you want?

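import io

import numpy as np
import pandas as pd

# Sample ID_1 keys; 'aaaaaaa' has no counterpart in id2.
data = """\
ID_1
PUFPaY9
aaaaaaa
Iu6AxdB
auESFwW
LWbYpca
"""
# sep=r'\s+' splits on whitespace (delim_whitespace=True is deprecated in recent pandas).
id1 = pd.read_csv(io.StringIO(data), sep=r'\s+')

# Sample ID_2 keys; 'xxxxxxx' has no counterpart in id1.
data = """\
ID_2
PUFPaY9
Iu6AxdB
xxxxxxx
auESFwW
LWbYpca
"""
id2 = pd.read_csv(io.StringIO(data), sep=r'\s+')

# Fill weight1..weight4 with random integers from 1 to 9.
cols = ['weight{}'.format(i) for i in range(1, 5)]
for c in cols:
    id1[c] = np.random.randint(1, 10, len(id1))
    id2[c] = np.random.randint(1, 10, len(id2))

# Use the IDs as the index so the multiplication aligns rows by ID.
id1.set_index('ID_1', inplace=True)
id2.set_index('ID_2', inplace=True)

# Element-wise product; IDs present in only one frame yield NaN rows.
df_mul = id1 * id2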

Step by step:

In [215]: id1 
Out[215]: 
     weight1 weight2 weight3 weight4 
ID_1 
PUFPaY9  8  9  1  1 
aaaaaaa  6  1  9  2 
Iu6AxdB  8  4  8  5 
auESFwW  9  3  4  2 
LWbYpca  7  7  1  8 

In [216]: id2 
Out[216]: 
     weight1 weight2 weight3 weight4 
ID_2 
PUFPaY9  6  5  5  1 
Iu6AxdB  1  5  4  5 
xxxxxxx  1  2  6  4 
auESFwW  3  9  5  5 
LWbYpca  3  3  6  7 

In [217]: id1 * id2 
Out[217]: 
     weight1 weight2 weight3 weight4 
Iu6AxdB  8.0  20.0  32.0  25.0 
LWbYpca  21.0  21.0  6.0  56.0 
PUFPaY9  48.0  45.0  5.0  1.0 
aaaaaaa  NaN  NaN  NaN  NaN 
auESFwW  27.0  27.0  20.0  10.0 
xxxxxxx  NaN  NaN  NaN  NaN
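
The product above pairs rows that share the same index label. If the pairs should instead come from Table1, where ID_1 and ID_2 are different keys, one possible loop-free variant is to merge the weights onto the filtered pairs and multiply the matching columns. This is only a sketch and not part of the answer's original code: the sample weights and the names table1, table2, table3, weight_cols and table4 are assumptions.

```python
import pandas as pd

# Hypothetical sample data; the weight values are invented for illustration.
table1 = pd.DataFrame({
    "ID_1": ["PUFPaY9", "Iu6AxdB", "auESFwW"],
    "ID_2": ["NdYWqAJ", "qANhGcw", "jUEUNdw"],
    "Value": [0.002, 0.01, 0.2345],
})
table2 = pd.DataFrame({
    "ID_1": ["PUFPaY9", "Iu6AxdB", "auESFwW"],
    "weight1": [1, 2, 3],
    "weight2": [4, 5, 6],
})
table3 = pd.DataFrame({
    "ID_2": ["NdYWqAJ", "qANhGcw", "jUEUNdw"],
    "weight1": [10, 20, 30],
    "weight2": [40, 50, 60],
})
list1 = ["PUFPaY9", "auESFwW"]
list2 = ["NdYWqAJ", "jUEUNdw", "qANhGcw"]
weight_cols = ["weight1", "weight2"]

# 1. Keep only the (ID_1, ID_2) pairs from Table1 that are in both lists.
pairs = table1[table1["ID_1"].isin(list1) & table1["ID_2"].isin(list2)]

# 2. Attach the ID_1 weights, then the ID_2 weights. The second merge has
#    clashing column names, so they get the suffixes _1 and _2.
merged = (
    pairs
    .merge(table2, on="ID_1")
    .merge(table3, on="ID_2", suffixes=("_1", "_2"))
)

# 3. Multiply weight columns position by position as NumPy arrays.
left = merged[[c + "_1" for c in weight_cols]].to_numpy()
right = merged[[c + "_2" for c in weight_cols]].to_numpy()
table4 = pd.DataFrame(left * right, columns=weight_cols, index=merged.index)
table4["value"] = merged["Value"]
print(table4)
```

Both merges default to inner joins, so a pair whose ID has no weights in Table2 or Table3 is silently dropped; pass how="left" if such pairs should be kept as NaN rows instead.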

Answered 2016-04-09 12:37:05 by MaxU
