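I have 3 pandas DataFrames (similar to the ones shown below). I also have 2 lists: list ID_1 = ['sdf', 'sdfsdf', ...]
and list ID_2 = ['kjdf', 'kldfjs', ...]
How do I select multiple columns from different pandas DataFrames?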
Table1:
ID_1 ID_2 Value
0 PUFPaY9 NdYWqAJ 0.002
1 Iu6AxdB qANhGcw 0.01
2 auESFwW jUEUNdw 0.2345
3 LWbYpca G3uZ_Rg 0.0835
4 8fApIAM mVHrayg 0.0295
Table2:
ID_1 weight1 weight2 .....weightN
0 PUFPaY9
1 Iu6AxdB
2 auESFwW
3 LWbYpca
Table3:
ID_2 weight1 weight2 .....weightN
0 PUFPaY9
1 Iu6AxdB
2 auESFwW
3 LWbYpca
I want to compute a new DataFrame along these lines:
for each x ID_1 in list1:
    for each y ID_2 in list2:
        if x-y exists in Table1:
            temp_row = (x[weights[i]] .* y[weights[i]])
            # here I want one-to-one multiplication: x[weight1]*y[weight1], x[weight2]*y[weight2]
            temp_row.append(value[x-y] in Table1)
            new_dataframe.append(temp_row)
return new_dataframe
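For reference, the pseudocode above could be written as a runnable (but slow) loop roughly like this. It is only a sketch: the weight values are made-up sample data, and the names table1, table2, table3, pair_values and loop_result are placeholders, not anything from my real code. It assumes each ID appears only once in Table2 and Table3.

```python
import pandas as pd

# Hypothetical sample data; the weight numbers are invented for illustration.
table1 = pd.DataFrame({
    "ID_1": ["PUFPaY9", "Iu6AxdB"],
    "ID_2": ["NdYWqAJ", "qANhGcw"],
    "Value": [0.002, 0.01],
})
table2 = pd.DataFrame({
    "ID_1": ["PUFPaY9", "Iu6AxdB"],
    "weight1": [1.0, 2.0],
    "weight2": [3.0, 4.0],
}).set_index("ID_1")
table3 = pd.DataFrame({
    "ID_2": ["NdYWqAJ", "qANhGcw"],
    "weight1": [10.0, 20.0],
    "weight2": [30.0, 40.0],
}).set_index("ID_2")

list1 = ["PUFPaY9", "Iu6AxdB"]
list2 = ["NdYWqAJ", "qANhGcw"]

# Series of Value indexed by the (ID_1, ID_2) pair, for quick lookups.
pair_values = table1.set_index(["ID_1", "ID_2"])["Value"]

rows = []
for x in list1:
    for y in list2:
        if (x, y) in pair_values.index:
            # Element-wise product; the two Series align on the weight column names.
            row = table2.loc[x] * table3.loc[y]
            row["value"] = pair_values.loc[(x, y)]
            rows.append(row)

loop_result = pd.DataFrame(rows).reset_index(drop=True)
print(loop_result)
```

This produces one row per valid pair, with the weight columns first and value last, which is the shape I am after. The nested loop is exactly what I would like to avoid.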
The desired new_dataframe should look like Table4:
Table4:
weight1 weight2 weight3 .....weightN value
0
1
2
3
What I am able to do right now is:
new_df = df[(df.ID_1.isin(list1)) & (df.ID_2.isin(list2))]
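With this I get all the valid ID_1 and ID_2 combinations together with their values. But I don't know how to get the product of the weights from the two DataFrames (for each weight[i], without a loop).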
The task is now somewhat easier: I can iterate over new_df, and for each row in new_df look up weight[1 to n] for ID_1 from Table2 and weight[1 to n] for ID_2 from Table3. Then I can append the one-to-one multiplication and the "value" from Table1 to a new FINAL_DF. But I don't want to loop. Can we solve this in a smarter way?
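One loop-free direction that might work is to merge the weights onto the filtered pairs and multiply the aligned blocks. The following is a minimal sketch under assumptions, not a confirmed solution: the weight values are invented sample data, the names table1, table2, table3, weight_cols, pairs and final_df are placeholders, every ID is assumed to appear only once in Table2 and Table3, and both weight tables are assumed to share the same weight column names.

```python
import pandas as pd

# Hypothetical sample data; the weight numbers are invented for illustration.
table1 = pd.DataFrame({
    "ID_1": ["PUFPaY9", "Iu6AxdB", "auESFwW"],
    "ID_2": ["NdYWqAJ", "qANhGcw", "jUEUNdw"],
    "Value": [0.002, 0.01, 0.2345],
})
table2 = pd.DataFrame({
    "ID_1": ["PUFPaY9", "Iu6AxdB", "auESFwW"],
    "weight1": [1.0, 2.0, 5.0],
    "weight2": [3.0, 4.0, 6.0],
})
table3 = pd.DataFrame({
    "ID_2": ["NdYWqAJ", "qANhGcw", "jUEUNdw"],
    "weight1": [10.0, 20.0, 50.0],
    "weight2": [30.0, 40.0, 60.0],
})

list1 = ["PUFPaY9", "Iu6AxdB"]   # auESFwW is deliberately left out
list2 = ["NdYWqAJ", "qANhGcw"]

# Every column of Table2 except the key is treated as a weight column.
weight_cols = [c for c in table2.columns if c != "ID_1"]

# Keep only the rows of Table1 whose IDs are in both lists.
pairs = table1[table1["ID_1"].isin(list1) & table1["ID_2"].isin(list2)]

# Left merges keep the row order of `pairs`, so w1 and w2 line up row by row
# (this relies on the IDs being unique in Table2 and Table3).
w1 = pairs[["ID_1"]].merge(table2, on="ID_1", how="left")[weight_cols]
w2 = pairs[["ID_2"]].merge(table3, on="ID_2", how="left")[weight_cols]

# Element-wise product: weight1*weight1, weight2*weight2, and so on.
final_df = w1 * w2
final_df["value"] = pairs["Value"].to_numpy()
print(final_df)
```

A pair whose ID is missing from Table2 or Table3 would end up with NaN weights here rather than being dropped, so a dropna() might be needed depending on the data.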
The question has been updated. I'm not sure whether there is an option that avoids loops. – impossible
Please check my answer – MaxU