2017-05-19 20 views
4

How do I concatenate Pandas DataFrames by both columns and index?

I have four pandas DataFrames with numeric column labels and index values:

import pandas as pd

A = pd.DataFrame(data={"435000": [9.792, 9.795], "435002": [9.825, 9.812]}, index=[119000, 119002])
B = pd.DataFrame(data={"435004": [9.805, 9.783], "435006": [9.785, 9.78]}, index=[119000, 119002])
C = pd.DataFrame(data={"435000": [9.778, 9.743], "435002": [9.75, 9.743]}, index=[119004, 119006])
D = pd.DataFrame(data={"435004": [9.743, 9.743], "435006": [9.762, 9.738]}, index=[119004, 119006])


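I want to concatenate them into a single DataFrame in which both the column names and the index labels are matched up, i.e. one 4x4 frame with index 119000 to 119006 and columns 435000 to 435006, with no missing values.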

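If I call pd.concat on the four DataFrames, they simply get stacked (on top of each other or side by side, depending on axis), and I end up with NaN values in the result: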

result = pd.concat([A, B, C, D], axis=0) 

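To make the problem concrete, here is a minimal sketch that reproduces it with the four frames from the question. The names `stacked` and `side_by_side` are only illustrative. A single concat aligns labels along one axis only, so each block ends up padded with NaN along the other.

```python
import pandas as pd

A = pd.DataFrame(data={"435000": [9.792, 9.795], "435002": [9.825, 9.812]}, index=[119000, 119002])
B = pd.DataFrame(data={"435004": [9.805, 9.783], "435006": [9.785, 9.78]}, index=[119000, 119002])
C = pd.DataFrame(data={"435000": [9.778, 9.743], "435002": [9.75, 9.743]}, index=[119004, 119006])
D = pd.DataFrame(data={"435004": [9.743, 9.743], "435006": [9.762, 9.738]}, index=[119004, 119006])

# axis=0: the frames are appended row-wise, so every index label shows up twice
stacked = pd.concat([A, B, C, D], axis=0)

# axis=1: the frames are appended column-wise, so every column name shows up twice
side_by_side = pd.concat([A, B, C, D], axis=1)

print(stacked.shape, int(stacked.isna().sum().sum()))
print(side_by_side.shape, int(side_by_side.isna().sum().sum()))
```

Either way, half of the cells in the result are NaN, because each source frame only covers a quarter of the combined row/column grid.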

How can I use pd.concat (or merge, join, etc.) to get the correct result?

Answers

3

You need to concat in pairs:

result = pd.concat([pd.concat([A, C], axis=0), pd.concat([B, D], axis=0)], axis=1) 
print (result) 
     435000 435002 435004 435006 
119000 9.792 9.825 9.805 9.785 
119002 9.795 9.812 9.783 9.780 
119004 9.778 9.750 9.743 9.762 
119006 9.743 9.743 9.743 9.738 

A better option is stack + concat + unstack:

result = pd.concat([A.stack(), B.stack(), C.stack(), D.stack()], axis=0).unstack() 
print (result) 
     435000 435002 435004 435006 
119000 9.792 9.825 9.805 9.785 
119002 9.795 9.812 9.783 9.780 
119004 9.778 9.750 9.743 9.762 
119006 9.743 9.743 9.743 9.738 

A more dynamic version:

dfs = [A,B,C,D] 
result = pd.concat([df.stack() for df in dfs], axis=0).unstack() 
print (result) 
     435000 435002 435004 435006 
119000 9.792 9.825 9.805 9.785 
119002 9.795 9.812 9.783 9.780 
119004 9.778 9.750 9.743 9.762 
119006 9.743 9.743 9.743 9.738 
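For anyone wondering why this works, the sketch below splits the one-liner into its steps. The intermediate name `long` is only illustrative. stack turns each frame into a Series keyed by (index label, column name), concat appends those Series into one long Series, and unstack pivots the column names back out into columns.

```python
import pandas as pd

A = pd.DataFrame(data={"435000": [9.792, 9.795], "435002": [9.825, 9.812]}, index=[119000, 119002])
B = pd.DataFrame(data={"435004": [9.805, 9.783], "435006": [9.785, 9.78]}, index=[119000, 119002])
C = pd.DataFrame(data={"435000": [9.778, 9.743], "435002": [9.75, 9.743]}, index=[119004, 119006])
D = pd.DataFrame(data={"435004": [9.743, 9.743], "435006": [9.762, 9.738]}, index=[119004, 119006])

dfs = [A, B, C, D]

# Each stacked frame is a Series with a two-level index: (row label, column name)
long = pd.concat([df.stack() for df in dfs], axis=0)
print(long.head())

# unstack moves the inner level (the column names) back into columns
result = long.unstack()
print(result)
```

This relies on every (index label, column name) pair occurring in only one of the input frames, which holds for the data in the question. If two frames covered the same cell, unstack would not be able to reshape the duplicated entries.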
+0

Thank you very much, the dynamic version is perfect. – user2950747

+0

Glad I could help, have a nice day! By the way, very nice question, colorful ;) – jezrael

1

You can also use join:

pd.concat((A.join(B), C.join(D))) 
Out: 
     435000 435002 435004 435006 
119000 9.792 9.825 9.805 9.785 
119002 9.795 9.812 9.783 9.780 
119004 9.778 9.750 9.743 9.762 
119006 9.743 9.743 9.743 9.738 
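If there are more than four blocks, the same idea can be written as a loop. This is only a sketch, and it assumes the frames can be arranged by hand into a list of rows (the hypothetical `grid` below), where the frames in each row share an index. It uses pd.concat with axis=1 in place of join to glue each row together, then stacks the rows.

```python
import pandas as pd

A = pd.DataFrame(data={"435000": [9.792, 9.795], "435002": [9.825, 9.812]}, index=[119000, 119002])
B = pd.DataFrame(data={"435004": [9.805, 9.783], "435006": [9.785, 9.78]}, index=[119000, 119002])
C = pd.DataFrame(data={"435000": [9.778, 9.743], "435002": [9.75, 9.743]}, index=[119004, 119006])
D = pd.DataFrame(data={"435004": [9.743, 9.743], "435006": [9.762, 9.738]}, index=[119004, 119006])

# Hypothetical layout: each inner list holds the frames that share an index
grid = [[A, B], [C, D]]

# Glue each row of blocks side by side, then stack the rows on top of each other
rows = [pd.concat(row, axis=1) for row in grid]
result = pd.concat(rows, axis=0)

# Same result as the join-based version above
joined = pd.concat((A.join(B), C.join(D)))
print(result.equals(joined))
```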