Python pandas: merge or concat dataframes

I have a series of CSV files that I load into dataframes and store in a list (dataframesArray). The list and the dataframes look like this:

dataframesArray [    
    BBG.XAMS.UL.S_pnl_pos_cost 
     date         
     2015-03-23     0.000000 
     2015-03-24     0.000000 
     2015-03-25     -0.674717 
     2015-03-26     69.140999 
     2015-03-27     -70.128728,    
    BBG.XAMS.UNA.S_pnl_pos_cost 
     date         
     2015-03-23     -0.674929 
     2015-03-24     -15.138444 
     2015-03-25     90.830662 
     2015-03-26     21.446129 
     2015-03-27     -2.554376,    
    BBG.XAMS.UL.S_pnl_pos_cost 
     date         
     2014-10-20     -15.220730 
     2014-10-21     3031.610010 
     2014-10-22     1976.815412 
     2014-10-23    -2974.037294 
     2014-10-24     796.775000, 
    BBG.XAMS.UNA.S_pnl_pos_cost 
     date         
     2014-10-20     -4.140378 
     2014-10-21     618.064066 
     2014-10-22     -71.104800 
     2014-10-23     828.063647 
     2014-10-24      0.000000]

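The data covers two products (BBG.XAMS.UL.S_pnl_pos_cost and BBG.XAMS.UNA.S_pnl_pos_cost) by date, and there will be more products in the future. I want to concat or merge (I'm not sure which) the list of dataframes into a single dataframe (called result) so that it looks like this: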

            BBG.XAMS.UL.S_pnl_pos_cost  BBG.XAMS.UNA.S_pnl_pos_cost
date
2014-10-20                  -15.220730                    -4.140378
2014-10-21                 3031.610010                   618.064066
2014-10-22                 1976.815412                   -71.104800
2014-10-23                -2974.037294                   828.063647
2014-10-24                  796.775000                     0.000000
2015-03-23                    0.000000                    -0.674929
2015-03-24                    0.000000                   -15.138444
2015-03-25                   -0.674717                    90.830662
2015-03-26                   69.140999                    21.446129
2015-03-27                  -70.128728                    -2.554376

I tried to do this with the following:

result = pd.concat(dataframesArray,axis=1)

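where the axis is the date. The data appears to be merged by date, but I lose the data for the week starting 2015-03-23. My current concat result dataframe looks like this: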

BBG.XAMS.UL.S_pnl_pos_cost BBG.XAMS.UNA.S_pnl_pos_cost 
date                 
2014-10-20     -15.220730     -4.140378 
2014-10-21     3031.610010     618.064066 
2014-10-22     1976.815412     -71.104800 
2014-10-23    -2974.037294     828.063647 
2014-10-24     796.775000      0.000000 
2015-03-23       NaN       NaN 
2015-03-24       NaN       NaN 
2015-03-25       NaN       NaN 
2015-03-26       NaN       NaN 
2015-03-27       NaN       NaN

My current code is:

stockPricesDf = pd.read_csv(f, engine='c', header=0, index_col=0, parse_dates=True, infer_datetime_format=True, usecols=(0, 3))
stockPricesDf.rename(columns={'adjusted_last_acc': row}, inplace=True)
dataframesArray.append(stockPricesDf)
result = pd.concat(dataframesArray, axis=1)

I loop through several directories to get the product data stored in the CSV files.
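For illustration, here is a minimal sketch of one way such a loading loop could avoid the problem: collect the pieces per product, stack each product's pieces along the rows, and only then place the products side by side. The in-memory CSV text, the short product names and the extra columns (open, close) are hypothetical stand-ins; only the column name adjusted_last_acc comes from the code above.

```python
import io
from collections import defaultdict

import pandas as pd

# Hypothetical stand-ins for the CSV files on disk: (product name, file contents).
# Only the column name adjusted_last_acc is taken from the question.
csv_files = [
    ("UL", "date,open,close,adjusted_last_acc\n2015-03-23,1,1,0.5\n2015-03-24,1,1,-1.5\n"),
    ("UNA", "date,open,close,adjusted_last_acc\n2015-03-23,1,1,2.0\n2015-03-24,1,1,3.0\n"),
    ("UL", "date,open,close,adjusted_last_acc\n2014-10-20,1,1,4.0\n2014-10-21,1,1,5.0\n"),
    ("UNA", "date,open,close,adjusted_last_acc\n2014-10-20,1,1,6.0\n2014-10-21,1,1,7.0\n"),
]

# Group the loaded pieces by product instead of appending them to one flat list.
pieces = defaultdict(list)
for product, text in csv_files:
    frame = pd.read_csv(
        io.StringIO(text),
        index_col="date",
        parse_dates=True,
        usecols=["date", "adjusted_last_acc"],
    )
    pieces[product].append(frame["adjusted_last_acc"].rename(product))

# Stack each product's pieces along the rows, then align the products by date.
per_product = [pd.concat(chunks).sort_index() for chunks in pieces.values()]
result = pd.concat(per_product, axis=1)
print(result)
```

With this layout each product contributes exactly one column, so the final concat along axis=1 has no duplicate column names to reconcile.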

Could someone please tell me what I am doing wrong and how to fix it?

Many thanks

Source

2015-08-31 Stacey

Try using axis=0. If each dataframe has the same column names, this should concatenate them column by column. – Maximus
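As a rough sketch of how this suggestion plays out on data shaped like the question's: stacking with axis=0 aligns the frames by column name, but each date then sits on two rows (one per product), so a group-by on the date index is still needed to collapse them. The helper make_frame and the shortened column names UL and UNA are assumptions made for the example; the values are a subset of those in the question.

```python
import pandas as pd


def make_frame(column, dates, values):
    """Build a one-column frame indexed by date, like the ones in the question."""
    return pd.DataFrame({column: values}, index=pd.DatetimeIndex(dates, name="date"))


frames = [
    make_frame("UL", ["2015-03-23", "2015-03-24"], [0.0, 0.0]),
    make_frame("UNA", ["2015-03-23", "2015-03-24"], [-0.674929, -15.138444]),
    make_frame("UL", ["2014-10-20", "2014-10-21"], [-15.220730, 3031.610010]),
    make_frame("UNA", ["2014-10-20", "2014-10-21"], [-4.140378, 618.064066]),
]

# axis=0 stacks the rows; columns are matched by name, missing cells become NaN.
stacked = pd.concat(frames, axis=0)

# Every date now appears on two rows, so collapse them by date.
# min_count=1 keeps NaN where a product has no value at all for a date.
result = stacked.groupby(level=0).sum(min_count=1)
print(result)
```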

Possible duplicate of [Pandas join/merge/concat two dataframes](http://stackoverflow.com/questions/11637384/pandas-join-merge-concat-two-dataframes)

Try this:

result = pd.concat(dataframesArray, axis=1) # like you did 
result = result.groupby(result.columns, axis=1).sum()

As you can see, the first step produces this (with illustrative numbers):

    UL  UNA  UL  UNA 
2015-03-23 2.169534 0.294107  NaN  NaN 
2015-03-24 -0.077550 -0.758760  NaN  NaN 
2015-03-25 0.159659 -3.167541  NaN  NaN 
2015-03-26 0.895535 0.944644  NaN  NaN 
2015-03-27 -0.385408 -0.005069  NaN  NaN 
2015-10-20  NaN  NaN 1.855446 -0.229635 
2015-10-21  NaN  NaN -0.400450 -0.237323 
2015-10-22  NaN  NaN 1.103165 0.718134 
2015-10-23  NaN  NaN -0.157415 1.119828 
2015-10-24  NaN  NaN -0.016321 -0.371061

The second step groups the columns that share a name into a single column:

    UL  UNA 
2015-03-23 2.169534 0.294107 
2015-03-24 -0.077550 -0.758760 
2015-03-25 0.159659 -3.167541 
2015-03-26 0.895535 0.944644 
2015-03-27 -0.385408 -0.005069 
2015-10-20 1.855446 -0.229635 
2015-10-21 -0.400450 -0.237323 
2015-10-22 1.103165 0.718134 
2015-10-23 -0.157415 1.119828 
2015-10-24 -0.016321 -0.371061
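The two steps can be reproduced end to end with a small, self-contained sketch. It assumes shortened column names (UL, UNA), a hypothetical helper make_frame, and a subset of the question's values. Because recent pandas versions deprecate the axis=1 argument of groupby, this variant transposes, groups on the row labels, and transposes back, which should give the same grouping by column name.

```python
import pandas as pd


def make_frame(column, dates, values):
    """Build a one-column frame indexed by date, like the ones in the question."""
    return pd.DataFrame({column: values}, index=pd.DatetimeIndex(dates, name="date"))


frames = [
    make_frame("UL", ["2015-03-25", "2015-03-26"], [-0.674717, 69.140999]),
    make_frame("UNA", ["2015-03-25", "2015-03-26"], [90.830662, 21.446129]),
    make_frame("UL", ["2014-10-23", "2014-10-24"], [-2974.037294, 796.775000]),
    make_frame("UNA", ["2014-10-23", "2014-10-24"], [828.063647, 0.0]),
]

# Step 1: side-by-side concat. The column names repeat (UL, UNA, UL, UNA)
# and each column is NaN on the dates its source frame does not cover.
wide = pd.concat(frames, axis=1)

# Step 2: merge the columns that share a name. Transposing turns the column
# names into row labels, so a plain groupby on level 0 can sum them.
result = wide.T.groupby(level=0).sum(min_count=1).T.sort_index()
print(result)
```

Summing works here because, for any given date, only one of the same-named columns holds a value; the NaN entries are skipped.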

Source

2015-09-01 14:41:59 IanS

Thanks Ian, that worked – Stacey
