2012-12-18 59 views
4

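Aligning DataFrames with the same columns but different index levels

I have two pandas DataFrames. weight has a simple index on the Land Use column. concentration has a MultiIndex on Land Use and Parameter.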

import pandas 
from io import StringIO 

conc_string = StringIO("""\ 
Land Use,Parameter,1E,1N,1S,2 
Airfield,BOD5 (mg/l),0.418,0.118,0.226,1.063 
Airfield,Ortho P (mg/l),0.002,0.001,0.001,0.002 
Airfield,TSS (mg/l),1.773,11.47,0.862,0.183 
Airfield,Zn (mg/l),0.001,0.001,4.95E-05,0.001 
"Commercial",BOD5 (mg/l),0.036,0.0419,,0.315 
"Commercial",Cu (mg/l),4.37E-05,7.34E-05,,0.00039 
"Commercial",O&G (mg/l),0.0385,0.127,,0.263 
Open Space,TSS (mg/l),0.371,3.01,1.209,0.147 
Open Space,Zn (mg/l),0.0127,0.0069,0.0132,0.007 
"Parking Lot",BOD5 (mg/l),0.924,0.0668,2.603,3.19 
"Parking Lot",O&G (mg/l),1.02,0.149,1.347,1.88 
"Rooftops",BOD5 (mg/l),0.135,1.00,0.0562,0.310""") 

weight_string = StringIO("""\ 
Land Use,1E,1N,1S,2 
Airfield,0.511,0.0227,0.0616,0.394 
Commercial,0.0005,0.1704,0,0.1065 
Open Space,0.0008,0.005,0.0002,0.0004 
"Parking Lot",0.33,0.514,0.252,0.171 
Rooftops,0.081,0.028,8.50E-05,0.003""") 

concentration = pandas.read_csv(conc_string, index_col=[0,1]) 
weight = pandas.read_csv(weight_string, index_col=0) 

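In this case, the columns (1E, 1N, 1S, and 2) are drainage basins.

What I want to do is divide every concentration, regardless of Parameter, by the weight for the matching basin (column name) and Land Use.

I haven't had much luck so far. concentration / weight of course does not work. I also haven't had much luck stacking the DataFrames and joining the two: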

wstk = pandas.DataFrame(weight.stack()) 
wstk.index.names = ['Land Use', 'Basin'] 
wstk.rename(columns={0:'weight'}, inplace=True) 

cstk = pandas.DataFrame(concentration.stack()) 
cstk.index.names = ['Land Use', 'Parameter', 'Basin'] 
cstk.rename(columns={0:'concentration'}, inplace=True) 

wstk.join(cstk, on=['Land Use', 'Basin']) # fails 
cstk.join(wstk, on=['Land Use', 'Basin']) # fails 

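When I leave off the on kwarg, the last two lines don't raise an error, but the joined column comes back as all NaN. The joins also fail if I drop the index on both stacked DataFrames first (e.g., by calling wstk.reset_index(inplace=True) before joining).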
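For the stacked route, DataFrame.join matches the caller's on columns against the other frame's index, so once both indexes have been reset a column-to-column merge is probably the closer fit. Below is a minimal sketch of that idea on a trimmed-down copy of the data. It swaps stack for melt, and the names c_long, w_long, merged, ratio, and result are illustrative, not from the original post.

```python
import pandas as pd
from io import StringIO

# Reduced copies of the tables above, read WITHOUT index_col so the
# key fields stay available as ordinary columns.
conc = pd.read_csv(StringIO("""\
Land Use,Parameter,1E,1N
Airfield,BOD5 (mg/l),0.418,0.118
Airfield,TSS (mg/l),1.773,11.47
Rooftops,BOD5 (mg/l),0.135,1.00"""))
wt = pd.read_csv(StringIO("""\
Land Use,1E,1N
Airfield,0.511,0.0227
Rooftops,0.081,0.028"""))

# Long format: one row per (Land Use, [Parameter,] Basin).
c_long = conc.melt(id_vars=["Land Use", "Parameter"],
                   var_name="Basin", value_name="concentration")
w_long = wt.melt(id_vars=["Land Use"],
                 var_name="Basin", value_name="weight")

# Column-to-column match on the two shared keys.
merged = c_long.merge(w_long, on=["Land Use", "Basin"], how="left")
merged["ratio"] = merged["concentration"] / merged["weight"]

# Back to the wide layout: basins as columns again.
result = (merged.set_index(["Land Use", "Parameter", "Basin"])["ratio"]
                .unstack("Basin"))
print(result)
```

This works, but it is a lot of reshaping for what is really a broadcast along one index level.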

Any suggestions?

Answer

6

Use the DataFrame div method and pass the level you want to broadcast across on the MultiIndex:

From the documentation for div:

level : int or name 
    Broadcast across a level, matching Index values on the 
    passed MultiIndex level 

In [39]: concentration.div(weight, level='Land Use')
Out[39]:
                                    1E          1N           1S           2
Land Use    Parameter
Airfield    BOD5 (mg/l)       0.818004    5.198238     3.668831    2.697970
            Ortho P (mg/l)    0.003914    0.044053     0.016234    0.005076
            TSS (mg/l)        3.469667  505.286344    13.993506    0.464467
            Zn (mg/l)         0.001957    0.044053     0.000804    0.002538
Commercial  BOD5 (mg/l)      72.000000    0.245892          NaN    2.957746
            Cu (mg/l)         0.087400    0.000431          NaN    0.003662
            O&G (mg/l)       77.000000    0.745305          NaN    2.469484
Open Space  TSS (mg/l)      463.750000  602.000000  6045.000000  367.500000
            Zn (mg/l)        15.875000    1.380000    66.000000   17.500000
Parking Lot BOD5 (mg/l)       2.800000    0.129961    10.329365   18.654971
            O&G (mg/l)        3.090909    0.289883     5.345238   10.994152
Rooftops    BOD5 (mg/l)       1.666667   35.714286   661.176471  103.333333
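To see what the level argument is doing, here is a small sketch on a reduced copy of the data. It compares div(..., level='Land Use') with a hand-rolled version that repeats each weight row once per Parameter. The names aligned, manual, and by_level are illustrative only, and the manual route is just one way to spell out the broadcast, not how pandas implements it internally.

```python
import pandas as pd
from io import StringIO

concentration = pd.read_csv(StringIO("""\
Land Use,Parameter,1E,1S
Airfield,BOD5 (mg/l),0.418,0.226
Airfield,TSS (mg/l),1.773,0.862
Commercial,BOD5 (mg/l),0.036,"""), index_col=[0, 1])
weight = pd.read_csv(StringIO("""\
Land Use,1E,1S
Airfield,0.511,0.0616
Commercial,0.0005,0"""), index_col=0)

# Broadcast weight across the 'Land Use' level of the MultiIndex.
by_level = concentration.div(weight, level="Land Use")

# Manual equivalent: look up the weight row for each concentration row,
# then give it the same MultiIndex so plain division lines up.
aligned = weight.loc[concentration.index.get_level_values("Land Use")]
aligned.index = concentration.index
manual = concentration / aligned

print(by_level)
```

The blank Commercial/1S concentration is read as NaN and stays NaN after the division, which matches the NaN entries in the output above.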

Thanks for the help.
