2015-11-17 280 views

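Adding a new set of columns to a pandas DataFrame with MultiIndex columns

The following seems like it should work, but it doesn't: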

import pandas as pd 
import numpy as np 

df = pd.DataFrame() 
for l1 in ('a', 'b'): 
    for l2 in ('one', 'two'): 
        df[l1, l2] = np.random.random(size=5)
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['L1', 'L2']) 

df['difference'] = df['b']-df['a'] 

I get the following error:

ValueError: Wrong number of items passed 2, placement implies 1 

I can work around it by doing:

difference = df['b']-df['a'] 
df['difference', 'one'] = difference['one'] 
df['difference', 'two'] = difference['two'] 

But this seems inefficient. Is there a better way to do it?

Answer

You can do this in a single assignment:

In [11]: df[[("difference", "one"), ("difference", "two")]] = df['b'] - df['a'] 

In [12]: df 
Out[12]: 
L1         a                   b           difference
L2       one       two       one       two        one       two
0   0.585409  0.563870  0.535770  0.868020  -0.049639  0.304150
1   0.404546  0.102884  0.254945  0.362751  -0.149601  0.259868
2   0.475362  0.601632  0.476761  0.665126   0.001400  0.063495
3   0.926288  0.615655  0.257977  0.668778  -0.668311  0.053123
4   0.509069  0.706685  0.355842  0.891862  -0.153227  0.185177

More generally, you can build the column keys as a MultiIndex, e.g. with from_product:

In [21]: m = pd.MultiIndex.from_product([["difference"], ["one", "two"]])

In [22]: df[m] = df['b'] - df['a'] 

Here the right-hand list, ["one", "two"], could be taken from the result's .columns.
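
Putting the pieces above together, here is a minimal sketch that takes the second-level labels from the result's .columns instead of typing them out. It assumes a reasonably recent pandas; the names rng, result and new_cols are illustrative and not from the original answer, and the frame is built with a seeded generator only to make the run repeatable.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (seeded for repeatability).
rng = np.random.default_rng(0)
columns = pd.MultiIndex.from_product(
    [["a", "b"], ["one", "two"]], names=["L1", "L2"]
)
df = pd.DataFrame(rng.random(size=(5, 4)), columns=columns)

# Right-hand side: a plain DataFrame whose columns are the L2 labels.
result = df["b"] - df["a"]

# Pair the new top-level label with whatever L2 labels the result carries.
new_cols = pd.MultiIndex.from_product([["difference"], result.columns])

# Assign every new column in one statement.
df[new_cols] = result

print(df)
```

Because new_cols is derived from result.columns, the same three lines should keep working if the frame gains more second-level labels than "one" and "two".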