2012-11-27 41 views

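A similar question was asked in "How to keep index when using pandas merge", but the approach given there does not work with MultiIndexes. In other words: how can a MultiIndex be kept when using pandas merge?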

import numpy as np
import pandas as pd
from pandas import DataFrame

a = DataFrame(np.array([1, 2, 3, 4, 1, 2, 3, 3]).reshape((4, 2)), columns=['col1', 'to_merge_on'])
idx = pd.MultiIndex.from_arrays([[1, 1, 2, 2], ['a', 'b', 'a', 'b']], names=['id1', 'id2'])
a.index = idx

In [207]: a
Out[207]:
         col1  to_merge_on
id1 id2
1   a       1            2
    b       3            4
2   a       1            2
    b       3            3

b = DataFrame(data={'col2': [1, 2, 3], 'to_merge_on': [1, 3, 5]})

In [209]: b
Out[209]:
   col2  to_merge_on
0     1            1
1     2            3
2     3            5

a.reset_index().merge(b, how="left").set_index('index') 

In [208]: a.reset_index().merge(b, how="left").set_index('index')
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2054, in set_index
    level = frame[col]
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 1458, in __getitem__
    return self._get_item_cache(key)
  File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 294, in _get_item_cache
    values = self._data.get(item)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 625, in get
    _, block = self._find_block(item)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 715, in _find_block
    self._check_have(item)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 722, in _check_have
    raise KeyError('no item named %s' % str(item))
KeyError: 'no item named index'

How can I do the merge so that the left DataFrame keeps its MultiIndex?

Answer


A temporary workaround:

In [255]: a = a.reset_index() 

In [256]: a
Out[256]:
   id1 id2  col1  to_merge_on
0    1   a     1            2
1    1   b     3            4
2    2   a     1            2
3    2   b     3            3
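
The reset step also explains the KeyError in the question: with a named MultiIndex, reset_index() turns the levels into columns called after the level names, so there is no 'index' column to set back. A minimal sketch of that, rebuilding the question's frame (the variable names idx, index_names and flat are only illustrative):

```python
import numpy as np
import pandas as pd

# Rebuild the left frame from the question with its two-level index.
idx = pd.MultiIndex.from_arrays([[1, 1, 2, 2], ['a', 'b', 'a', 'b']],
                                names=['id1', 'id2'])
a = pd.DataFrame(np.array([1, 2, 3, 4, 1, 2, 3, 3]).reshape((4, 2)),
                 columns=['col1', 'to_merge_on'], index=idx)

# Remember the level names before flattening the index.
index_names = list(a.index.names)

# The levels become ordinary columns named after the levels.
flat = a.reset_index()
print(list(flat.columns))

# There is no column called 'index', hence the KeyError from set_index('index').
has_index_column = 'index' in flat.columns
print(has_index_column)
```

Keeping the level names in a variable avoids hard-coding them when the index is restored later.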

In [271]: c = pd.merge(a, b, how="left") 

In [272]: c
Out[272]:
   id1 id2  col1  to_merge_on  col2
0    1   a     1            2   NaN
1    2   a     1            2   NaN
2    2   b     3            3     2
3    1   b     3            4   NaN

In [273]: c = c.set_index(['id1','id2']) 

In [274]: c
Out[274]:
         col1  to_merge_on  col2
id1 id2
1   a       1            2   NaN
2   a       1            2   NaN
    b       3            3     2
1   b       3            4   NaN
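
The three steps above (reset the index, merge, set the index again) can be wrapped into one function. The sketch below is only an illustration: merge_keep_index is a hypothetical helper, not part of pandas, and it assumes that every index level of the left frame has a name and that those names do not clash with column names in either frame.

```python
import pandas as pd


def merge_keep_index(left, right, **merge_kwargs):
    """Merge left with right, then restore left's index levels.

    Hypothetical helper: assumes all index levels of `left` are named.
    """
    names = list(left.index.names)
    if any(name is None for name in names):
        raise ValueError("all index levels of `left` must be named")
    # Move the levels into columns, merge, then move them back.
    merged = left.reset_index().merge(right, **merge_kwargs)
    return merged.set_index(names)


idx = pd.MultiIndex.from_arrays([[1, 1, 2, 2], ['a', 'b', 'a', 'b']],
                                names=['id1', 'id2'])
a = pd.DataFrame({'col1': [1, 3, 1, 3], 'to_merge_on': [2, 4, 2, 3]}, index=idx)
b = pd.DataFrame({'col2': [1, 2, 3], 'to_merge_on': [1, 3, 5]})

c = merge_keep_index(a, b, how='left', on='to_merge_on')
print(c)
```

Passing on='to_merge_on' explicitly makes the join key visible instead of relying on the shared column name. Recent pandas versions keep the order of the left keys in a left merge, so the rows may come back in a different order than in the output above.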