2014-12-24 141 views

How to reindex a MultiIndex pandas DataFrame?

Given the following pandas.core.frame.DataFrame, called sorted_by_diff:

In [10]: sorted_by_diff.head(4)

Out[10]: 
             value
y                0         1      diff
variable
george    1.265265  0.001550  1.263716
hp        0.895473  0.017479  0.877994
hpl       0.431994  0.009173  0.422822
re        0.415760  0.125091  0.290669

With the following columns:

In [11]: sorted_by_diff.columns 
Out[11]: 
MultiIndex(levels=[[u'value'], [0, 1, u'diff']], 
      labels=[[0, 0, 0], [0, 1, 2]], 
      names=[None, u'y']) 

And the following index:

In [12]: sorted_by_diff.index 

Out[12]: 
Index([u'george', u'hp', u'hpl', u're', u'edu', u'meeting', u'650', u'85', u'lab', u'labs', u'1999', u'data', u'project', u'technology', u'pm', u'telnet', u'address', u'857', u'415', u'cs', u'original', u'(', u'conference', u'direct', u';', u'[', u'parts', u'table', u'will', u'report', u'#', u'make', u'people', u'receive', u'addresses', u'over', u'order', u'$', u'3d', u'internet', u'mail', u'font', u'money', u'credit', u'all', u'email', u'business', u'000', u'remove', u'our', u'!', u'free', u'your', u'you', u'length_average', u'length_longest', u'length_total'], dtype='object') 

How can I reindex sorted_by_diff so that it looks like this?

             value
y            email      spam      diff
variable
george    1.265265  0.001550  1.263716
hp        0.895473  0.017479  0.877994
hpl       0.431994  0.009173  0.422822
re        0.415760  0.125091  0.290669

That is, how do I change the labels 0 and 1 in column level 'y' to 'email' and 'spam', respectively?

Answer

>>> sorted_by_diff.columns = sorted_by_diff.columns.set_levels([[u'value'], ['email', 'spam', 'diff']])
>>> sorted_by_diff 
             value
y            email      spam      diff
variable
george    1.265265  0.001550  1.263716
hp        0.895473  0.017479  0.877994
hpl       0.431994  0.009173  0.422822
re        0.415760  0.125091  0.290669

That solved my problem! Thanks, Roman! – fabraz
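
For anyone who wants to try the relabeling without the original data, here is a minimal self-contained sketch. The frame is rebuilt by hand from the four rows printed in the question, which is an assumption since the real sorted_by_diff has more rows. Passing level="y" to set_levels is a choice made here so that only the second column level is touched. The result is assigned back to .columns rather than modified in place.

```python
import pandas as pd

# Assumption: rebuild only the four rows shown in the question;
# the real frame is larger.
columns = pd.MultiIndex(
    levels=[["value"], [0, 1, "diff"]],
    codes=[[0, 0, 0], [0, 1, 2]],
    names=[None, "y"],
)
sorted_by_diff = pd.DataFrame(
    [
        [1.265265, 0.001550, 1.263716],
        [0.895473, 0.017479, 0.877994],
        [0.431994, 0.009173, 0.422822],
        [0.415760, 0.125091, 0.290669],
    ],
    index=pd.Index(["george", "hp", "hpl", "re"], name="variable"),
    columns=columns,
)

# set_levels returns a new MultiIndex; assign it back to the frame.
# level="y" limits the change to the second column level.
sorted_by_diff.columns = sorted_by_diff.columns.set_levels(
    ["email", "spam", "diff"], level="y"
)

print(sorted_by_diff)
```

The new labels are matched to the old ones by position within the level, so 0 becomes 'email', 1 becomes 'spam', and 'diff' stays as it is.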