2014-01-12 50 views
8

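Detecting whether a DataFrame has a MultiIndex

I am building a new method to parse a DataFrame into a Vincent-compatible format. This requires a standard Index (Vincent cannot parse a MultiIndex).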

Is there a way to detect whether a pandas DataFrame has a MultiIndex?

In: type(frame.index)
Out: pandas.core.index.MultiIndex 

I have tried:

In: if type(result.index) is 'pandas.core.index.MultiIndex':
        print True
    else:
        print False
Out: False

If I try it without the quotes, I get:

NameError: name 'pandas' is not defined 

Any help is appreciated.

(Once I know I have a MultiIndex, I then reset the index and merge the two columns into a single string value for the presentation stage.)
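The flattening step described in the parenthetical could look roughly like the sketch below. It assumes a two-level MultiIndex; the level names `group` and `item`, the `label` column, and the `_` separator are made up for illustration and are not from the original post.

```python
import pandas as pd

# Hypothetical frame with a two-level MultiIndex (names are illustrative).
frame = pd.DataFrame(
    {"value": [10, 20, 30, 40]},
    index=pd.MultiIndex.from_product(
        [["a", "b"], [1, 2]], names=["group", "item"]
    ),
)

# Move the index levels into ordinary columns.
flat = frame.reset_index()

# Join the two former index levels into a single string label.
flat["label"] = flat["group"].astype(str) + "_" + flat["item"].astype(str)

# Drop the originals and use the combined label as a plain Index.
flat = flat.drop(columns=["group", "item"]).set_index("label")

print(flat.index.tolist())
```

The result carries a standard single-level Index of strings, which is the shape the question says Vincent needs.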

+0

'name 'pandas' is not defined': you should 'import pandas' first! – Winand

Answers

13

You can use isinstance to check whether an object is an instance of a class (or of one of its subclasses):

if isinstance(result.index, pandas.core.index.MultiIndex): 
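As a self-contained sketch of this check: the version below uses the top-level `pd.MultiIndex` name instead of the `pandas.core.index` path, on the assumption that pandas is imported as `pd`. The helper name `has_multiindex` and both sample frames are hypothetical.

```python
import pandas as pd


def has_multiindex(frame):
    """Return True if the DataFrame's row index is a MultiIndex."""
    return isinstance(frame.index, pd.MultiIndex)


# A frame with the default flat index.
flat = pd.DataFrame({"value": [1, 2, 3]})

# A frame with a two-level index (level names are illustrative).
nested = pd.DataFrame(
    {"value": [1, 2, 3, 4]},
    index=pd.MultiIndex.from_product(
        [["a", "b"], [1, 2]], names=["group", "item"]
    ),
)

print(has_multiindex(flat))
print(has_multiindex(nested))
```

Passing the class object itself, rather than a string holding its name, is what makes the comparison meaningful; the quoted version in the question compares a type against a string, which can never match.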
+0

Works for me - this made me realize my mistake, since I'm used to importing pandas under the shorthand 'pd': if isinstance(result.index, pd.core.index.MultiIndex): –

0

Perhaps the simplest way is if type(result.index) == pd.MultiIndex:

2

There is also

len(result.index.names) > 1 

, but it is much slower than either the isinstance or the type check:

timeit(len(result.index.names) > 1)
The slowest run took 10.95 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.12 µs per loop

In [254]: timeit(isinstance(result.index, pd.MultiIndex))
The slowest run took 30.53 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 177 ns per loop

In [252]: timeit(type(result.index) == pd.MultiIndex)
The slowest run took 22.86 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 200 ns per loop
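For anyone who wants to repeat the comparison outside IPython, here is a rough sketch using the standard-library `timeit` module. The sample index and the loop count are arbitrary choices, the lambda wrappers add some call overhead, and the absolute numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

# Arbitrary two-level index used only for timing.
index = pd.MultiIndex.from_product([["a", "b"], [1, 2]])

# The three checks compared in this answer.
checks = {
    "len(names) > 1": lambda: len(index.names) > 1,
    "isinstance": lambda: isinstance(index, pd.MultiIndex),
    "type ==": lambda: type(index) == pd.MultiIndex,
}

loops = 100_000
timings = {}
for label, check in checks.items():
    total_seconds = timeit.timeit(check, number=loops)
    # Convert to nanoseconds per call so every row uses the same unit.
    timings[label] = total_seconds / loops * 1e9
    print(f"{label}: {timings[label]:.0f} ns per call")
```

Printing every result in nanoseconds avoids having to compare µs against ns by eye.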
+1

Note the units: the len timing is reported in µs while the others are in ns, so len actually takes 1,120 ns. –

+1

Arrgh! Thank you @JohnCEarls, those pesky friendly unit converters keep catching me out (boost unittest does something similar). – danio

+0

Can't a MultiIndex have just one level? – Konstantin
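A short sketch of the case this comment raises: a MultiIndex built from a single array has one level, so the `len(...names) > 1` test reports False for it while isinstance still reports True. The sample values and the level name `key` are made up.

```python
import pandas as pd

# A MultiIndex with exactly one level (values and name are illustrative).
single_level = pd.MultiIndex.from_arrays([["a", "b", "c"]], names=["key"])

is_multi_by_type = isinstance(single_level, pd.MultiIndex)
is_multi_by_names = len(single_level.names) > 1

print(single_level.nlevels)
print(is_multi_by_type)
print(is_multi_by_names)
```

So the two checks answer slightly different questions: isinstance tests the class of the index, whereas counting names tests how many levels it has.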