2017-11-03 59 views
1

在oracle中,我可以为父子层次结构做类似的事情。这非常方便。现在,我在大熊猫数据框做相当于现在python dataframe做类似oracle connect_by的事情?

with tab1zx as (
select 1 as id, null as parent_id from dual union 
select 2 as id, 1 as parent_id from dual union 
select 3 as id, 2 as parent_id from dual union 
select 4 as id, 2 as parent_id from dual union 
select 5 as id, 4 as parent_id from dual union 
select 6 as id, 4 as parent_id from dual union 
select 7 as id, 1 as parent_id from dual union 
select 8 as id, 7 as parent_id from dual union 
select 9 as id, 1 as parent_id from dual union 
select 10 as id, 9 as parent_id from dual union 
select 11 as id, 10 as parent_id from dual union 
select 12 as id, 9 as parent_id from dual 

) 
--select * from tab1zx 
SELECT id, 
     parent_id, 
     RPAD('.', (level-1)*2, '.') || id AS tree, 
     level, 
     CONNECT_BY_ROOT id AS root_id, 
     LTRIM(SYS_CONNECT_BY_PATH(id, '-'), '-') AS path, 
     CONNECT_BY_ISLEAF AS leaf 
FROM tab1zx 
START WITH parent_id IS NULL 
CONNECT BY parent_id = PRIOR id 
ORDER SIBLINGS BY id; 

,如果我有一个Python数据帧我怎么做这个

aa={ 
    'id':["1","2","3","4","5","6","7","8","9","10","11","12"] 
    ,'parent_id':["","1","2","2","4","4","1","7","1","9","10","9"] 
} 

import pandas as pd 
df=pd.DataFrame.from_dict(aa) 

回答

2

,你可以尝试做一个笛卡尔积并将其过滤:

In [32]: df.assign(k=0) \ 
      .merge(df.assign(k=0), on='k', suffixes=['1','2']) \ 
      .query("id1 == parent_id2") 
Out[32]: 
    id1 parent_id1 k id2 parent_id2 
1  1    0 2   1 
6  1    0 7   1 
8  1    0 9   1 
14 2   1 0 3   2 
15 2   1 0 4   2 
40 4   2 0 5   4 
41 4   2 0 6   4 
79 7   1 0 8   7 
105 9   1 0 10   9 
107 9   1 0 12   9 
118 10   9 0 11   10 

PS AFAIK没有高效的模拟Oracle的hierarchical queries在熊猫

相关问题