2016-10-26

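Creating a MultiIndex `DataFrame` from nested dictionaries

This question is related to this one. This time I want to go a step further. Given a dictionary such as: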

import numpy

dd = {0: {"russell": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
          "cantor": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
          "godel": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}},

      1: {"russell": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
          "cantor": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
          "godel": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}}}

or a similar list:

ll = [{"russell": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
       "cantor": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
       "godel": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}},

      {"russell": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
       "cantor": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
       "godel": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}}]

I would like to construct a DataFrame like this:

              russell                      godel                     cantor
                score ping                 score ping                 score ping
0 0.17473916938994682   40    0.3443303845926545   47   0.43576522521017247   42
1  0.7341005512329682   22   0.14682222267827938   81    0.5662517436162526   59

Here the column index is a MultiIndex. Is there a way to achieve this? If I try pandas.DataFrame.from_dict(dd, orient="index") or pandas.DataFrame(ll), I get:

         russell          godel          cantor 
0 {'score': 0.17473916938994682, 'ping': 40} {'score': 0.3443303845926545, 'ping': 47} {'score': 0.43576522521017247, 'ping': 42} 
1 {'score': 0.7341005512329682, 'ping': 22} {'score': 0.14682222267827938, 'ping': 81} {'score': 0.5662517436162526, 'ping': 59} 

which is not what I want.

Answers


This one is a bit more complicated, but Panel together with transpose, to_frame and unstack can help:

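import pandas as pd

# Note: pd.Panel was removed in pandas 0.25, so this line needs an older pandas.
df = pd.Panel(dd).transpose(2, 0, 1).to_frame().unstack()
print(df)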
      cantor           godel          russell
minor   ping     score  ping     score    ping     score
major
0       69.0  0.050641  51.0  0.765994    20.0  0.935196
1       91.0  0.398624  33.0  0.408681    75.0  0.464876
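
Since `Panel` is no longer available in pandas 0.25 and later, here is a minimal sketch of one Panel-free way to get the same shape. It is not the method from this answer: it flattens the two inner dictionary levels into `(name, metric)` tuple keys and lets pandas build the columns from them. The seeded `rng`, the `names` list and the `flat` variable are assumptions made up for the illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with the same shape as `dd` in the question.
rng = np.random.default_rng(0)
names = ["russell", "cantor", "godel"]
dd = {
    row: {name: {"score": rng.random(), "ping": int(rng.integers(10, 100))}
          for name in names}
    for row in range(2)
}

# Flatten each row's two inner levels into (name, metric) tuple keys.
flat = {
    row: {(name, metric): value
          for name, metrics in people.items()
          for metric, value in metrics.items()}
    for row, people in dd.items()
}

# One row per outer key; the tuple keys become the column labels.
df = pd.DataFrame.from_dict(flat, orient="index")

# Build the MultiIndex explicitly so the result does not depend on
# how the constructor chose to represent tuple labels.
df.columns = pd.MultiIndex.from_tuples(list(df.columns))
print(df)
```

Because every column is built from values of a single metric, `ping` should keep an integer dtype here, whereas the Panel route above shows it as floats.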

You have personally helped me a lot with pandas. Thank you very much. – Ray


This will work too. Note that your nested dict is not really nested in a way that makes it easy to translate.

pd.concat({key: pd.DataFrame(value) for key, value in dd.items()}).unstack()
Out[104]: 
  cantor           godel          russell
    ping     score  ping     score    ping     score
0   73.0  0.463084  94.0  0.954662    76.0  0.732291
1   28.0  0.778905  81.0  0.984285    36.0  0.094173

In short, creating a MultiIndex df with concat is very easy. All you need is a dictionary of DataFrames.
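
The same idea should carry over to the list `ll` from the question, using each element's position as the key. The sketch below is a rough illustration and not part of the original answer: the seeded `rng`, the `names` list and the `frames` variable are assumed names, and the final `reindex` is only there to put the columns in the order the question asked for.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with the same shape as `ll` in the question.
rng = np.random.default_rng(0)
names = ["russell", "cantor", "godel"]
ll = [
    {name: {"score": rng.random(), "ping": int(rng.integers(10, 100))}
     for name in names}
    for _ in range(2)
]

# One small DataFrame per list element: rows are score/ping, columns are the names.
frames = {i: pd.DataFrame(item) for i, item in enumerate(ll)}

# concat stacks the frames under an outer row level (the list position);
# unstack then moves the inner score/ping level into the columns.
df = pd.concat(frames).unstack()

# Optional: order the columns as in the question (name first, score before ping).
df = df.reindex(columns=pd.MultiIndex.from_product([names, ["score", "ping"]]))
print(df)
```

Each intermediate frame holds `score` and `ping` in the same column, so `ping` is upcast to float, which matches the `73.0`-style values in the output above.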