2014-04-01

Creating D3 nested JSON data with Python

I am trying to build a Python function that formats data into a JSON string for use by D3.

I need it in this format:

{
    "name": "flare",
    "children": [
        {
            "name": "analytics",
            "children": [
                {
                    "name": "cluster",
                    "children": [
                        {"name": "AgglomerativeCluster", "size": 3938},
                        {"name": "CommunityStructure", "size": 3812},
                        {"name": "HierarchicalCluster", "size": 6714},
                        {"name": "MergeEdge", "size": 743}
                    ]
                },

This is the format used by http://bl.ocks.org/mbostock/4063550, for this type of tree: http://johan.github.io/d3/ex/tree.html

What I have come up with so far is a data structure like this:

{'nlp':{'course':['course','range','topics','language','processing','word']}} 

and I need it to come out like this:

{
    "name": "Natural Language Processing",
    "children": [
        {
            "name": "course",
            "children": [
                {
                    "name": "course",
                    "size": 700
                },
                {
                    "name": "range",
                    "size": 700
                },
                {
                    "name": "topics",
                    "size": 700
                },
                {
                    "name": "language",
                    "size": 700
                },
                {
                    "name": "processing",
                    "size": 700
                },
                {
                    "name": "word",
                    "size": 700
                }
            ]
        }
    ]
}

I started down the following path:

def format_d3_circle(data_input): 
    d3_data = {}; 
    #root level 
    d3_data['name'] = data_input[data_input.keys()[0]].keys()[0] 
    sub_levels = data_input[data_input.keys()[0]] 
    for level_one_key, level_one_data in sub_levels: 
     d3_data['children'] = [] 
    return json.dumps(d3_data) 

However, I can't seem to approach the problem properly, and I find it hard to picture a good way of creating the JSON nodes from the data as it stands.

Any suggestions on how to abstract this problem and build whatever nested JSON structure I need from dictionary / list / JSON input?
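One possible abstraction, sketched here under stated assumptions: treat a dict as a node whose keys become named children, treat a list as a set of leaves, and recurse. The function name `to_d3` and the `leaf_size` parameter are hypothetical, and the constant size of 700 is simply copied from the desired output above. The root name here is the raw key `nlp`; mapping it to a display name such as "Natural Language Processing" would need a separate lookup that the question does not specify.

```python
import json


def to_d3(name, data, leaf_size=700):
    """Recursively convert nested dicts/lists into D3's name/children form.

    `leaf_size` is an assumed constant: the input carries no sizes of its own.
    """
    if isinstance(data, dict):
        # Each key becomes a child node; its value is converted recursively.
        children = [to_d3(key, value, leaf_size) for key, value in data.items()]
        return {"name": name, "children": children}
    if isinstance(data, (list, tuple)):
        # Each list item becomes a leaf with the assumed size.
        leaves = [{"name": str(item), "size": leaf_size} for item in data]
        return {"name": name, "children": leaves}
    # Any other value is treated as the size of a single leaf (assumption).
    return {"name": name, "size": data}


data_input = {'nlp': {'course': ['course', 'range', 'topics',
                                 'language', 'processing', 'word']}}

# The single top-level key is used as the root node.
root_key, subtree = next(iter(data_input.items()))
d3_data = to_d3(root_key, subtree)
print(json.dumps(d3_data, indent=2))
```

Building a plain dict first and serializing once with `json.dumps` keeps the structure easy to inspect before it is sent to the client.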


You could apply [D3's nest function](https://github.com/mbostock/d3/wiki/Arrays#-nest) to do this, or at least borrow its concepts. – FernOfTheAndes


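I looked into the D3 functions; it's good to know they are there, but it's not entirely clear to me how they work. I also think it is better to build the JSON string server-side so it displays faster, with no processing on the client. I was thinking I could at least do string concatenation, but that seems like a hack. I'll have to see what I can come up with. – jmhead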

Answer


Here is a solution I have been working on. It handles the general case of tabular input data with an arbitrary number of levels.

import pandas as pd
import json

def find_element(children_list, name):
    """
    Return the element of children_list whose "name" matches,
    or None if there is no such element.
    """
    for child in children_list:
        if child["name"] == name:
            return child
    # Not found
    return None

def add_node(path, value, nest):
    """
    Insert value into nest at the position given by path.

    The path is a list. Each element is a name that corresponds
    to a level in the final nested dictionary. Note that path is
    consumed (emptied) by this call.
    """

    # Get first name from path
    this_name = path.pop(0)

    # Does the element exist already?
    element = find_element(nest["children"], this_name)

    # If the element exists, we can use it, otherwise we need to create a new one
    if element:
        if len(path) > 0:
            add_node(path, value, element)

    # Else it does not exist, so create it
    else:
        if len(path) == 0:
            # Last name in the path: create a leaf holding the value
            nest["children"].append({"name": this_name, "value": value})
        else:
            # Add new intermediate element
            nest["children"].append({"name": this_name, "children": []})

            # Get added element
            element = nest["children"][-1]

            # Still elements of path left, so recurse
            add_node(path, value, element)

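Below is an example of how to use it. You have to tell it which columns will be used as the levels of the hierarchy and which column stores the values.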

from io import StringIO

# Wrap the literal in StringIO: newer pandas versions deprecate passing a raw JSON string
df = pd.read_json(StringIO('{"l1":{"0":"a","1":"a","2":"a","3":"a","4":"b","5":"b","6":"b","7":"b"},"l2":{"0":"a1","1":"a1","2":"a2","3":"a2","4":"b1","5":"b1","6":"b2","7":"b3"},"l3":{"0":"a11","1":"a12","2":"a21","3":"a22","4":"b11","5":"b12","6":"b22","7":"b34"},"val":{"0":1,"1":2,"2":3,"3":4,"4":5,"5":6,"6":7,"7":8}}'))


# Root of the hierarchy; add_node fills in its children
d = {"name": "root",
     "children": []}

levels = ["l1", "l2", "l3"]
for _, r in df.iterrows():
    path = list(r[levels])   # names for each level of this row
    value = r["val"]
    add_node(path, value, d)

print(json.dumps(d, sort_keys=False, indent=2))
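For comparison, the same idea can be sketched without pandas and without recursion, assuming the input is already available as (path, value) pairs. The names `build_tree`, `root_name`, and `value_key` are hypothetical and not part of the answer above. This variant stores the value under `"size"`, as in the question's target format, and does not modify the path it is given.

```python
import json


def build_tree(rows, root_name="root", value_key="size"):
    """Build a D3-style hierarchy from an iterable of (path, value) pairs."""
    root = {"name": root_name, "children": []}
    for path, value in rows:
        node = root
        for depth, name in enumerate(path):
            # Make sure the current node can hold children.
            children = node.setdefault("children", [])
            # Reuse an existing child with this name, or create one.
            child = next((c for c in children if c["name"] == name), None)
            if child is None:
                child = {"name": name}
                children.append(child)
            # The last name in the path is the leaf that carries the value.
            if depth == len(path) - 1:
                child[value_key] = value
            node = child
    return root


rows = [
    (["a", "a1", "a11"], 1),
    (["a", "a1", "a12"], 2),
    (["b", "b1", "b11"], 5),
]
tree = build_tree(rows)
print(json.dumps(tree, indent=2))
```

The linear search for an existing child is fine for small trees; for wide levels, a dict keyed by name per node would likely be faster, at the cost of a final conversion step to lists.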