2015-04-05 37 views
1

Python CSV to JSON (D3)

I am trying to create a JSON file (for D3) from a .csv file that I generated, which looks like this:

uat,soe1.1,deploy-mash-app40-uat,3.8.2.8,org.cgl.kfs.mas.mashonline,mashonline-ui-static 
uat,soe1.1,deploy-mash-app22-uat-has,1.0.1.RC1,org.cgl.kfs.mas.mashonline,realtime_balances_mims_feeder 
stg,soe1.1,deploy-coin-app2-stg,1.1.2,org.mbl.coin.ui.visormobile,vm-web-ui 
stg,soe1.1,deploy-coin-app2-stg,1.2.14,org.mbl.coin.ui.factfind,factfind-web-ui 

I have tried several approaches, including almost every related post on StackOverflow. The D3 JSON I want looks like this:

{ 
    "name": "flare", 
    "children": [ 
     { 
      "name": "uat", 
      "children": [ 
       { 
        "name": "soe1.1", 
        "children": [ 
         { 
          "name": "deploy-mash-app40-uat", 
          "children": [ 
           { 
            "name": "mashonline-ui-static", 
            "children": [ 
             { 
              "name": "com.cgl.bfs.mas.mashonline", 
              "size": 3938 
             }, 
             { 
              "name": "3.8.2.8", 
              "size": 3812 
             } 
            ] 
           } 
          ] 
         }, 
         { 
          "name": "deploy-mash-app22-uat-has", 
          "children": [ 
           { 
            "name": "realtime_balances_mims_feeder", 
            "children": [ 
             { 
              "name": "1.0.1.RC1", 
              "size": 3534 
             }, 
             { 
              "name": "com.cgl.bfs.mas.mashonline", 
              "size": 5731 
             } 
            ] 
           } 
          ] 
         } 
        ] 
       } 
      ] 
     }, 
     { 
      "name": "stg", 
      "children": [ 
       { 
        "name": "soe1.1", 
        "children": [ 
         { 
          "name": "deploy-coin-app2-stg", 
          "children": [ 
           { 
            "name": "vm-web-ui", 
            "children": [ 
             { 
              "name": "1.1.2", 
              "size": 3812 
             }, 
             { 
              "name": "com.mbl.coin.ui.visormobile", 
              "size": 6714 
             } 
            ] 
           }, 
           { 
            "name": "factfind-web-ui", 
            "children": [ 
             { 
              "name": "1.2.14", 
              "size": 5731 
             }, 
             { 
              "name": "com.mbl.coin.ui.factfind", 
              "size": 7840 
             } 
            ] 
           } 
          ] 
         } 
        ] 
       } 
      ] 
     } 
    ] 
} 

Basically, I want the values of the last two columns to be siblings of column 4. Thanks in advance (I am new to Python).

I tried Link1, Link2, and many other links, but I could not get it to work.

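I have the following code running (thanks to one of the links above), but I am finding it hard to add the "name" and "children" nodes when I reach a cell.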

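import json
import csv

tree = {}
name = "name"
children = "children"
# newline='' is what the csv module expects for files opened in text mode
with open("cleaned_new_test.txt", newline='') as f:
    reader = csv.reader(f)
    next(reader)  # skip the first row
    for row in reader:
        print(tree)
        subtree = tree
        for i, cell in enumerate(row):
            if cell:
                if cell not in subtree:
                    # inner cells become nested dicts; the last cell becomes a leaf with value 1
                    subtree[cell] = {} if i < len(row) - 1 else 1
                    print(subtree)
                subtree = subtree[cell]

print(json.dumps(tree, indent=4))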
+0

D3 can also read CSV files, if that helps. [See here](https://github.com/mbostock/d3/wiki/CSV) – 2015-04-05 02:01:00

+0

Where did you get 'size' from? – jedwards 2015-04-05 02:15:08

+0

@jedwards size is a value I added at random, just to match the D3 format. It doesn't mean anything right now. – gameshark 2015-04-05 02:26:50

Answers

2

Here is one way to get from your CSV file to the JSON:

import csv 
from collections import OrderedDict 
import json 

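def fmt(d):
    # Convert a nested dict into a list of {"name": ..., "children": [...]} nodes
    l = []
    for (k, v) in d.items():
        j = OrderedDict()
        j['name'] = k
        if isinstance(v, dict):
            j['children'] = fmt(v)
        elif isinstance(v, list):
            # A list of (key, value) 2-tuples becomes extra attributes on the node
            for (k, v) in v:
                j[k] = v
        l.append(j)
    return l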

# Build OrderedDict
d1 = OrderedDict()
with open('input.txt') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
        # Extract the columns you want to use as "leaves"
        leaves = [row[-2], row[-3]]
        for l in leaves:
            row.remove(l)
        # Build a dictionary based on remaining row elements
        ctx = d1
        for e in row:
            if e not in ctx:
                ctx[e] = OrderedDict()
            ctx = ctx[e]
        # Re-insert leaves
        for l in leaves:
            ctx[l] = None

print(json.dumps(d1, indent=4))
print('---') 


# Insert missing items (ctx = context)
ctx = d1['uat']['soe1.1']['deploy-mash-app40-uat']['mashonline-ui-static']
ctx['org.cgl.kfs.mas.mashonline'] = [('size', 3938)]
ctx['3.8.2.8']                    = [('size', 3812)]

ctx = d1['uat']['soe1.1']['deploy-mash-app22-uat-has']['realtime_balances_mims_feeder']
ctx['1.0.1.RC1']                  = [('size', 3534)]
ctx['org.cgl.kfs.mas.mashonline'] = [('size', 5731)]

ctx = d1['stg']['soe1.1']['deploy-coin-app2-stg']['vm-web-ui']
ctx['1.1.2']                       = [('size', 3812)]
ctx['org.mbl.coin.ui.visormobile'] = [('size', 6714)]

ctx = d1['stg']['soe1.1']['deploy-coin-app2-stg']['factfind-web-ui']
ctx['1.2.14']                   = [('size', 5731)]
ctx['org.mbl.coin.ui.factfind'] = [('size', 7840)]

# Wrap "formatted" in another dictionary 
d2 = {"name": "flare", "children": fmt(d1)} 

j = json.dumps(d2, indent=4) 
print(j) 

Output:

 
{ 
    "name": "flare", 
    "children": [ 
     { 
      "name": "uat", 
      "children": [ 
       { 
        "name": "soe1.1", 
        "children": [ 
         { 
          "name": "deploy-mash-app40-uat", 
          "children": [ 
           { 
            "name": "mashonline-ui-static", 
            "children": [ 
             { 
              "name": "org.cgl.kfs.mas.mashonline", 
              "size": 3938 
             }, 
             { 
              "name": "3.8.2.8", 
              "size": 3812 
             } 
            ] 
           } 
          ] 
         }, 
         { 
          "name": "deploy-mash-app22-uat-has", 
          "children": [ 
           { 
            "name": "realtime_balances_mims_feeder", 
            "children": [ 
             { 
              "name": "org.cgl.kfs.mas.mashonline", 
              "size": 5731 
             }, 
             { 
              "name": "1.0.1.RC1", 
              "size": 3534 
             } 
            ] 
           } 
          ] 
         } 
        ] 
       } 
      ] 
     }, 
     { 
      "name": "stg", 
      "children": [ 
       { 
        "name": "soe1.1", 
        "children": [ 
         { 
          "name": "deploy-coin-app2-stg", 
          "children": [ 
           { 
            "name": "vm-web-ui", 
            "children": [ 
             { 
              "name": "org.mbl.coin.ui.visormobile", 
              "size": 6714 
             }, 
             { 
              "name": "1.1.2", 
              "size": 3812 
             } 
            ] 
           }, 
           { 
            "name": "factfind-web-ui", 
            "children": [ 
             { 
              "name": "org.mbl.coin.ui.factfind", 
              "size": 7840 
             }, 
             { 
              "name": "1.2.14", 
              "size": 5731 
             } 
            ] 
           } 
          ] 
         } 
        ] 
       } 
      ] 
     } 
    ] 
} 

It's not the prettiest, but it gets the job done.
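To make the role of fmt easier to see in isolation, here is a minimal sketch that runs the same conversion on a tiny hand-built dictionary. The input (`sample`, with the keys `env`, `leaf-a`, and `leaf-b`) is hypothetical and not taken from the CSV; the variable names inside fmt are also changed for readability, but the logic is meant to mirror the function above.

```python
import json
from collections import OrderedDict


def fmt(d):
    """Turn a nested dict into a list of D3-style nodes."""
    nodes = []
    for key, value in d.items():
        node = OrderedDict()
        node['name'] = key
        if isinstance(value, dict):
            # A nested dict becomes the node's "children" list.
            node['children'] = fmt(value)
        elif isinstance(value, list):
            # A list of (key, value) pairs becomes extra attributes.
            for attr, attr_value in value:
                node[attr] = attr_value
        nodes.append(node)
    return nodes


# Hypothetical two-level input, for illustration only.
sample = OrderedDict()
sample['env'] = OrderedDict()
sample['env']['leaf-a'] = [('size', 1)]
sample['env']['leaf-b'] = None  # neither dict nor list: only "name" is emitted

result = fmt(sample)
print(json.dumps(result, indent=4))
```

A value that is neither a dict nor a list (such as the None set by the "Re-insert leaves" step) produces a node with only a "name" key, which is why the size values have to be filled in before fmt is called.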

Some notes:

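  • Adding the size elements after the fact is ugly, and there is probably a better way to do it. (I am referring to the code that starts with the comment "Insert missing items".) In that section, you can specify additional key:value pairs to add, as a list of (key, value) 2-tuples.
  • That section could also have been written as: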

    # Insert missing items (ctx = context) 
    d1['uat']['soe1.1']['deploy-mash-app40-uat']['mashonline-ui-static']['org.cgl.kfs.mas.mashonline']    = [('size', 3938)] 
    d1['uat']['soe1.1']['deploy-mash-app40-uat']['mashonline-ui-static']['3.8.2.8']         = [('size', 3812)] 
    d1['uat']['soe1.1']['deploy-mash-app22-uat-has']['realtime_balances_mims_feeder']['1.0.1.RC1']     = [('size', 3534)] 
    d1['uat']['soe1.1']['deploy-mash-app22-uat-has']['realtime_balances_mims_feeder']['org.cgl.kfs.mas.mashonline'] = [('size', 5731)] 
    d1['stg']['soe1.1']['deploy-coin-app2-stg']['vm-web-ui']['1.1.2']            = [('size', 3812)] 
    d1['stg']['soe1.1']['deploy-coin-app2-stg']['vm-web-ui']['org.mbl.coin.ui.visormobile']       = [('size', 6714)] 
    d1['stg']['soe1.1']['deploy-coin-app2-stg']['factfind-web-ui']['1.2.14']          = [('size', 5731)] 
    d1['stg']['soe1.1']['deploy-coin-app2-stg']['factfind-web-ui']['org.mbl.coin.ui.factfind']      = [('size', 7840)] 
    

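    (without the ctx references). I simply use ctx as a pointer to a location inside the dictionary structure, and then use it to set the deeper dictionary values, which keeps the lines shorter and more manageable.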

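  • Your expected JSON is much better now, but it is still a bit off. Namely, you specify identifiers such as com.cgl.bfs.mas.mashonline, but your CSV has org.cgl.kfs.mas.mashonline ("com" vs. "org"). Also, the order of the "leaf" elements in your JSON is inconsistent. In the JSON that my script outputs, column 5 appears before column 4. In your JSON, the first element appears that way, but the last three appear in swapped order (4 before 5). If you want this swapped order, change: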

    leaves = [row[-2], row[-3]]

    to:

    leaves = [row[-3], row[-2]]
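
As a possible alternative to inserting the size entries after the fact, the tree could be built in the D3 shape in a single pass. The following is only a sketch under several assumptions: the column labels (`env`, `soe`, `host`, `version`, `group`, `artifact`), the helper names `find_or_add_child` and `rows_to_tree`, and the constant `PLACEHOLDER_SIZE` are all hypothetical, and every leaf gets the same placeholder size because the question states that size is an arbitrary value. It also unpacks the columns by position rather than calling row.remove, which removes the first matching value and could pick the wrong cell if a value were repeated earlier in the row.

```python
import csv
import io
import json

# Hypothetical placeholder: the question says "size" is an arbitrary value.
PLACEHOLDER_SIZE = 1


def find_or_add_child(node, name):
    """Return the child of `node` called `name`, creating it if needed."""
    children = node.setdefault('children', [])
    for child in children:
        if child['name'] == name:
            return child
    child = {'name': name}
    children.append(child)
    return child


def rows_to_tree(rows, root_name='flare'):
    """Build a D3-style tree from rows of six columns."""
    root = {'name': root_name, 'children': []}
    for row in rows:
        row = [cell.strip() for cell in row]
        if len(row) < 6:
            continue  # skip blank or short lines
        # Column labels are assumptions made for this sketch.
        env, soe, host, version, group, artifact = row[:6]
        node = root
        for name in (env, soe, host, artifact):
            node = find_or_add_child(node, name)
        # Column 5 before column 4, like leaves = [row[-2], row[-3]].
        for leaf in (group, version):
            find_or_add_child(node, leaf)['size'] = PLACEHOLDER_SIZE
    return root


sample_csv = """\
uat,soe1.1,deploy-mash-app40-uat,3.8.2.8,org.cgl.kfs.mas.mashonline,mashonline-ui-static
stg,soe1.1,deploy-coin-app2-stg,1.1.2,org.mbl.coin.ui.visormobile,vm-web-ui
stg,soe1.1,deploy-coin-app2-stg,1.2.14,org.mbl.coin.ui.factfind,factfind-web-ui
"""

tree = rows_to_tree(csv.reader(io.StringIO(sample_csv)))
print(json.dumps(tree, indent=4))
```

To read from a file instead of the inline sample, the csv.reader could be given a file object opened with newline=''. Real size values, if they become available, would replace the placeholder inside the leaf loop.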