Creating a tree/deeply nested dict from an indented text file in Python

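Basically, I want to iterate through a file and put the contents of each line into a deeply nested dict, whose structure is defined by the amount of whitespace at the start of each line.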

Essentially the aim is to take something like this:

a
    b
        c
    d
        e

and turn it into something like this:

{"a":{"b":"c","d":"e"}}

Or this:

apple
    colours
        red
        yellow
        green
    type
        granny smith
    price
        0.10

into this:

{"apple":{"colours":["red","yellow","green"],"type":"granny smith","price":0.10}}

That way I can pass it to Python's JSON module and produce some JSON.

At the moment I am trying to build a dict and a list in steps, like this:

{"a":""} ["a"]
{"a":"b"} ["a"]
{"a":{"b":"c"}} ["a","b"]
{"a":{"b":{"c":"d"}}} ["a","b","c"]
{"a":{"b":{"c":"d"},"e":""}} ["a","e"]
{"a":{"b":{"c":"d"},"e":"f"}} ["a","e"]
{"a":{"b":{"c":"d"},"e":{"f":"g"}}} ["a","e","f"]

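and so on.

The list acts like "breadcrumbs" showing where I last put something into a dict.

To do this I need a way to walk through the list and generate something like dict["a"]["e"]["f"] to get at that last dict. I have looked at the AutoVivification class that someone wrote, which looks very useful, but I am really unsure about: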

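Whether I am using the right data structure for this (I plan to send it to the JSON library to create a JSON object)
How to use AutoVivification in this case
Whether there is a better way to approach this problem in general.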
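For readers who have not met it, the recipe usually circulated under the name AutoVivification looks roughly like the sketch below. This is an assumption about which class the question means, since the question does not show it. It is a dict subclass that creates a missing nested dict on first access, so a path such as d["a"]["e"]["f"] can be assigned without building the intermediate levels by hand.

```python
import json


class AutoVivification(dict):
    """A dict that creates nested dicts on demand (commonly circulated recipe)."""

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            # Missing key: create a new nested instance, store it, and return it.
            value = self[key] = type(self)()
            return value


tree = AutoVivification()
tree["a"]["b"]["c"] = "d"   # intermediate dicts "a" and "b" are created on access
tree["a"]["e"]["f"] = "g"

# A dict subclass is accepted by json.dumps like any other dict.
print(json.dumps(tree))
```

One caveat with this design: merely reading a missing key also creates it, so a typo in a lookup silently adds an empty entry.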

I came up with the following function, but it does not work:

def get_nested(dict,array,i):
    if i != None:
        i += 1
        if array[i] in dict:
            return get_nested(dict[array[i]],array)
        else:
            return dict
    else:
        i = 0
        return get_nested(dict[array[i]],array)
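For comparison, the lookup that the function above is aiming for can be sketched without recursion or an index argument. The name follow_path is hypothetical, and the sketch assumes every key in the breadcrumb list already exists in the tree.

```python
def follow_path(tree, path):
    """Follow the keys in `path` down through nested dicts and return what is there."""
    node = tree
    for key in path:
        node = node[key]  # raises KeyError if a breadcrumb is missing
    return node


tree = {"a": {"b": {"c": "d"}, "e": {"f": "g"}}}
breadcrumbs = ["a", "e", "f"]

print(follow_path(tree, breadcrumbs))

# To replace the value at the end of the path, stop one level early
# and assign through the parent dict.
follow_path(tree, breadcrumbs[:-1])[breadcrumbs[-1]] = "h"
```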

Any help would be greatly appreciated!

(The rest of my very incomplete code is here:)

#Import relevant libraries
import codecs
import sys

#Functions
def stripped(str):
    if tab_spaced:
        return str.lstrip('\t').rstrip('\n\r')
    else:
        return str.lstrip().rstrip('\n\r')

def current_ws():
    if whitespacing == 0 or not tab_spaced:
        return len(line) - len(line.lstrip())
    if tab_spaced:
        return len(line) - len(line.lstrip('\t\n\r'))

def get_nested(adict,anarray,i):
    if i != None:
        i += 1
        if anarray[i] in adict:
            return get_nested(adict[anarray[i]],anarray)
        else:
            return adict
    else:
        i = 0
        return get_nested(adict[anarray[i]],anarray)

#initialise variables
jsondict = {}
unclosed_tags = []
debug = []

vividfilename = 'simple.vivid'
# vividfilename = sys.argv[1]
if len(sys.argv)>2:
    jsfilename = sys.argv[2]
else:
    jsfilename = vividfilename.split('.')[0] + '.json'

whitespacing = 0
whitespace_array = [0,0]
tab_spaced = False

#open the file
with codecs.open(vividfilename,'rU', "utf-8-sig") as vividfile:
    for line in vividfile:
        #work out how many whitespaces at start
        whitespace_array.append(current_ws())

        #For first line with whitespace, work out the whitespacing (eg tab vs 4-space)
        if whitespacing == 0 and whitespace_array[-1] > 0:
            whitespacing = whitespace_array[-1]
            if line[0] == '\t':
                tab_spaced = True

        #strip out whitespace at start and end
        stripped_line = stripped(line)

        if whitespace_array[-1] == 0:
            jsondict[stripped_line] = ""
            unclosed_tags.append(stripped_line)

        if whitespace_array[-2] < whitespace_array[-1]:
            oldnested = get_nested(jsondict,whitespace_array,None)
            print oldnested
            # jsondict.pop(unclosed_tags[-1])
            # jsondict[unclosed_tags[-1]]={stripped_line:""}
            # unclosed_tags.append(stripped_line)

        print jsondict
        print unclosed_tags

print jsondict
print unclosed_tags

Source

2013-07-25 Tomcat

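I have to quote [the Zen of Python](http://www.python.org/dev/peps/pep-0020/): "Flat is better than nested". I would change how you are doing this. There are better approaches than nested dictionaries. Also, make sure you are not falling into the [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). –

My initial approach was simply to generate one very long string using various rules. Would that be better? – Tomcat

That depends on what you are trying to achieve. Take a look at the [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) and make sure you are not making a similar mistake. Essentially, you need to figure out what your data is and build your container around it, rather than building a container and then working out how to fit your data into it. Every type of container has its advantages, but using a string to store different sets of data is never a good idea. –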

Here is a recursive solution. First, transform the input in the following way.

Input:

person:
    address:
        street1: 123 Bar St
        street2:
        city: Madison
        state: WI
        zip: 55555
    web:
        email: [email protected]

Output of the first step:

[{'name':'person','value':'','level':0}, 
{'name':'address','value':'','level':1}, 
{'name':'street1','value':'123 Bar St','level':2}, 
{'name':'street2','value':'','level':2}, 
{'name':'city','value':'Madison','level':2}, 
{'name':'state','value':'WI','level':2}, 
{'name':'zip','value':55555,'level':2}, 
{'name':'web','value':'','level':1}, 
{'name':'email','value':'[email protected]','level':2}]

This is easy to do with split(':') and by counting the number of leading tabs:

def tab_level(astr): 
    """Count number of leading tabs in a string 
    """ 
    return len(astr)- len(astr.lstrip('\t'))
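The first step itself is not shown, so here is a minimal sketch of how it might look. The name lines_to_ttree is hypothetical, and the sketch assumes tab indentation and one "name: value" pair per line. It uses str.partition(':') rather than split(':') so that a value containing a colon stays in one piece, and it leaves every value as a string (the listing above shows zip as a number, which this sketch does not attempt).

```python
def tab_level(astr):
    """Count number of leading tabs in a string."""
    return len(astr) - len(astr.lstrip('\t'))


def lines_to_ttree(lines):
    """Turn tab-indented 'name: value' lines into a flat list of records."""
    ttree = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        # Split on the first colon only; anything after it is the value.
        name, _, value = line.strip().partition(':')
        ttree.append({'name': name.strip(),
                      'value': value.strip(),
                      'level': tab_level(line)})
    return ttree


sample = "person:\n\taddress:\n\t\tstreet1: 123 Bar St\n\t\tstreet2:\n"
for record in lines_to_ttree(sample.splitlines()):
    print(record)
```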

Then feed the first-step output into the following function:

def ttree_to_json(ttree, level=0):
    result = {}
    for i in range(0, len(ttree)):
        cn = ttree[i]
        try:
            nn = ttree[i+1]
        except IndexError:
            # Last node: pretend the next one is above every real level
            nn = {'level': -1}

        # Edge cases
        if cn['level'] > level:
            continue
        if cn['level'] < level:
            return result

        # Recursion
        if nn['level'] == level:
            dict_insert_or_append(result, cn['name'], cn['value'])
        elif nn['level'] > level:
            rr = ttree_to_json(ttree[i+1:], level=nn['level'])
            dict_insert_or_append(result, cn['name'], rr)
        else:
            dict_insert_or_append(result, cn['name'], cn['value'])
            return result
    return result

where:

def dict_insert_or_append(adict, key, val):
    """Insert a value in dict at key if one does not exist
    Otherwise, convert value to list and append
    """
    if key in adict:
        if type(adict[key]) != list:
            adict[key] = [adict[key]]
        adict[key].append(val)
    else:
        adict[key] = val
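To see the two functions working together, here is a small end-to-end run. Both functions are repeated, lightly tidied, so the block runs on its own; the records list is a shortened, made-up version of the first-step output above.

```python
import json


def dict_insert_or_append(adict, key, val):
    """Insert val at key; if the key already exists, collect the values in a list."""
    if key in adict:
        if not isinstance(adict[key], list):
            adict[key] = [adict[key]]
        adict[key].append(val)
    else:
        adict[key] = val


def ttree_to_json(ttree, level=0):
    result = {}
    for i in range(len(ttree)):
        cn = ttree[i]
        # The node after the last one is treated as being above every real level.
        nn = ttree[i + 1] if i + 1 < len(ttree) else {'level': -1}

        if cn['level'] > level:
            continue          # belongs to a deeper call
        if cn['level'] < level:
            return result     # this block of siblings has ended

        if nn['level'] > level:
            # The next node is a child: recurse on the rest of the list.
            rr = ttree_to_json(ttree[i + 1:], level=nn['level'])
            dict_insert_or_append(result, cn['name'], rr)
        else:
            dict_insert_or_append(result, cn['name'], cn['value'])
            if nn['level'] < level:
                return result
    return result


records = [
    {'name': 'person', 'value': '', 'level': 0},
    {'name': 'address', 'value': '', 'level': 1},
    {'name': 'street1', 'value': '123 Bar St', 'level': 2},
    {'name': 'city', 'value': 'Madison', 'level': 2},
    {'name': 'web', 'value': '', 'level': 1},
    {'name': 'email', 'value': 'someone', 'level': 2},
]

print(json.dumps(ttree_to_json(records), indent=2))
```

Note that a name repeated among siblings ends up as a list of values under that name, which is what dict_insert_or_append is for.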

Source

2014-07-26 01:05:24 kalu

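Could you provide the code that translates the input into the 'first-step output'? Thanks. –

In case anyone is interested, I created a similar [C# implementation](http://stackoverflow.com/a/36998605/107625). –

Here is a [highly related question](http://stackoverflow.com/questions/38664465/creating-a-tree-deeply-nested-dict-with-lists-from-an-indented-text-file). Any chance you could help? – zelusp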

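First of all, don't use array and dict as variable names: dict is a built-in type and array is the name of a standard-library module, so reusing those names shadows them and can end in all sorts of confusion.

OK, so if I understand you correctly, you are given a tree in a text file, with parenthood indicated by indentation, and you want to recover the actual tree structure. Right?

Does the following look like a valid outline? I ask because I have trouble putting your current code into context.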

result = {}
last_indentation = 0
for line in f:  # f is the open input file
    (c, i) = parse(line)  # write parse() to return the content and the indentation
    if i == last_indentation:
        pass  # sibling to last
    elif i > last_indentation:
        pass  # child to last
    else:
        pass  # end of children, back to a higher level
    last_indentation = i

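OK, so your list holds the current parents, which is in fact right, but I would have its entries point to the dictionaries you have created rather than to the literal letters.

Here is a start: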

result = {}
parents = {}
last_indentation = 1  # start with 1 so 0 is the root of tree
parents[0] = result
for line in f:  # f is the open input file
    (c, i) = parse(line)  # write parse() to return the content and the indentation
    if i == last_indentation:
        new_el = {}
        parents[i-1][c] = new_el
        parents[i] = new_el
    elif i > last_indentation:
        pass  # child to last
    else:
        pass  # end of children, back to a higher level
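One way the outline above could be completed, offered as a rough sketch and not as the final word: instead of a parents dict keyed by level, it keeps a stack of (indent, dict) pairs, so the parents still point at the dictionaries themselves. The names parse_indented and collapse are hypothetical. The sketch assumes indentation is consistent (no mixing of tabs and spaces), and sibling lines with the same text overwrite one another.

```python
import json


def parse_indented(lines):
    """Build nested dicts from indented lines; each line maps to a dict of its children."""
    root = {}
    # Stack of (indent, children_dict). The -1 sentinel keeps the root
    # below every real line, so it is never popped.
    stack = [(-1, root)]
    for raw in lines:
        if not raw.strip():
            continue  # skip blank lines
        indent = len(raw) - len(raw.lstrip())
        # Drop siblings and deeper nodes until the top of the stack is the parent.
        while stack[-1][0] >= indent:
            stack.pop()
        children = {}
        stack[-1][1][raw.strip()] = children
        stack.append((indent, children))
    return root


def collapse(node):
    """Replace leaf-only children with a string (one leaf) or a list (several)."""
    if not node:
        return ""
    if all(not kids for kids in node.values()):
        leaves = list(node)
        return leaves[0] if len(leaves) == 1 else leaves
    return {key: collapse(kids) for key, kids in node.items()}


text = """apple
    colours
        red
        yellow
        green
    type
        granny smith
    price
        0.10
"""

print(json.dumps(collapse(parse_indented(text.splitlines()))))
```

The collapse step is only there to get closer to the shape asked for in the question. Values stay strings, so the price comes out as "0.10" and would need converting separately if a number is wanted.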

Source

2013-07-25 13:06:36 Nicolas78

Yes, that is exactly right. – Tomcat

OK, then let me add a few things... – Nicolas78

Thanks! I would be even happier if json.dumps accepted a format that was not a dict :P – Tomcat

The following code takes a block-indented file and converts it into an XML tree; this:

foo 
bar 
baz 
    ban 
    bal

...becomes:

<cmd>foo</cmd> 
<cmd>bar</cmd> 
<block> 
    <name>baz</name> 
    <cmd>ban</cmd> 
    <cmd>bal</cmd> 
</block>

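The basic approach is:

Set the indent to 0
For each line, get the indent
If > current, step down and save the current block/indent on a stack
If == current, append to the current block
If < current, pop from the stack until you reach the matching indent

So: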

from lxml import builder
C = builder.ElementMaker()

def indent(line):
    strip = line.lstrip()
    return len(line) - len(strip), strip

def parse_blockcfg(data):
    top = current_block = C.config()
    stack = []
    current_indent = 0

    lines = data.split('\n')
    while lines:
        line = lines.pop(0)
        i, line = indent(line)

        if i==current_indent:
            pass

        elif i > current_indent:
            # we've gone down a level, convert the <cmd> to a block
            # and then save the current ident and block to the stack
            prev.tag = 'block'
            prev.append(C.name(prev.text))
            prev.text = None
            stack.insert(0, (current_indent, current_block))
            current_indent = i
            current_block = prev

        elif i < current_indent:
            # we've gone up one or more levels, pop the stack
            # until we find out which level and return to it
            found = False
            while stack:
                parent_indent, parent_block = stack.pop(0)
                if parent_indent==i:
                    found = True
                    break
            if not found:
                raise Exception('indent not found in parent stack')
            current_indent = i
            current_block = parent_block

        prev = C.cmd(line)
        current_block.append(prev)

    return top
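If lxml is not available, the same stack approach can be sketched with the standard library's xml.etree.ElementTree. This is a port for illustration, not the code above: the name parse_blockcfg_etree is hypothetical, blank lines are skipped, and an indented first line is rejected explicitly.

```python
import xml.etree.ElementTree as ET


def parse_blockcfg_etree(data):
    """Convert block-indented text into an ElementTree rooted at <config>."""
    top = current_block = ET.Element('config')
    stack = []            # (indent, block) pairs for the enclosing blocks
    current_indent = 0
    prev = None           # the element created for the previous line

    for raw in data.splitlines():
        if not raw.strip():
            continue
        i = len(raw) - len(raw.lstrip())

        if i > current_indent:
            if prev is None:
                raise ValueError('first line must not be indented')
            # Gone down a level: turn the previous <cmd> into a <block>
            # whose <name> child carries the old text.
            prev.tag = 'block'
            ET.SubElement(prev, 'name').text = prev.text
            prev.text = None
            stack.append((current_indent, current_block))
            current_indent, current_block = i, prev
        elif i < current_indent:
            # Gone up one or more levels: pop until the indent matches.
            while stack:
                parent_indent, parent_block = stack.pop()
                if parent_indent == i:
                    break
            else:
                raise ValueError('indent not found in parent stack')
            current_indent, current_block = i, parent_block

        prev = ET.SubElement(current_block, 'cmd')
        prev.text = raw.strip()

    return top


tree = parse_blockcfg_etree("foo\nbar\nbaz\n    ban\n    bal\n")
print(ET.tostring(tree, encoding='unicode'))
```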

Source

2014-03-14 15:40:04 Realist
