2017-06-15 49 views
0

我有一个嵌套字典,其中列表作为值,格式如下,足够大以至于递归失败。使用一对多关系迭代嵌套字典

aDict = {"R": [ 
      {"A": [ 
       {"B": [ 
        "C", "D" 
       ]} 
      ]}, 
      {"E": [ 
       {"F": [ 
        {"G": ["H"]}, "I" 
       ]} 
      ]} 
    ]} 

我需要遍历字典来添加和更新值;然而,我目前在迭代树时遇到了问题,最终陷入了无限循环。除集合外,我无法在标准库之外使用软件包。 :(

我当前的代码假设父的说法已经在嵌套的字典,而是孩子的说法是没有。

def build_tree(aDict, parent, child, default=None): 
    """""" 
    stack = [iter(aDict.items())] 
    while stack: 
     for k, v in stack[-1]: # loop through keys and values 
      if isinstance(v, dict): 
       stack.append(iter(v.items())) # if v is type dict, append it to stack 
       break 
      elif isinstance(v, list): 
       for elem in v: # if v is list, loop through elements of list 
        if isinstance(v, dict): 
         stack.append(iter(v.items())) 
        elif parent == elem: 
         a_dict = {parent: [child]} # replace elem with a_dict 
         aDict[k].remove(parent) 
         aDict[k].append(a_dict) 
         return default 
        else: 
         pass 
       break 
      elif parent in k: 
       v.append(child) # add child to values list for parent 
       return default 
      elif parent in v: # assumes v is list type 
       a_dict = {parent: [child]} # replace v with a_dict 
       aDict[k].remove(parent) 
       aDict[k].append(a_dict) 
       return default 
    else: 
     stack.pop() 
    return default 

,如果下面的代码被注释掉的功能不进入无限循环,但由于失败列表中嵌套字典的存在。提前

elif isinstance(v, list): 
    for elem in v: # if v is list, loop through elements of list 
     if isinstance(v, dict): 
      stack.append(iter(v.items())) 
     elif parent == elem: 
      a_dict = {parent: [child]} # replace elem with a_dict 
      aDict[k].remove(parent) 
      aDict[k].append(a_dict) 
      return default 
     else: 
      pass 
    break 

谢谢!

+0

你想达到什么目的?这似乎是一个漫长的方式来遍历结构。 – zwer

+0

我不反对。 1)我需要遍历嵌套的字典 2)添加值以列出键是否已经存在 3)如果存在新的关系,例如Z有新的嵌套值,即{Z: [X,Y]} – IMLD

+0

做什么?只是“扁平化”结构(即读取每个非字典元素)? – zwer

回答

1

此功能的非递归遵循在字典/目录结构的路径:

def by_path(data, path): 
    """ 
    data is the dict of lists list structure: {key: [value,...]} where values are same or scalars. 
    path is a sequence of keys in the dictionaries. 
    """ 
    result = None 
    level = [data] # We always pretend to work with a list of dicts. 
    traversed = [] # Purely for error reporting. 
    for key in path: 
     traversed.append(key) 
     next_dicts = [d for d in level if isinstance(d, dict) and key in d] 
     if not next_dicts: 
      raise ValueError('Failed to make next step; traversed so far %r' % traversed) 
     if len(next_dicts) > 1: 
      raise ValueError('Duplicate keys at %r' % traversed) 
     target_dict = next_dicts[0] 
     level = target_dict[key] # Guaranteed to work. 
    # path exhausted. 
    return level # A list/scalar at the end of the path 

它的工作原理就像这样:

>>> by_path(aDict, ['R', 'A', 'B']) 
['C', 'D'] 
>>> by_path(aDict, ['R', 'A', 'wrong', 'path']) 
(traceback elided) 
ValueError: Failed to make next step; traversed so far ['R', 'A', 'wrong'] 

我希望这有助于。

当然,如果你经常遍历相同的长子路径,它可能是值得高速缓存的。如果您更新了缓存,则必须使缓存无效;除非你真的看到高CPU负载并且profiler说它确实是遍历的,否则不要这样做。

2

你可以写一个简单的递归遍历˚F结:

import sys 

# for Python 3.x str is iterable, too, so we'll have to check for cross-version use 
isPY3 = sys.version_info.major > 2 

def traverse(data, level=0): 
    if hasattr(data, "__iter__") and not (isPY3 and isinstance(data, str)): 
     if isinstance(data, dict): # maybe check for MutableMapping, too? 
      for k in data: 
       print("L{}: {}".format(level, k)) # dictionary key 
       traverse(data[k], level + 1) 
     else: 
      for element in data: 
       traverse(element, level + 1) 
    elif data: 
     print("L{}: {}".format(level, data)) # any other value 

将通过您的iterables递归遍历加上保持它目前在平的轨迹(你可以通过其他的东西,以及像父迭代器等),这将打印出(与你的改变数据):

L0: R 
L2: A 
L4: B 
L6: C 
L6: D 
L2: E 
L4: F 
L6: G 
L8: H 
L6: I 

但是你可以做你想要的功能范围内(可以通过删除PY3检查进一步简化它)什么的。然而,对于非常非常深的树,你会达到Python的递归极限 - 但是如果你有这样深的树,你应该重新考虑你的策略/数据结构,因为肯定有更好的方式来表示相同的数据(除非你“重新尝试比无限深的树映射分形)...