Looping through data to build nested dictionaries

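I have a fairly hard problem that I just can't solve. The idea is to loop through a block of data and detect any indentation (always spaces). Whenever a line is indented further than the previous one, for example by 4 more spaces, the earlier line should become the key of a dictionary and the lines that follow should be appended as its values.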

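If there is yet another indent, a new dictionary with its own key and values should be created. This should happen recursively until the end of the data. To make this easier to understand, I made an example: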

Chassis 1:
    Servers:
        Server 1/1:
            Equipped Product Name: EEE UCS B200 M3
            Equiped PID: e63-samp-33
            Equipped VID: V01
            Acknowledged Cores: 16
            Acknowledged Adapters: 1
    PSU 1:
        Presence: Equipped
        VID: V00
        HW Revision: 0

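The idea is to be able to get any part of the data back in the form of a dictionary. dictionary.get("Chassis 1:") should return all of the data, and dictionary.get("Servers") should return everything indented deeper than the "Servers" line. dictionary.get("PSU 1:") should give {"PSU 1:": "Presence: Equipped", "VID: 100", "HW Revision: 0"}, and so on. I have drawn a small diagram to illustrate this; each color is a separate dictionary.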
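
One way to read the lookup requirement above: a key such as "Servers" can sit at any depth, so a plain dict.get on the outermost dictionary would not find it. The following is a minimal sketch of a recursive search, assuming the data has already been parsed into nested dicts with key/value leaves. The names find_key and sample are hypothetical and only serve the illustration.

```python
def find_key(nested, wanted):
    """Return the value stored under `wanted` at any depth, or None."""
    if not isinstance(nested, dict):
        return None
    if wanted in nested:
        return nested[wanted]
    # Not at this level: search every child dictionary in turn.
    for value in nested.values():
        found = find_key(value, wanted)
        if found is not None:
            return found
    return None


# Hypothetical, shortened version of the example data, already nested.
sample = {
    "Chassis 1": {
        "Servers": {"Server 1/1": {"Equipped VID": "V01"}},
        "PSU 1": {"Presence": "Equipped", "VID": "V00", "HW Revision": "0"},
    }
}

print(find_key(sample, "PSU 1"))
```

This returns the first match found, so it only behaves predictably when key names are unique across the whole structure.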

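When the indentation becomes smaller again, for example going from 8 back to 4 spaces, the data should be appended to the dictionary that belongs to the less-indented level.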

I have tried to put this into code, but it doesn't come anywhere close to what I want.

for item in Array:
    regexpatt = re.search(":$", item)
    if regexpatt:
        keyFound = True
        break

if not keyFound:
    return Array

#Verify if we still have lines with spaces
spaceFound = False
for item in Array:
    if item != item.lstrip():
        spaceFound = True
        break

if not spaceFound:
    return Array

keyFound = False
key=""
counter = -1
for item in Array:
    counter += 1
    valueTrim = item.lstrip()
    valueL = len(item)
    valueTrimL = len(valueTrim)
    diff = (valueL - valueTrimL)
    nextSame = False
    if item in Array:
        nextValue = Array[counter]
        nextDiff = (len(nextValue) - len(nextValue.lstrip()))
        if diff == nextDiff:
            nextSame = True


    if diff == 0 and valueTrim != "" and nextSame is True:
        match = re.search(":$", item)
        if match:
            key = item
            newArray[key] = []
            deptDetermine = True
            keyFound = True
    elif diff == 0 and valueTrim != "" and keyFound is False:
        newArray["0"].append(item)
    elif valueTrim != "":
        if depthDetermine:
            depth = diff
            deptDetermine = False
        #newValue = item[-valueL +depth]
        item = item.lstrip().rstrip()
        newArray[key].append(item)

for item in newArray:
    if item != "0":
        newArray[key] = newArray[key]

return newArray

The result should look like this, for example:

{
    "Chassis 1": {
        "PSU 1": {
            "HW Revision: 0",
            "Presence: Equipped",
            "VID: V00"
        },
        "Servers": {
            "Server 1/1": {
                "Acknowledged Adapters: 1",
                "Acknowledged Cores: 16",
                "Equiped PID: e63-samp-33",
                "Equipped Product Name: EEE UCS B200 M3",
                "Equipped VID: V01"
            }
        }
    }
}

I hope this explains the concept well enough.


2014-05-05 Yenthe

So what does your code do, and where precisely does it fall short? – jonrsharpe

You need to implement a pushdown automaton. –

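@jonrsharpe It builds a dictionary of keys and lists, but not in the right format or order, so I chose not to include the 'result' because it has quite a few flaws.. @JoelCornett Any examples or an answer? Feel free to post something! – Yenthe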
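
The pushdown-automaton suggestion in the comments above amounts to keeping an explicit stack of the scopes that are currently open. Below is a minimal sketch of that idea, assuming every line has the form "key:" or "key: value" and is indented with spaces only. The function name parse_indented and the shortened sample text are assumptions made for this illustration, not code from the thread.

```python
def parse_indented(lines):
    """Build nested dicts from 'key: value' lines indented with spaces."""
    root = {}
    # Each stack entry is (indent of the header line, dict that header opened).
    stack = [(-1, root)]
    for raw in lines:
        if not raw.strip():
            continue  # skip blank lines
        indent = len(raw) - len(raw.lstrip(" "))
        key, _, value = raw.strip().partition(":")
        key, value = key.strip(), value.strip()
        # Close every scope that is at the same depth as this line or deeper.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if value:
            parent[key] = value  # leaf line: "key: value"
        else:
            parent[key] = {}  # header line: opens a new scope
            stack.append((indent, parent[key]))
    return root


text = """\
Chassis 1:
    Servers:
        Server 1/1:
            Equipped VID: V01
            Acknowledged Cores: 16
    PSU 1:
        Presence: Equipped
        VID: V00
"""
result = parse_indented(text.splitlines())
print(result["Chassis 1"]["PSU 1"])
```

Because scopes are compared by indent width rather than by a fixed step, the sketch does not care whether each level adds 1, 4 or 8 spaces, only that deeper lines are indented further than their header.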

This should give you the nested structure you want.

If you also want every nested dictionary to be reachable directly from the root, uncomment the "if ... is not root" parts.

import json


def parse(data):
    root = {}
    currentDict = root
    prevLevel = -1
    parents = []  # stack of (dictionary, indentation level) pairs
    for line in data:
        if line.strip() == '':
            continue
        # indentation level = number of leading spaces
        level = len(line) - len(line.lstrip(" "))
        key, value = [val.strip() for val in line.split(':', 1)]

        if level > prevLevel and not len(value):
            # deeper header line: open a new nested dictionary
            currentDict[key] = {}
            # if currentDict is not root:
            #     root[key] = currentDict[key]
            parents.append((currentDict, level))
            currentDict = currentDict[key]
            prevLevel = level
        elif level < prevLevel and not len(value):
            # shallower header line: walk back up to the matching parent
            parentDict, parentLevel = parents.pop()
            while parentLevel != level:
                if not parents:
                    return root
                parentDict, parentLevel = parents.pop()
            parentDict[key] = {}
            parents.append((parentDict, level))
            # if parentDict is not root:
            #     root[key] = parentDict[key]
            currentDict = parentDict[key]
            prevLevel = level
        else:
            # plain "key: value" line
            currentDict[key] = value
    return root


with open('data.txt', 'r') as f:
    data = parse(f)
    # pretty-print the nested dict
    print(json.dumps(data, sort_keys=True, indent=4))

Output:

{
    "Chassis 1": {
        "PSU 1": {
            "HW Revision": "0",
            "Presence": "Equipped",
            "VID": "V00"
        },
        "Servers": {
            "Server 1/1": {
                "Acknowledged Adapters": "1",
                "Acknowledged Cores": "16",
                "Equiped PID": "e63-samp-33",
                "Equipped Product Name": "EEE UCS B200 M3",
                "Equipped VID": "V01"
            }
        }
    }
}


2014-05-05 17:37:22 M4rtini

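Great solution! If I may nitpick, I wouldn't use 'str.count' to measure 'level'; just use 'level = len(line) - len(line.lstrip(" "))'. The code also breaks if there is a ':' inside a key or a value, but that is not surprising (most parsers require that the delimiter not appear in any field), so that is probably more of a note for the OP. –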

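I believe what Adam is referring to is exactly why it crashes for me? I get the following error: "key, value = [val.strip() for val in line.split(':')] ValueError: need more than 1 value to unpack". But if this works, it may be just what I need! Any clue why I get this error? – Yenthe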

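@AdamSmith Thanks, I made the edit you suggested. It should be slightly more efficient. I also added 'line.split(':', 1)', so it won't break if there is more than one ':' (though the result may not be what you want). – M4rtini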
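
To make the splitting discussion above concrete: unpacking the result of split into two names fails whenever a line contains no ':' at all, and limiting the split to one cut keeps extra colons inside the value. The sketch below uses str.partition as one possible alternative, since it always returns three parts. The helper name split_key_value and the sample lines are hypothetical.

```python
def split_key_value(line):
    """Split on the first ':' only; a missing ':' yields an empty value."""
    key, _sep, value = line.partition(":")
    return key.strip(), value.strip()


print(split_key_value("Chassis 1:"))           # ('Chassis 1', '')
print(split_key_value("Note: time is 12:30"))  # ('Note', 'time is 12:30')
print(split_key_value("no colon here"))        # ('no colon here', '')

# For comparison, two-name unpacking of split() raises on a line without ':'.
try:
    key, value = "no colon here".split(":", 1)
except ValueError as exc:
    print("split failed:", exc)
```

Whether a line without a colon should be treated as a key with an empty value, skipped, or reported as an error depends on the data; the sketch simply picks the first option.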

This data format really looks like YAML. In case someone stumbles upon this and is fine with a library solution:

import yaml
import pprint

s = """
Chassis 1:
    Servers:
        Server 1/1:
            Equipped Product Name: EEE UCS B200 M3
            Equiped PID: e63-samp-33
            Equipped VID: V01
            Acknowledged Cores: 16
            Acknowledged Adapters: 1
    PSU 1:
        Presence: Equipped
        VID: V00
        HW Revision: 0
"""

# safe_load needs no explicit Loader argument and only builds plain Python objects
d = yaml.safe_load(s)
pprint.pprint(d)

The output is:

{'Chassis 1': {'PSU 1': {'HW Revision': 0,
                         'Presence': 'Equipped',
                         'VID': 'V00'},
               'Servers': {'Server 1/1': {'Acknowledged Adapters': 1,
                                          'Acknowledged Cores': 16,
                                          'Equiped PID': 'e63-samp-33',
                                          'Equipped Product Name': 'EEE UCS B200 M3',
                                          'Equipped VID': 'V01'}}}}


2014-05-05 18:49:14 moooeeeep

Sadly I can't install or use the yaml module; I would have to request it. This will be my second fallback, since it looks nice and simple. – Yenthe

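@Yenthe I can't imagine what requirement would prevent you from using this library. Instead of installing it, you could download the source into your code base (and add the package to your PYTHONPATH). – moooeeeep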

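@mooeeeep Sadly I work at a very large, restricted company. Unfortunately I have to request permission from the top-level admins. That takes at least a day, if it gets approved at all.. – Yenthe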
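
The suggestion above of shipping the library source alongside your own code can also be done from inside the script, without touching the PYTHONPATH environment variable. This is a rough sketch under stated assumptions: the directory name "vendor" and the helper add_vendor_dir are hypothetical, and the directory is assumed to hold a copy of the library's pure-Python package.

```python
import os
import sys


def add_vendor_dir(path):
    """Put a directory of bundled packages at the front of the import path."""
    path = os.path.abspath(path)
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


# Hypothetical layout: ./vendor/yaml/ contains the bundled package.
vendor = add_vendor_dir("vendor")

try:
    import yaml
except ImportError:
    yaml = None  # nothing bundled or installed; fall back to a hand-written parser
```

Whether bundling third-party code is permitted is a policy question for the environment in question; the sketch only shows the mechanics.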
