2016-05-22 81 views
-1

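Writing JSON to a file - wrong format?

In Python, I'm trying to take an XML file, process it, and then output the data as JSON. The XML processing works fine, but I can't get the JSON formatted correctly. The output file looks like a list containing dictionaries, which makes sense, because that is what the code actually builds. How can I make this a proper JSON file?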

import json
from xml.etree import ElementTree as ET

filename = 'data.json'

d = []

# data holds the XML document as a string
for elem in ET.fromstring(data).findall('.//table/row'):
    field1 = elem.get('field1')
    field2 = elem.get('field2')
    field3 = elem.get('field3')
    field4 = elem.get('field4')
    row = {'field1': field1,
           'field2': field2,
           'field3': field3,
           'field4': field4}
    d.append(row)

# d is a list, so it is written as a single JSON array
with open(filename, 'w') as f_out:
    json.dump(d, f_out)

The output file looks like this:

[{"field1": "field1", "field2": "field2", "field3": "field3", "field4": "field4"}, ... {"field1": "field1", "field2": "field2", "field3": "field3", "field4": "field4"}] 

whereas I want it to look like this:

{"field1": "field1", "field2": "field2", "field3": "field3", "field4": "field4"}, ... {"field1": "field1", "field2": "field2", "field3": "field3", "field4": "field4"} 
+0

Try adding indent when you call json.dump like that – glls

+0

Your code looks fine. I suspect the output already is a "proper JSON file". Please show the output you get from this program, and the output you expect.

+0

You are writing 'd', which is a list, and a list is represented in JSON as '[item1, item2, item3]'.
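
To illustrate the comments above, here is a minimal sketch using made-up rows (the `rows` data and the variable names are assumptions, not taken from the question). It shows that a Python list serializes to a JSON array, which is a valid JSON document, while the same objects joined by commas without the enclosing brackets do not parse as a single JSON document.

```python
import json

# Hypothetical sample rows standing in for the parsed XML attributes.
rows = [{"field1": "a", "field2": "b"}, {"field1": "1", "field2": "2"}]

# A list is serialized as a JSON array, which is why the file starts with '['.
as_array = json.dumps(rows)
assert json.loads(as_array) == rows  # round-trips as one valid JSON document

# The same objects joined by commas, without brackets, are not one JSON document.
without_brackets = ", ".join(json.dumps(row) for row in rows)
try:
    json.loads(without_brackets)
    parses = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    parses = False
```

So the bracketed output is the well-formed one; the bracket-free layout only makes sense if whatever consumes the file expects a sequence of separate objects.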

Answers

1

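According to the AWS docs, the Redshift COPY command expects a sequence of JSON objects in its input file, along with an optional JSONPaths file.

To create such a sequence, call json.dump() multiple times, once per object: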

from xml.etree import ElementTree as ET
import json


data = '''
<root><table>
    <row field1="a" field2="b" field3="c" field4="d"/>
    <row field1="1" field2="2" field3="3" field4="4"/>
</table></root>'''

filename = 'data.json'
with open(filename, 'w') as f_out:
    for elem in ET.fromstring(data).findall('.//table/row'):
        field1 = elem.get('field1')
        field2 = elem.get('field2')
        field3 = elem.get('field3')
        field4 = elem.get('field4')
        row = {'field1': field1,
               'field2': field2,
               'field3': field3,
               'field4': field4}
        # Write each row as its own JSON object, one per line
        json.dump(row, f_out)
        f_out.write('\n')

Result (key order may differ depending on the Python version):

{"field2": "b", "field3": "c", "field1": "a", "field4": "d"} 
{"field2": "2", "field3": "3", "field1": "1", "field4": "4"} 
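
A file with one JSON object per line cannot be loaded with a single json.load() call, because it is a sequence of documents rather than one document. The sketch below shows one way to write and read it back line by line. The helper names `write_json_lines` and `read_json_lines` are hypothetical, and the temporary path is only there to keep the example self-contained.

```python
import json
import os
import tempfile


def write_json_lines(records, path):
    """Write each record as one JSON object per line."""
    with open(path, "w") as f_out:
        for record in records:
            f_out.write(json.dumps(record) + "\n")


def read_json_lines(path):
    """Parse a file that holds one JSON object per line, skipping blank lines."""
    with open(path) as f_in:
        return [json.loads(line) for line in f_in if line.strip()]


# Sample records mirroring the two rows in the answer's XML.
records = [
    {"field1": "a", "field2": "b", "field3": "c", "field4": "d"},
    {"field1": "1", "field2": "2", "field3": "3", "field4": "4"},
]

path = os.path.join(tempfile.mkdtemp(), "data.json")
write_json_lines(records, path)
round_tripped = read_json_lines(path)
```

Reading line by line works only as long as each object stays on a single line, which is the case with json.dump()'s default settings (no indent).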
0

json.dump() has an indent parameter and a separators parameter, which you may want to use if your JSON file should be human-readable.

Example:

json.dump({'1': 2, '3': 4}, f_out, indent=4, separators=(',', ': ')) 

Result:

{ 
    "1": 2, 
    "3": 4 
} 

See https://docs.python.org/2/library/json.html#basic-usage
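
One caveat worth sketching: with indent set, a single object spans several lines, so it no longer fits a one-object-per-line layout. The example below compares the pretty-printed form with a compact form. The `record` value matches the example above; `sort_keys=True` is added here only to make the output order deterministic and is not part of the original answer.

```python
import json

record = {"1": 2, "3": 4}

# Human-readable: one key per line, indented by four spaces.
pretty = json.dumps(record, indent=4, separators=(",", ": "), sort_keys=True)

# Compact: no whitespace at all, so the whole object stays on one line.
compact = json.dumps(record, separators=(",", ":"), sort_keys=True)

print(pretty)
print(compact)
```

The pretty form suits files meant for people to read; the compact form suits files that another program reads one line at a time.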