2016-06-08 43 views
2

Extract JSON fields and write them to a CSV in Python

I have a very large JSON file with many fields. I only want to extract some of them and write them to a CSV.

Here is my code:

#!/usr/bin/python3 
# -*- coding: utf-8 -*- 

import json 

import csv 

data_file = open("book_data.json", "r") 
values = json.load(data_file) 
data_file.close() 

with open("book_data.csv", "wb") as f: 
    wr = csv.writer(f) 
    for data in values: 
     value = data["identifier"] 
     value = data["authors"] 
     for key, value in data.iteritems(): 
       wr.writerow([key, value]) 

It gives me this error:

File "json_to_csv.py", line 22, in <module> 
wr.writerow([key, value]) 
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 8: ordinal not in range(128) 

But I declared the UTF-8 encoding at the top of the file, so I don't know what is wrong.

Thanks

+0

Which line is the error on? – pinturic

+1

File "json_to_csv.py", line 22, in <module> wr.writerow([key, value]). I'll add that to the question. –

+1

Try https://github.com/jdunck/python-unicodecsv – ravigadila

Answers

3

You need to encode the data before writing it:

wr.writerow([key.encode("utf-8"), value.encode("utf-8")]) 

The difference comes down to this:

In [8]: print u'\u2019'.encode("utf-8") 
’ 

In [9]: print str(u'\u2019') 
--------------------------------------------------------------------------- 
UnicodeEncodeError      Traceback (most recent call last) 
<ipython-input-9-4e3ad09ee31b> in <module>() 
----> 1 print str(u'\u2019') 

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 0: ordinal not in range(128) 
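
The same point as a minimal sketch, written so that it also runs on Python 3 (the session above is Python 2, where `str()` on a unicode object encodes implicitly with the ASCII codec). Note that the `# -*- coding: utf-8 -*-` line only declares the encoding of the source file itself; it does not change how strings are encoded when they are written out. The variable names here are illustrative only.

```python
# The right single quotation mark reported in the traceback.
text = u"\u2019"

# Encoding explicitly to UTF-8 succeeds and yields three bytes.
utf8_bytes = text.encode("utf-8")

# Encoding with the ASCII codec fails; this is what Python 2 attempts
# implicitly when a unicode object is passed where a byte string is expected.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```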

If your values are a mix of strings and lists, you can use isinstance to check what you have, and if it is a list, iterate over it and encode each item:

with open("book_data.csv", "wb") as f: 
    wr = csv.writer(f) 
    for data in values: 
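        for key, value in data.iteritems():
            # Join list values into one comma-separated cell; encode plain strings directly.
            if isinstance(value, list):
                cell = ",".join(v.encode("utf-8") for v in value)
            else:
                cell = value.encode("utf-8")
            wr.writerow([key, cell])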
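
For readers on Python 3, a rough equivalent might look like the sketch below. It assumes Python 3, where the csv module works with text rather than bytes, so no manual `.encode()` calls are needed. The helper `to_cell` and the sample record are hypothetical stand-ins and are not taken from the question; the sketch writes to an in-memory buffer to stay self-contained.

```python
import csv
import io


def to_cell(value):
    """Return one CSV cell: join lists with commas, stringify anything else."""
    if isinstance(value, list):
        return ",".join(str(v) for v in value)
    return str(value)


# Hypothetical sample standing in for the parsed JSON (`values` in the answer).
values = [{"title": "Walter\u2019s book", "tags": ["a", "b"]}]

# In-memory text buffer; newline="" leaves the csv module's line endings untouched.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
for data in values:
    for key, value in data.items():
        writer.writerow([key, to_cell(value)])

csv_text = buffer.getvalue()
```

To write to disk instead, the buffer could be replaced with `open("book_data.csv", "w", newline="", encoding="utf-8")`, which lets Python handle the UTF-8 encoding.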

To write just the three columns creator, contributor and identifier, pull the data out by key:

import csv 

with open("book_data.csv", "wb") as f: 
    wr = csv.writer(f) 
    for dct in values: 
        authors = dct["authors"]
        # Each author field is a list of names; join it into a single cell.
        wr.writerow((",".join(authors["creator"]).encode("utf-8"),
                     ",".join(authors["contributor"]).encode("utf-8"),
                     dct["identifier"].encode("utf-8")))
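
A Python 3 sketch of the same three-column output, under stated assumptions: the record is shaped like the "authors" value quoted in the comments, the identifier value is a made-up placeholder, and `.get()` with an empty-list default is an added safeguard for records missing one of the keys (the answer does not say whether that can happen).

```python
import csv
import io

# Sample record shaped like the one quoted in the comments; the identifier
# value is a hypothetical placeholder.
values = [
    {
        "identifier": "id-001",
        "authors": {"contributor": ["FORBES, Walter."], "creator": ["AA"]},
    }
]

# In-memory buffer keeps the sketch self-contained; a real run would use
# open("book_data.csv", "w", newline="", encoding="utf-8") instead.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
for record in values:
    authors = record["authors"]
    writer.writerow([
        ",".join(authors.get("creator", [])),
        ",".join(authors.get("contributor", [])),
        record["identifier"],
    ])

row_text = buffer.getvalue()
```

Because the contributor name contains a comma, the csv module quotes that cell automatically, so the three columns stay intact.
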
+0

Thanks! That worked, but now it gives me: File "json_to_csv.py", line 22, in <module> wr.writerow([key.encode("utf-8"), value.encode("utf-8")]) AttributeError: 'list' object has no attribute 'encode' –

+0

Are some of your values lists? –

+0

In my JSON I have values like this: "authors": {"contributor": ["FORBES, Walter."], "creator": ["AA"]} –
