2016-01-15

Nested JSON to CSV conversion

I am receiving several JSON responses from a REST API, and the format of my responses is as follows:

{ 
"headings": [ 
    "ACCOUNT_ID", 
    "date", 
    "FB Likes" 
], 
"rows": [ 
    [ 
    "My Account", 
    "1435708800000", 
    117 
    ], 
    [ 
    "My Account", 
    "1435795200000", 
    99 
    ], 
    [ 
    "My Account", 
    "1435708800000", 
    7 
    ] 
] 
} 

Here the columns are ACCOUNT_ID, date, and FB_Likes. I am trying to convert this to CSV, and I have tried many different iterations, but without success.

Please help me with this.

One of the scripts I have used is:

import csv
import json

with open('Account_Insights_12Jan.json') as fi:
    data = json.load(fi)

json_array = data

# collect the union of keys across all items
columns = set()
for item in json_array:
    columns.update(set(item))

# writing the data to csv
with open('Test_14Jan.csv', 'w', newline='') as fo:
    writer = csv.writer(fo)

    writer.writerow(list(columns))
    for item in json_array:
        row = []
        for c in columns:
            if c in item:
                row.append(str(item[c]))
            else:
                row.append('')
        writer.writerow(row)

I keep receiving errors from it (I copied it from somewhere); please explain how to do the conversion.
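For this first response shape, where the headings and rows already line up, the csv module alone can write them out directly. A minimal sketch (the filenames are just placeholders, and the response is inlined here for illustration):

```python
import csv
import json

# Response shape: {"headings": [...], "rows": [[...], ...]}
data = json.loads("""
{
  "headings": ["ACCOUNT_ID", "date", "FB Likes"],
  "rows": [
    ["My Account", "1435708800000", 117],
    ["My Account", "1435795200000", 99],
    ["My Account", "1435708800000", 7]
  ]
}
""")

with open('Test_14Jan.csv', 'w', newline='') as fo:
    writer = csv.writer(fo)
    writer.writerow(data['headings'])   # header row comes straight from "headings"
    writer.writerows(data['rows'])      # one CSV row per inner list in "rows"
```

This sidesteps the column-collection loop entirely, since the API already supplies the header order.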

Hi Again

{ 
"headings": [ 
"POST_ ID", 
"POST_COMMENT_COUNT" 
], 
"rows": [ 
[ 
    { 
    "postId": 188365573, 
    "messageId": 198365562, 
    "accountId": 214, 
    "messageType": 2, 
    "channelType": "TWITTER", 
    "accountType": "TWITTER", 
    "taxonomy": { 
     "campaignId": "2521_4", 
     "clientCustomProperties": { 
     "PromotionChannelAbbreviation": [ 
      "3tw" 
     ], 
     "PromotionChannels": [ 
      "Twitter" 
     ], 
     "ContentOwner": [ 
      "Audit" 
     ], 
     "Location": [ 
      "us" 
     ], 
     "Sub_Category": [ 
      "dbriefs" 
     ], 
     "ContentOwnerAbbreviation": [ 
      "aud" 
     ], 
     "PrimaryPurpose_Outcome": [ 
      "Engagement" 
     ], 
     "PrimaryPurposeOutcomeAbbv": [ 
      "eng" 
     ] 
     }, 
     "partnerCustomProperties": {}, 
     "tags": [], 
     "urlShortnerDomain": "2721_spr.ly" 
    }, 
    "approval": { 
     "approvalOption": "NONE", 
     "comment": "" 
    }, 
    "status": "SENT", 
    "createdDate": 1433331585000, 
    "scheduleDate": 1435783440000, 
    "version": 4, 
    "deleted": false, 
    "publishedDate": 1435783441000, 
    "statusID": "6163465412728176", 
    "permalink": "https://twitter.com/Acctg/status/916346541272498176", 
    "additional": { 
     "links": [] 
    } 
    }, 
    0 
], 
[ 
    { 
    "postId": 999145171, 
    "messageId": 109145169, 
    "accountId": 21388, 
    "messageType": 2, 
    "channelType": "TWITTER", 
    "accountType": "TWITTER", 
    "taxonomy": { 
     "campaignId": "2521_4", 
     "clientCustomProperties": { 
     "PromotionChannelAbbreviation": [ 
      "3tw" 
     ], 
     "Eminence_Registry_Number": [ 
      "1000159" 
     ], 
     "PromotionChannels": [ 
      "Twitter" 
     ], 
     "ContentOwner": [ 
      "Ctr. Health Solutions" 
     ], 
     "Location": [ 
      "us" 
     ], 
     "Sub_Category": [ 
      "fraud" 
     ], 
     "ContentOwnerAbbreviation": [ 
      "chs" 
     ], 
     "PrimaryPurpose_Outcome": [ 
      "Awareness" 
     ], 
     "PrimaryPurposeOutcomeAbbv": [ 
      "awa" 
     ] 
     }, 
     "partnerCustomProperties": {}, 
     "tags": [], 
     "urlShortnerDomain": "2521_spr.ly" 
    }, 
    "approval": { 
     "approvalOption": "NONE", 
     "comment": "" 
    }, 
    "status": "SENT", 
    "createdDate": 1434983660000, 
    "scheduleDate": 1435753800000, 
    "version": 4, 
    "deleted": false, 
    "publishedDate": 1435753801000, 
    "statusID": "616222222198407168", 
    "permalink": "https://twitter.com/Health/status/6162222221984070968", 
    "additional": { 
     "links": [] 
    } 
    }, 
    0 
    ] 
] 
} 

Please consider this JSON response. Thanks again for all the help, you are a savior!

The output should look like the following. This is a sample output; since there are many columns, I have only included a few of them. My bad, I do not know how to share an Excel output.

postId,messageId,accountId,messageType,accountType,channelType 
188365573,198365562,214,2,TWITTER,TWITTER 

999145171,109145169,21388,2,TWITTER,TWITTER 

The code in progress is:

import csv

# data1 is the parsed JSON response; header is the list of column names
csvdata = open('Data_table2.csv', 'w', newline='')
csvwriter = csv.writer(csvdata, delimiter=',')
csvwriter.writerow(header)

for i in range(0, 70):
    csvwriter.writerow(data1["rows"][i][0].values())

csvdata.close()

But it did not work, because of the nesting. Also, in some responses there are headings that need to be checked for: if a heading is not there, a new one should be created for it.
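To sketch what I mean (my own attempt, assuming each row is a `[post_dict, count]` pair as in the response above, with only a couple of sample fields shown): flatten each post dict into dotted keys, take the union of all keys as the header, and let `csv.DictWriter` write blanks for headings a given row is missing:

```python
import csv

def flatten(d, parent=''):
    """Flatten nested dicts into one level, joining keys with '.'."""
    out = {}
    for k, v in d.items():
        key = f'{parent}.{k}' if parent else k
        if isinstance(v, dict):
            out.update(flatten(v, key))
        else:
            out[key] = v
    return out

# Two sample rows shaped like the response: [post_dict, comment_count]
rows = [
    [{'postId': 188365573, 'taxonomy': {'campaignId': '2521_4'}}, 0],
    [{'postId': 999145171, 'status': 'SENT',
      'taxonomy': {'campaignId': '2521_4', 'tags': []}}, 0],
]

flat_rows = [flatten(post) for post, _count in rows]

# Union of all keys -> header; restval='' fills in missing fields
columns = sorted({k for r in flat_rows for k in r})
with open('Data_table2.csv', 'w', newline='') as fo:
    writer = csv.DictWriter(fo, fieldnames=columns, restval='')
    writer.writeheader()
    writer.writerows(flat_rows)
```

This way a heading that appears in only some responses still gets a column, and rows that lack it are padded with empty cells.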

Thanks again for all the help! Manu


Define what the result should look like first. – deceze

Answers


First, install pandas:

pip install pandas 

Then use pandas to create a DataFrame object from the data you get in the response. Once you have the object, you can convert it to a csv or xls file; set 'index=False' to prevent the index from being added to the output file.

import pandas as pd 
import json 

with open('data_new.json') as fi: 
    data = json.load(fi) 
    df = pd.DataFrame(data=data['rows'],columns=data['headings']) 
    df.to_csv('data_table.csv', index=False) 

Example output:

ACCOUNT_ID,date,FB Likes 
My Account,1435708800000,117 
My Account,1435795200000,99 
My Account,1435708800000,7 
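For the more deeply nested response in the edited question, the same idea extends with `pandas.json_normalize`, which flattens each post object into dotted column names. A sketch with a trimmed-down sample of the rows (not tested against the full response):

```python
import pandas as pd

# Sample shaped like the edited question's rows: [post_dict, comment_count]
data = {
    'headings': ['POST_ID', 'POST_COMMENT_COUNT'],
    'rows': [
        [{'postId': 188365573, 'taxonomy': {'campaignId': '2521_4'}}, 0],
        [{'postId': 999145171, 'taxonomy': {'campaignId': '2521_4'}}, 0],
    ],
}

# Flatten the post dicts; nested keys become 'taxonomy.campaignId' etc.
posts = pd.json_normalize([post for post, _count in data['rows']])
# The trailing count in each row becomes its own column
posts['POST_COMMENT_COUNT'] = [count for _post, count in data['rows']]
posts.to_csv('data_table.csv', index=False)
```

Columns that appear in only some posts come out as NaN for the others, which to_csv writes as empty cells.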

Answers that consist only of code, with no explanation, are considered less useful. Please consider adding some explanatory text. –


You may need to install pandas first. To do that: '$ pip install pandas' –


Hi Roman - it worked like magic, I cannot thank you enough; I have been looking for this everywhere! Thanks again. However, this is where my problem starts: I have some more deeply nested API JSON responses that do not work even with this code. If possible, please take a look at my question as edited above, where I have also included the other JSON response. You are a savior –


I missed the Python requirement, but if you are willing to call an external program, this will still work fine. Note that this requires jq >= 1.5.

cat YourJsonFile | jq -r ' [ .rows[][0] | to_entries | map(.key), map(.value | tostring) ] | .[0,range(1;length;2)]|@csv' 

# Lets break it down 
jq -r # disable escaping and quoting 
    ' [ # this will create an array 
     .rows[][0] # select rows (. is the object, [] iterates the array, 
       # and [0] takes the first element of each inner array) 
     | to_entries # convert it to key, value object 
     | map(.key), map(.value | tostring) # select key and value 
          # (value is converted to string) 
          # this is the step that needs '-r' option to jq 
     ] # close array. We now have alternating "header" and "data" rows 
     | .[0,range(1;length;2)] # select from current (.), first (0) and 
           # with range function every second row 
           # starting from one 
     |@csv # convert resulting json to csv 
     '  # Done 

https://stedolan.github.io/jq/