Unpacking JSON using pandas

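I have data in the following format, where one field holds (somewhat nested) JSON that I would like to expand alongside the current data in the other fields: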

Name  Identifier    Data 
Joe  54872      [{"ref":{"type":4,"id":86669},"side":"Buy","ratio":1},{"ref":{"type":4,"id":80843},"side":"Sell","ratio":1}] 
Jill  84756      [{"ref":{"type":4,"id":75236},"side":"Buy","ratio":1},{"ref":{"type":4,"id":75565},"side":"Sell","ratio":1}]

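Is there a simple way to produce the data below, other than unpacking the JSON into its own DataFrame and then concatenating it with the fixed columns repeated n times for each row, where n is the length of that row's JSON DataFrame?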

Name  Identifier  ref_type  ref_id  side  ratio 
Joe  54872   4    86669  buy  1 
Joe  54872   4    80843  sell  1 
Jill  84756   4    75236  buy  1 
Jill  84756   4    75565  sell  1

Thanks.

Source

2017-10-16 jjm

What does print(df['Data'].apply(type)) show? – jezrael

306 358 360 Name: Data, dtype: object – jjm

Is your input a JSON file? Would it be possible to use the solution from my answer? – jezrael
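The type check asked for in the comments can be tried on a small stand-in frame. This is only a sketch: the DataFrame below is a hypothetical reconstruction of the question's sample, assuming the Data column holds JSON text rather than already-parsed lists.

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame, with Data held as JSON text.
df = pd.DataFrame({
    "Name": ["Joe", "Jill"],
    "Identifier": [54872, 84756],
    "Data": [
        '[{"ref":{"type":4,"id":86669},"side":"Buy","ratio":1},'
        '{"ref":{"type":4,"id":80843},"side":"Sell","ratio":1}]',
        '[{"ref":{"type":4,"id":75236},"side":"Buy","ratio":1},'
        '{"ref":{"type":4,"id":75565},"side":"Sell","ratio":1}]',
    ],
})

# Map each cell to the name of its Python type, then count the distinct types.
type_counts = df["Data"].apply(lambda v: type(v).__name__).value_counts()
print(type_counts)

# True when every cell is still a string and must be parsed before normalizing.
needs_parsing = df["Data"].apply(lambda v: isinstance(v, str)).all()
print(needs_parsing)
```

If the cells turn out to be str, they need parsing (for example with json.loads or ast.literal_eval) before json_normalize can work on them; if they are already lists of dicts, that step can be skipped.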

I think the best approach is to use json_normalize:

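import json

import pandas as pd

# load the whole JSON file into Python objects
with open('file.json') as data_file:
    data = json.load(data_file)

# pd.json_normalize replaces the older pandas.io.json.json_normalize import
df = pd.json_normalize(data)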
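When the parsed file is a list of records that each carry a nested list, json_normalize can also repeat the outer fields for every nested item through its record_path and meta parameters. The sketch below assumes the file's structure mirrors the question's table (a list of objects with Name, Identifier and Data keys); the actual layout of file.json is not shown in the thread, so the in-memory data here is a stand-in.

```python
import pandas as pd

# Assumed shape of the parsed file: a list of records, each with a nested "Data" list.
data = [
    {"Name": "Joe", "Identifier": 54872, "Data": [
        {"ref": {"type": 4, "id": 86669}, "side": "Buy", "ratio": 1},
        {"ref": {"type": 4, "id": 80843}, "side": "Sell", "ratio": 1},
    ]},
    {"Name": "Jill", "Identifier": 84756, "Data": [
        {"ref": {"type": 4, "id": 75236}, "side": "Buy", "ratio": 1},
        {"ref": {"type": 4, "id": 75565}, "side": "Sell", "ratio": 1},
    ]},
]

# record_path picks the nested list to expand into rows;
# meta names the outer fields to repeat on each of those rows.
flat = pd.json_normalize(data, record_path="Data", meta=["Name", "Identifier"])
print(flat)
```

The nested ref dict is flattened into ref.type and ref.id columns, and each person's Name and Identifier appear once per item in their Data list.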

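EDIT:

If that is not an option (for example, the data is already in a DataFrame with the Data column stored as strings):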

import ast

import pandas as pd

# convert strings to lists and dicts
df['Data'] = df['Data'].apply(ast.literal_eval)
# parse the Data column, keyed by the original row index
df1 = pd.concat([pd.json_normalize(x) for x in df['Data'].values.tolist()], keys=df.index)
# append to the original columns (the positional axis argument of drop is gone in pandas 2.x)
df1 = df.drop(columns='Data').join(df1.reset_index(level=1, drop=True)).reset_index(drop=True)
print(df1)
# output as originally posted; column order can vary with the pandas version
   Name  Identifier  ratio  ref.id  ref.type  side
0   Joe       54872      1   86669         4   Buy
1   Joe       54872      1   80843         4  Sell
2  Jill       84756      1   75236         4   Buy
3  Jill       84756      1   75565         4  Sell
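For comparison, here is a sketch of a variant of the same idea, not the answer's own method: it swaps ast.literal_eval for json.loads and uses DataFrame.explode (pandas 0.25 or later) to give each list item its own row, then renames the columns to the ref_type / ref_id layout the question asks for. The sample frame is a hypothetical reconstruction of the question's data with Data stored as JSON strings.

```python
import json

import pandas as pd

# Hypothetical reconstruction of the question's frame, with Data held as JSON text.
df = pd.DataFrame({
    "Name": ["Joe", "Jill"],
    "Identifier": [54872, 84756],
    "Data": [
        '[{"ref":{"type":4,"id":86669},"side":"Buy","ratio":1},'
        '{"ref":{"type":4,"id":80843},"side":"Sell","ratio":1}]',
        '[{"ref":{"type":4,"id":75236},"side":"Buy","ratio":1},'
        '{"ref":{"type":4,"id":75565},"side":"Sell","ratio":1}]',
    ],
})

# Parse each JSON string into a list of dicts, then give every dict its own row.
exploded = (
    df.assign(Data=df["Data"].apply(json.loads))
      .explode("Data")
      .reset_index(drop=True)
)

# Flatten the dicts; sep="_" yields ref_type / ref_id instead of ref.type / ref.id.
details = pd.json_normalize(exploded["Data"].tolist(), sep="_")

# Both frames share the same 0..n-1 index, so a plain join lines the rows up.
result = exploded.drop(columns="Data").join(details)
result = result[["Name", "Identifier", "ref_type", "ref_id", "side", "ratio"]]
print(result)
```

json.loads is the stricter choice when the strings are real JSON (it understands true, false and null, which ast.literal_eval does not). The side values keep their original capitalization (Buy, Sell); lowercasing them as in the question's desired output would need an extra step such as result["side"].str.lower().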

Source

2017-10-16 07:55:50 jezrael

The latter works perfectly. Thanks. – jjm
