2017-08-28 61 views
1

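Extracting dictionary values from a pandas DataFrame

I need to extract one additional feature from a dataset imported from a .json file.

This is what it looks like: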

import pandas as pd

f1 = pd.read_json('https://raw.githubusercontent.com/ansymo/msr2013-bug_dataset/master/data/v02/eclipse/short_desc.json')

print(f1.head())


                                               short_desc
1       [{'when': 1002742486, 'what': 'Usability issue...
10      [{'when': 1002742495, 'what': 'API - VCM event...
100     [{'when': 1002742586, 'what': 'Would like a wa...
10000   [{'when': 1014113227, 'what': 'getter/setter c...
100001  [{'when': 1118743999, 'what': 'Create Help Ind...

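Essentially, I need 'short_desc' as the column name, with the column filled by the string values found under 'what', e.g. 'Usability issue...

So far I have tried the following: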

f1['desc'] = pd.DataFrame([x for x in f1['short_desc']]) 

Wrong number of items passed 19, placement implies 1 

Is there a simple way to do this without using a loop? Could someone point this newbie in the right direction?

Answers

3

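Don't construct a DataFrame and then try to assign it to a column - a column should be a pd.Series.

You should just assign the list comprehension directly, like this: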
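To see why the attempt in the question fails, here is a minimal sketch on made-up data (the `cells` list is a hypothetical stand-in for `f1['short_desc']`, not the real dataset). Passing a list of lists to `pd.DataFrame` produces one column per list position, so the result is wider than the single column it is being assigned to. The "19" in the error message presumably reflects the longest list in the real data.

```python
import pandas as pd

# Hypothetical stand-in for f1['short_desc']: each cell is a list of dicts.
cells = [
    [{'when': 1, 'what': 'first title'}, {'when': 2, 'what': 'edited title'}],
    [{'when': 3, 'what': 'another title'}],
]

# One column per list position; shorter lists are padded with missing values.
wide = pd.DataFrame(cells)
print(wide.shape)  # two rows, two columns: too wide for a single column
```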

f1['desc'] = [x[0]['what'] for x in f1['short_desc']] 
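A small self-contained sketch of the same idea, using a toy frame in place of the downloaded data (the frame `toy` and its contents are invented for illustration). It also shows one possible guard, assuming some rows might hold an empty list; the original answer does not need it if every list has at least one entry.

```python
import pandas as pd

# Toy frame mimicking the structure in the question (values are made up).
toy = pd.DataFrame(
    {'short_desc': [
        [{'when': 1, 'what': 'first title'}, {'when': 2, 'what': 'edited title'}],
        [{'when': 3, 'what': 'another title'}],
        [],  # assumed edge case: no description recorded
    ]},
    index=[1, 10, 100],
)

# Take the 'what' of the first dict in each list; fall back to None if empty.
toy['desc'] = [x[0]['what'] if x else None for x in toy['short_desc']]
print(toy['desc'])
```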

As an alternative, here is a solution that doesn't involve any lambda functions, using operator and pd.Series.apply:

import operator 

f1['desc'] = f1.short_desc.apply(operator.itemgetter(0))\ 
          .apply(operator.itemgetter('what')) 
print(f1.desc.head()) 

1         Usability issue with external editors (1GE6IRL)
10                 API - VCM event notification (1G8G6RR)
100     Would like a way to take a write lock on a tea...
10000   getter/setter code generation drops "F" in ".....
100001  Create Help Index Fails with seemingly incorre...
Name: desc, dtype: object
+0

This is what's driving me crazy: why do we get 1, 10, 100 and so on, with no 'short_desc' and no column header? – JohnWayne360

+0

@JohnWayne360 Because you are printing a Series. Try 'print(df.head())'. You'll get it. –

+0

@JohnWayne360 Interestingly, that index seems to come along when you load the data from the web link. Want to reset it? Do 'f1 = f1.reset_index(drop=True)' –
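For reference, a short sketch of what resetting the index does, on a made-up frame (`toy` and its labels are illustrative only). The labels 1, 10, 100 in the question appear to come from the keys of the JSON file, so note that `drop=True` discards them instead of keeping them as a column.

```python
import pandas as pd

# Toy frame with non-sequential index labels, as in the question.
toy = pd.DataFrame({'desc': ['a', 'b', 'c']}, index=[1, 10, 100])

# drop=True throws the old labels away and renumbers the rows from 0.
renumbered = toy.reset_index(drop=True)
print(renumbered.index.tolist())

# Without drop=True the old labels are kept as a new 'index' column.
kept = toy.reset_index()
print(kept.columns.tolist())
```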

2

Or you can try apply (PS: keep in mind that apply is a time-costly function):

f1['short_desc'].apply(pd.Series)[0].apply(pd.Series) 

Out[864]:
                                                     what        when   who
1         Usability issue with external editors (1GE6IRL)  1002742486    21
10                 API - VCM event notification (1G8G6RR)  1002742495    10
100     Would like a way to take a write lock on a tea...  1002742586    24
10000   getter/setter code generation drops "F" in ".....  1014113227   331
100001  Create Help Index Fails with seemingly incorre...  1118743999  9571
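Since `apply(pd.Series)` builds a Series for every row, it can get slow on large frames. One possible alternative, sketched here on invented data and not benchmarked against the real dataset, is to collect the first dict of each row in a list and hand that list to the `pd.DataFrame` constructor once. The frame `toy` and the name `expanded` are assumptions made for the example, and the sketch assumes every list is non-empty.

```python
import pandas as pd

# Toy frame mimicking the structure in the question (values are made up).
toy = pd.DataFrame(
    {'short_desc': [
        [{'when': 1, 'what': 'first title', 'who': 21}],
        [{'when': 3, 'what': 'another title', 'who': 10},
         {'when': 4, 'what': 'edited title', 'who': 10}],
    ]},
    index=[1, 10],
)

# First dict of every row; assumes no list is empty.
first = [x[0] for x in toy['short_desc']]

# A list of dicts becomes one column per key; reuse the original index.
expanded = pd.DataFrame(first, index=toy.index)
print(expanded)
```

The result has one column per dictionary key, so 'when' and 'what' stay aligned row by row.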
+1

Thanks, this answer will be useful to me if/when I try to match up 'when' and 'what'. Much appreciated! – JohnWayne360

+0

@JohnWayne360 You're welcome – Wen