2015-03-02 105 views
6

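Counting occurrences of an element in a JSON response

I am using Python to parse the UK Police API. I want to analyse the JSON response I get back and count how many times a given offence occurs. Here is a sample of the API response.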

{ 
    category: "anti-social-behaviour", 
    location_type: "Force", 
    location: { 
     latitude: "53.349920", 
     street: { 
      id: 583315, 
      name: "On or near Evenwood Close" 
     }, 
     longitude: "-2.657889" 
    }, 
    context: "", 
    outcome_status: null, 
    persistent_id: "", 
    id: 22687179, 
    location_subtype: "", 
    month: "2013-03" 
}, 

Using this code:

from json import load 
from urllib2 import urlopen 
import json 

url = "http://data.police.uk/api/crimes-street/all-crime?lat=53.396246&lng=-2.646960&date=2013-03" 
json_obj = urlopen(url) 
player_json_list = load(json_obj) 

for player in player_json_list: 
    crimeCategories = json.dumps(player['category'], indent = 2, separators=(',', ': ')) 
    print crimeCategories 

I get output like this:

"anti-social-behaviour" 
"anti-social-behaviour" 
"anti-social-behaviour" 
"anti-social-behaviour" 
"drugs" 
"drugs" 
"burglary" 

If I change my for loop to:

for player in player_json_list: 
    crimeCategories = json.dumps(player['category'], indent = 2, separators=(',', ': ')) 
    print crimeCategories.count("drugs") 

then I get this output instead:

0 
0 
0 
0 
1 
1 
0 

Searching the forums for hours hasn't helped. Any ideas?

Answers

0

Create a dictionary and use crimeCategories as the key. For the value, use an integer. Try adding something like this inside your loop.

>>> count = {}
>>> count['testing'] = count.get('testing', 0) + 1
>>> count['testing']
1
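
As a rough sketch of how that line fits into a loop, assuming a small hand-written list in place of the live API response (the `sample_crimes` data and the `count_categories` helper are made up for illustration, and the code is written for Python 3):

```python
# Hypothetical sample records standing in for the parsed API response.
sample_crimes = [
    {"category": "anti-social-behaviour"},
    {"category": "anti-social-behaviour"},
    {"category": "drugs"},
    {"category": "burglary"},
    {"category": "drugs"},
]


def count_categories(crimes):
    """Return a dict mapping each category to the number of times it appears."""
    count = {}
    for crime in crimes:
        category = crime["category"]
        # get() returns 0 the first time a category is seen
        count[category] = count.get(category, 0) + 1
    return count


counts = count_categories(sample_crimes)
print(counts["drugs"])
```

The important difference from the loop in the question is that the dictionary lives outside the loop, so the totals accumulate across records instead of being recomputed for each one.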
0

You are not storing your count anywhere. You are just calling count on the single item in the current loop iteration.

You will want to add each category as a key in a dictionary, then increment its value every time you hit an occurrence in your loop:

adictionary = {}
for player in player_json_list:
    category = player['category']
    # Start at 0 the first time a category is seen, then add one
    adictionary[category] = adictionary.get(category, 0) + 1
print(adictionary)
0

You can aggregate the data into a mapping of category -> JSON records like this:

from collections import defaultdict

# Each missing category starts out as an empty list
players_by_category = defaultdict(list)
for player in player_json_list:
    players_by_category[player['category']].append(player)

Now you have a dictionary with a list of crimes under each category.

So to find out how many crimes occurred in each category, all you need is:

for k, v in players_by_category.items():
    print("%s: %s" % (k, len(v)))

It is quite confusing to use player instead of crime, but whatever you see fit :)
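
A self-contained sketch of the same grouping idea, assuming a few hand-written records instead of the live API response (the `sample_crimes` data and the `crimes_by_category` and `category_counts` names are illustrative only; Python 3):

```python
from collections import defaultdict

# Hypothetical sample records; only the fields used here are included.
sample_crimes = [
    {"category": "drugs", "id": 1},
    {"category": "burglary", "id": 2},
    {"category": "drugs", "id": 3},
]

crimes_by_category = defaultdict(list)
for crime in sample_crimes:
    # A missing key is created as an empty list on first access
    crimes_by_category[crime["category"]].append(crime)

# The length of each list is the number of crimes in that category
category_counts = {k: len(v) for k, v in crimes_by_category.items()}
print(category_counts)
```

Grouping keeps the full records around, which is handy if you later want the streets or ids per category. If only the totals matter, a plain counter is lighter.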

7

You can use a collections.Counter dict together with requests, which boils down to a few concise lines of code:

import requests 
from collections import Counter 

url = "http://data.police.uk/api/crimes-street/all-crime?lat=53.396246&lng=-2.646960&date=2013-03" 
json_obj = requests.get(url).json() 

c = Counter(player['category'] for player in json_obj) 
print(c) 

Output:

Counter({'anti-social-behaviour': 79, 'criminal-damage-arson': 12, 'other-crime': 11, 'violent-crime': 9, 'vehicle-crime': 7, 'other-theft': 6, 'burglary': 4, 'public-disorder-weapons': 3, 'shoplifting': 2, 'drugs': 2}) 
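
If only the most frequent categories are of interest, `Counter.most_common` can pick them out. A minimal sketch, assuming a made-up list of categories rather than the real response (Python 3):

```python
from collections import Counter

# Hypothetical category list standing in for the API data.
categories = [
    "drugs", "burglary", "drugs",
    "anti-social-behaviour", "anti-social-behaviour", "anti-social-behaviour",
]

c = Counter(categories)
# most_common(n) returns the n highest counts as (item, count) tuples
top_two = c.most_common(2)
print(top_two)
```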

If you would rather have a normal dict, just call dict on the Counter:

from pprint import pprint as pp 
c = dict(c) 
pp(c) 
{'anti-social-behaviour': 79, 
'burglary': 4, 
'criminal-damage-arson': 12, 
'drugs': 2, 
'other-crime': 11, 
'other-theft': 6, 
'public-disorder-weapons': 3, 
'shoplifting': 2, 
'vehicle-crime': 7, 
'violent-crime': 9} 

Then you simply access the counts by key: c['drugs'], etc.
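
One thing to watch when looking up by key: a Counter and a plain dict treat missing categories differently. A short sketch with made-up data (the `plain` and `robbery_count` names are illustrative; Python 3):

```python
from collections import Counter

c = Counter(["drugs", "burglary", "drugs"])  # hypothetical sample data

print(c["drugs"])    # count for a category that is present
print(c["robbery"])  # a Counter gives 0 for a category it never saw

plain = dict(c)
# A plain dict raises KeyError for a missing key, so use get() with a default
robbery_count = plain.get("robbery", 0)
print(robbery_count)
```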

Or iterate over the items to print each crime and its count in whatever format you want:

for k, v in c.items():
    print("{} count: {}".format(k, v))

Output:

drugs count: 2 
shoplifting count: 2 
other-theft count: 6 
anti-social-behaviour count: 79 
violent-crime count: 9 
criminal-damage-arson count: 12 
vehicle-crime count: 7 
public-disorder-weapons count: 3 
other-crime count: 11 
burglary count: 4 
+0

@martineau, thanks, it looks better ;) – 2015-03-02 21:51:26

+0

Personally, I think using 'print(dict(c))' would look better. – martineau 2015-03-02 21:53:58

+0

@martineau, added the pprint output below – 2015-03-02 21:58:48
