2016-03-08 90 views
0

Nested counters over JSON data

I have some JSON data:

{ 
    "persons": [ 
     { 
      "city": "Seattle", 
      "name": "Brian" 
      "dob" : "19-03-1980" 
     }, 
     { 
      "city": "Amsterdam", 
      "name": "David" 
      "dob" : "19-09-1979" 
     } 
     { 
      "city": "London", 
      "name": "Joe" 
      "dob" : "19-01-1980" 
     } 
     { 
      "city": "Kathmandu", 
      "name": "Brian" 
      "dob" : "19-03-1980" 
     } 
    ] 
} 

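Using Python, how can I count, in a single iteration, the number of people born in each month from January to December (0 if nobody was born in that month) and the number of people born in a given year? I also want the number of unique names registered in each month, like this: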

1980 :3 
--Jan:1 
--Mar:2 
1979 :1 
--Sep:1 

Names:

Mar 1980: 1 #Brian is same for both cities 
Jan 1980: 1 
Sep 1979: 1 

counters_mon is a counter holding the counts for the specific months of a year:

for k_mon,v_mon in counters_mon.items(): 
    print('{}={}'.format(k_mon,v_mon)) 

But I also want to print the details along with the counter values. How can I do this?

+1

First off, your JSON is invalid (http://jsonlint.com/), and since you can call 'counters_mon.items()', you are probably working with a dictionary. – jDo

+0

Right, I first extract these values and add them to a list, then perform the operations on them. – HunterrJ

Answers

0
import json

# Load the JSON document; the path is a placeholder.
with open('/path/to/your/json', 'r') as f:
    persons = json.load(f)

years_months = {}        # year -> {'count': total, 'months': {month: count}}
years_months_names = {}  # 'MM YYYY' -> set of unique names

for person in persons['persons']:
    # dob is formatted as DD-MM-YYYY
    year = person['dob'][-4:]
    month = person['dob'][3:5]
    month_year = month + ' ' + year
    name = person['name']

    # Create the entry for a year the first time it is seen.
    if year not in years_months:
        years_months[year] = {'count': 0, 'months': {}}
    years_months[year]['count'] += 1
    months = years_months[year]['months']
    months[month] = months.get(month, 0) + 1

    # A set keeps each name only once per month.
    years_months_names.setdefault(month_year, set()).add(name)

for k, v in years_months.items():
    print(k + ': ' + str(v['count']))
    for month, count in v['months'].items():
        print("-- " + str(month) + ": " + str(count))
for k, v in years_months_names.items():
    print(k + ": " + str(len(v)))

I am assuming you have the path to your JSON file. I tested my answer on the JSON you posted; be careful to make sure your JSON is structured correctly.
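Because the counts depend on the JSON being well formed, a quick check before counting may help. The following is only a sketch: `load_persons` is a hypothetical helper name, the two sample strings are made up for illustration, and it relies on `json.loads` raising `json.JSONDecodeError` (available in Python 3.5 and later) for malformed input.

```python
import json


def load_persons(text):
    """Parse JSON text and return the 'persons' list, or None if it is malformed.

    load_persons is a hypothetical helper used only for this illustration.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        # err.lineno and err.colno point at the place where parsing failed
        print('Invalid JSON at line {}, column {}: {}'.format(
            err.lineno, err.colno, err.msg))
        return None
    return data['persons']


# A missing comma between two members, as in the posted data, is rejected.
broken = '{"persons": [{"name": "Brian" "dob": "19-03-1980"}]}'
fixed = '{"persons": [{"name": "Brian", "dob": "19-03-1980"}]}'

print(load_persons(broken))
print(load_persons(fixed))
```

If the helper returns None, the reported line and column are a reasonable place to start looking for a missing comma.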

+0

Thanks for your answer. I will try it on my office system and get back to you if anything comes up. – HunterrJ

+0

I accepted the answer, but it does not show the count for September 1979, and it counts March as 1 when it is actually 2 for 1980. – HunterrJ

+0

Check the structure of your JSON. I just tested it again with the code above and the output is correct. I am also running Python 3. –

0

This is a good use case for defaultdicts (https://docs.python.org/3/library/collections.html#collections.defaultdict).

# assume you have your parsed JSON in a variable called data

from collections import defaultdict 
from calendar import month_abbr 

# slightly strange construction here, but we want two levels of defaultdict followed by lists
aggregate = defaultdict(lambda:defaultdict(list)) 

# then the population is super simple - you'll end up with something like 
# aggregate[year][month] = [name1, name2] 
for person in data['persons']: 
    day, month, year = map(int, person['dob'].split('-')) 
    aggregate[year][month].append(person['name']) 


# I'm sorting in chronological order for printing 
for year, months in sorted(aggregate.items()):
    print('{}: {}'.format(year, sum(len(names) for names in months.values())))
    for month, names in sorted(months.items()):
        print('--{}: {}'.format(month_abbr[month], len(names)))

for year, months in sorted(aggregate.items()):
    for month, names in sorted(months.items()):
        print('{} {}: {}'.format(month_abbr[month], year, len(set(names))))

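Depending on how the data will be used, I would actually consider dropping the complex nesting in the aggregate and instead choosing something like aggregate[(year, month)] = [name1, name2, ...]. I find that the more nested my data is, the more confusing it gets to work with.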
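The flat (year, month) layout mentioned above can be sketched as follows. This is an illustration under assumptions rather than a definitive version: the `data` literal mirrors the question's records with the missing commas added, and the printed wording ("born", "unique") is made up for the example.

```python
from collections import defaultdict
from calendar import month_abbr

# Sample data mirroring the question, with the commas fixed (an assumption).
data = {
    'persons': [
        {'city': 'Seattle', 'name': 'Brian', 'dob': '19-03-1980'},
        {'city': 'Amsterdam', 'name': 'David', 'dob': '19-09-1979'},
        {'city': 'London', 'name': 'Joe', 'dob': '19-01-1980'},
        {'city': 'Kathmandu', 'name': 'Brian', 'dob': '19-03-1980'},
    ]
}

# One flat level: (year, month) -> list of names
aggregate = defaultdict(list)
for person in data['persons']:
    day, month, year = map(int, person['dob'].split('-'))
    aggregate[(year, month)].append(person['name'])

# Tuples sort by year first, then by month, so the output is chronological.
for (year, month), names in sorted(aggregate.items()):
    print('{} {}: {} born, {} unique'.format(
        month_abbr[month], year, len(names), len(set(names))))
```

With a single tuple key there is only one loop to write for printing, at the cost of having to sum over keys to get a per-year total.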

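EDIT: Alternatively, you can build several structures in the first pass so that the printing step becomes simpler. Again, I use defaultdict to clean up all the setup.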

agg_years = defaultdict(lambda:defaultdict(int)) # [year][month] = counter 
agg_years_total = defaultdict(int) # [year] = counter 
agg_months_names = defaultdict(set) # [(year, month)] = set(name1, name2...) 

for person in data['persons']: 
    day, month, year = map(int, person['dob'].split('-')) 

    agg_years[year][month] += 1 
    agg_years_total[year] += 1 
    agg_months_names[(year, month)].add(person['name']) 


for year, months in sorted(agg_years.items()):
    print('{}: {}'.format(year, agg_years_total[year]))
    for month, quant in sorted(months.items()):
        print('--{}: {}'.format(month_abbr[month], quant))

for (year, month), names in sorted(agg_months_names.items()):
    print('{} {}: {}'.format(month_abbr[month], year, len(names)))