2015-12-02

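Python: How to sort and organize dictionary data

I want to sort crime totals by zip code, and victim counts by offense type. I built a dictionary keyed by report number. Here is the output for a small sample of my data when I print the dictionary: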

{'100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'], '20003319': ['64130', '13', 'VIC'], '60077156': ['64130', '18', 'VIC'], '100057708': ['99999', '17', 'VIC', 'VIC'], '40024161': ['64108', '17', 'VIC', 'VIC']} 

The dictionary is built as follows: {Report_number: [zip code, offense type, one 'VIC' entry per victim]}
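To make that layout concrete, here is a minimal sketch of pulling one entry apart with starred assignment. The names `entry`, `zip_code`, `offense`, and `victims` are only illustrative, and the sketch assumes every element after the first two is a 'VIC' marker, as in the sample output.

```python
# One value from the sample dictionary: [zip code, offense type, 'VIC', 'VIC', ...]
entry = ['64130', '18', 'VIC', 'VIC', 'VIC']

# The first two items are the zip code and offense type;
# the starred name collects whatever remains (the victim markers).
zip_code, offense, *victims = entry
victim_count = len(victims)

print(zip_code, offense, victim_count)
```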

I'm brand new to coding and am just learning dictionaries. How would I sort through the dictionary to organize the data into this format?

Zip Codes   Crime totals
========================
64126       809
64127       3983
64128       1749
64129       1037
64130       4718
64131       2080
64132       2060
64133       2005
64134       2928

Any help would be greatly appreciated. Below is my code so far. I am reading about 50,000 rows of data from two files, so efficiency matters.

crime_dict = {}

# Map each report number to [zip_code, offense].
with open('incidents.csv', mode='r') as incidents_f:
    for line in incidents_f:
        line_1st = line.strip().split(",")
        if line_1st[0].upper() != "REPORT_NO":  # skip the header row
            report_no = line_1st[0]
            offense = line_1st[3]
            zip_code = line_1st[4]
            if len(zip_code) < 5:
                zip_code = "99999"  # placeholder for a missing or short zip code

            if report_no in crime_dict:
                # list.append returns None, so appends cannot be chained; use extend
                crime_dict[report_no].extend([zip_code, offense])
            else:
                crime_dict[report_no] = [zip_code, offense]

# Add one 'VIC' marker per victim row to the matching report.
with open('details.csv', mode='r') as details_f:
    for line in details_f:
        line_1st = line.strip().split(",")
        if line_1st[0].upper() != "REPORT_NO":  # skip the header row
            report_no = line_1st[0]
            involvement = line_1st[1]
            # Only victim rows for known reports are recorded
            if involvement.upper() == 'VIC' and report_no in crime_dict:
                crime_dict[report_no].append("VIC")

print(crime_dict)
It would help if you could edit the question to include a few sample rows from your CSV files.

Answers

Here is a way to do this that takes more code than @Alexander's solution:

crime_dict = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}

# Count the number of reports per zip code.
crimes_by_zip = {}
for report_no, values in crime_dict.items():
    zip_code = values[0]  # named zip_code to avoid shadowing the built-in zip()
    if zip_code not in crimes_by_zip:
        crimes_by_zip[zip_code] = 0
    crimes_by_zip[zip_code] += 1

# Print the totals in ascending zip-code order.
for zip_code in sorted(crimes_by_zip):
    print(zip_code, crimes_by_zip[zip_code])

64108 1 
64130 3 
99999 1 
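As a possible variation on the counting loop above, `collections.Counter` can replace the manual "initialize to 0, then add 1" step, and the same pass can also tally victims per offense type, which the question asks about as well. This is only a sketch under the assumption that every value has the shape [zip code, offense type, 'VIC', ...]; the names `victims_by_offense` and the column widths are illustrative, not part of the original answer.

```python
from collections import Counter

crime_dict = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}

crimes_by_zip = Counter()       # reports per zip code
victims_by_offense = Counter()  # 'VIC' markers per offense type

# A single pass over the values; the starred name collects the victim markers.
for zip_code, offense, *victims in crime_dict.values():
    crimes_by_zip[zip_code] += 1
    victims_by_offense[offense] += len(victims)

# Print the zip-code totals in the layout requested in the question.
print(f"{'Zip Codes':<12}{'Crime totals'}")
print("=" * 24)
for zip_code, total in sorted(crimes_by_zip.items()):
    print(f"{zip_code:<12}{total}")
```

Both counters are filled in one pass over the data, so the work grows linearly with the number of reports; only the final sort depends on the number of distinct zip codes.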
Thanks, Steve. This works perfectly and makes complete sense. I appreciate your help. – Wakedude

D = {'100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'], '20003319': ['64130', '13', 'VIC'], '60077156': ['64130', '18', 'VIC'], '100057708': ['99999', '17', 'VIC', 'VIC'], '40024161': ['64108', '17', 'VIC', 'VIC']}

# Pair each zip code with its report number, ordered by zip code.
data_with_zip_duplicate = [(D[key][0], key) for key in sorted(D, key=lambda x: D[x][0])]
print(*data_with_zip_duplicate, sep="\n")
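The snippet above prints (zip code, report number) pairs ordered by zip code, but it does not yet produce totals. One way to get from sorted pairs to totals is `itertools.groupby`, sketched below. The names `pairs` and `totals` are illustrative, and this is an assumption about how the approach could be extended rather than part of the original answer.

```python
from itertools import groupby

D = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}

# Build (zip code, report number) pairs and sort them so equal zip codes are adjacent.
pairs = sorted((values[0], report_no) for report_no, values in D.items())

# groupby only merges neighbouring items, which is why the sort above is required.
totals = {
    zip_code: len(list(group))
    for zip_code, group in groupby(pairs, key=lambda pair: pair[0])
}

for zip_code, total in totals.items():
    print(zip_code, total)
```

Because this route sorts every report before counting, it does more work than counting into a dictionary first and sorting only the distinct zip codes afterwards.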
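Thanks for your help. We haven't started using lambda functions yet, but I have seen solutions like this while researching this problem. I'm curious: is the lambda function more efficient than the zip code solution Steve proposed? – Wakedude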
