2017-02-23 79 views
0

I want to write code that takes data in the following format and turns it into a dictionary.

Example data:

[['12319825', '39274', {'pH': 8.1}], ['12319825', '39610', {'pH': 7.27}], 
['12319825', '39638', {'pH': 7.87, 'Escherichia coli': 25.0}], 
['12319825', '39770', {'pH': 7.47, 'Escherichia coli': 27.0}], 
['12319825', '39967', {'pH': 8.36}], ['12319825', '39972', {'pH': 8.42}], 
['12319825', '39987', {'pH': 8.12, 'Escherichia coli': 8.0}], 
['12319825', '40014', {'pH': 8.12}], ['12319825', '40329',{'pH': 8.45}], 
['12319825', '40658', {'pH': 8.35, 'Escherichia coli': 6.3}], 
['12319825', '40686', {'pH': 8.17}], 
['12319825', '40714', {'pH': 8.13}], ['12319825', '40732', {'pH': 8.4}], 
['12319825', '40809', {'pH': 8.42}], 
['12319825', '40827', {'pH': 8.46}], 
['12319825', '41043', {'pH': 8.42, 'Escherichia coli': 170.0}], 
['12319825', '41071', {'pH': 8.24, 'Escherichia coli': 92.0}], 
['12319825', '41080', {'pH': 8.4}], 
['12319825', '41101', {'pH': 8.36, 'Escherichia coli': 560.0}], ['12319825', '41134', {'pH': 8.67}]] 

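The code should return a dictionary in which each key is a pollutant (in this case either pH or Escherichia coli) and each value is what I call a DateList. A DateList is a list of tuples of the form (date, T/F), one per data point. The boolean is True if the value falls outside the given range or above the given limit, depending on the type of criterion.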

rangeCriteria={'pH':(5.0,9.0)} 
convCriteria={'Escherichia coli':320}

Right now, when I run this code, every key in the dictionary ends up with the values for both pollutants.

def testLocationForConv(DataFromLocation):
    # Checks if a pollutant is outside of acceptable values.
    # A dictionary is created where each pollutant has a corresponding list of tuples
    # with the date and a corresponding boolean to say if it is in or out of
    # the criteria (True if out, False if in).
    # It handles when the criteria is a minimum or range rather than a
    # maximum.

    dateList = []
    impairedList = []
    overDict = dict()
    for date in DataFromLocation:
        for pollutant in date[2]:
            if pollutant in conventionalCriteriaList:
                dateList.append((date[1], date[2][pollutant] > convCriteria[pollutant]))
                overDict[pollutant] = dateList
            if pollutant in rangeCriteria:
                overDict[pollutant] = dateList
                dateList.append((date[1], (not (float(date[2][pollutant]) > rangeCriteria[pollutant][0] and float(date[2][pollutant]) < rangeCriteria[pollutant][1]))))
            # if pollutant in minCriteriaList:
            #     overDict[pollutant] = dateList
            #     dateList.append((date[1], date[2][pollutant] < minCriteria[pollutant]))

            else:
                pass
    print(overDict)

As it stands, the data points for both pollutants are added under every key, which gives the following result.

{'pH': [('39274', False), ('39610', False), ('39638', False), 
('39638', False), ('39770', False), ('39770', False), ('39967', False), 
('39972', False), ('39987', False), ('39987', False), ('40014', False), 
('40329', False), ('40658', False), ('40658', False), ('40686', False), 
('40714', False), ('40732', False), ('40809', False), ('40827', False), 
('41043', False), ('41043', False), ('41071', False), ('41071', False), 
('41080', False), ('41101', False), ('41101', True), ('41134', False)], 
'Escherichia coli': [('39274', False), ('39610', False), ('39638', False), 
('39638', False), ('39770', False), ('39770', False), ('39967', False), 
('39972', False), ('39987', False), ('39987', False), ('40014', False), 
('40329', False), ('40658', False), ('40658', False), ('40686', False), 
('40714', False), ('40732', False), ('40809', False), ('40827', False), 
('41043', False), ('41043', False), ('41071', False), ('41071', False), 
('41080', False), ('41101', False), ('41101', True), ('41134', False)]} 

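Now that I have typed this question out, I realize the problem is that I iterate over the dates and then over the pollutants, whereas I want to compile a separate list of dates for each pollutant. How would I build such lists and add them to the dictionary?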

+1

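After rereading your post twice I worked out most of what you are asking, but it would be much simpler, and easier on my head, if you just posted an example of the output you want. You also have not posted your complete code. For example, what is 'conventionalCriteriaList'? –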

+0

So the first item in each list is always discarded? –

+0

Also, executing 'overDict[pollutant] = dateList' every time is pointless... it is exactly the same list each time. That is why the values in your dictionary are exactly the same... –
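The point made in the comment above can be reproduced in a few lines. This is only a minimal sketch, not the asker's code: the names `shared`, `aliased`, and `separate` are made up for illustration. It shows that storing one list object under two keys makes every append visible under both, while giving each key its own list keeps them independent.

```python
# Hypothetical names, for illustration only.
shared = []
aliased = {'pH': shared, 'Escherichia coli': shared}
aliased['pH'].append(('39274', False))

# Both keys refer to one and the same list object,
# so the append shows up under both of them.
same_object = aliased['pH'] is aliased['Escherichia coli']

# Giving each key its own list keeps the entries apart.
separate = {'pH': [], 'Escherichia coli': []}
separate['pH'].append(('39274', False))
```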

Answers

0

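I would take a step back and rethink your approach. You are making this harder for yourself than it needs to be. First, the data: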

In [3]: data = [['12319825', '39274', {'pH': 8.1}], ['12319825', '39610', {'pH': 
    ...: 7.27}], 
    ...: ['12319825', '39638', {'pH': 7.87, 'Escherichia coli': 25.0}], 
    ...: ['12319825', '39770', {'pH': 7.47, 'Escherichia coli': 27.0}], 
    ...: ['12319825', '39967', {'pH': 8.36}], ['12319825', '39972', {'pH': 8.42}] 
    ...: , 
    ...: ['12319825', '39987', {'pH': 8.12, 'Escherichia coli': 8.0}], 
    ...: ['12319825', '40014', {'pH': 8.12}], ['12319825', '40329',{'pH': 8.45}], 
    ...: 
    ...: ['12319825', '40658', {'pH': 8.35, 'Escherichia coli': 6.3}], 
    ...: ['12319825', '40686', {'pH': 8.17}], 
    ...: ['12319825', '40714', {'pH': 8.13}], ['12319825', '40732', {'pH': 8.4}], 
    ...: 
    ...: ['12319825', '40809', {'pH': 8.42}], 
    ...: ['12319825', '40827', {'pH': 8.46}], 
    ...: ['12319825', '41043', {'pH': 8.42, 'Escherichia coli': 170.0}], 
    ...: ['12319825', '41071', {'pH': 8.24, 'Escherichia coli': 92.0}], 
    ...: ['12319825', '41080', {'pH': 8.4}], 
    ...: ['12319825', '41101', {'pH': 8.36, 'Escherichia coli': 560.0}],
    ...: ['12319825', '41134', {'pH': 8.67}]]

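When your boolean conditions get even slightly complicated, you should give them their own functions, if only for readability. Here I would go a step further and put those functions in a dictionary keyed by the corresponding pollutant, which will make your life much simpler!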

In [4]: def ecoli_threshold(value): return value > 320 

In [5]: def ph_range(value): return not (5 < value < 9) 

In [6]: test = {'Escherichia coli': ecoli_threshold, 'pH':ph_range} 
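If the criteria already live in dictionaries such as the question's `rangeCriteria` and `convCriteria`, the per-pollutant functions could also be generated rather than written by hand. The following is a rough sketch under that assumption. The helpers `make_range_check` and `make_max_check` and the name `checks` are hypothetical and appear in neither the question nor the answer, and `checks` plays the same role as `test` above.

```python
def make_range_check(low, high):
    """Return a function that is True when a value lies outside (low, high)."""
    def check(value):
        return not (low < float(value) < high)
    return check


def make_max_check(limit):
    """Return a function that is True when a value exceeds limit."""
    def check(value):
        return float(value) > limit
    return check


# Criteria as given in the question (with the key spelled 'Escherichia coli').
rangeCriteria = {'pH': (5.0, 9.0)}
convCriteria = {'Escherichia coli': 320}

# Build one checker per pollutant from the criteria dictionaries.
checks = {}
for pollutant, (low, high) in rangeCriteria.items():
    checks[pollutant] = make_range_check(low, high)
for pollutant, limit in convCriteria.items():
    checks[pollutant] = make_max_check(limit)
```

Adding a new pollutant then only means adding an entry to one of the criteria dictionaries.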

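The key issue tripping you up is that you are using a single list when you really need two. Initialize your dictionary with two empty lists, since you know you will be appending to them.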

In [7]: over_dict = {'Escherichia coli':[], 'pH':[]} 

Then iterate over the data:

In [8]: for entry in data: 
    ...:  for pollutant, value in entry[2].items(): 
    ...:   over_dict[pollutant].append((entry[1], test[pollutant](value))) 
    ...: 

Finally, the output:

In [9]: over_dict 
Out[9]: 
{'Escherichia coli': [('39638', False), 
    ('39770', False), 
    ('39987', False), 
    ('40658', False), 
    ('41043', False), 
    ('41071', False), 
    ('41101', True)], 
'pH': [('39274', False), 
    ('39610', False), 
    ('39638', False), 
    ('39770', False), 
    ('39967', False), 
    ('39972', False), 
    ('39987', False), 
    ('40014', False), 
    ('40329', False), 
    ('40658', False), 
    ('40686', False), 
    ('40714', False), 
    ('40732', False), 
    ('40809', False), 
    ('40827', False), 
    ('41043', False), 
    ('41071', False), 
    ('41080', False), 
    ('41101', False), 
    ('41134', False)]} 
+0

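Thank you so much for the feedback! The complication is that this code will have to handle many more pollutants, and not every location has every pollutant, so adding the lists by hand would be difficult. But I think I can work out an approach using these comments. Thanks! –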

+0

@AmeliaMcClure Then your best option is to use a 'defaultdict', and the rest of the approach above should extend relatively straightforwardly. –
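One way the 'defaultdict' suggestion might look is sketched below. It is an illustration, not the commenter's code: the function name `flag_exceedances` and the choice to skip pollutants that have no test function are assumptions. `collections.defaultdict(list)` creates an empty list the first time a pollutant is seen, so the lists no longer need to be set up by hand for each location.

```python
from collections import defaultdict


def flag_exceedances(data, tests):
    """Return {pollutant: [(date, is_out_of_criteria), ...]}.

    Assumption: pollutants without an entry in `tests` are skipped.
    """
    over_dict = defaultdict(list)
    for _station, date, readings in data:
        for pollutant, value in readings.items():
            check = tests.get(pollutant)
            if check is None:
                continue  # no criterion known for this pollutant
            over_dict[pollutant].append((date, check(value)))
    return dict(over_dict)


tests = {
    'pH': lambda v: not (5 < v < 9),
    'Escherichia coli': lambda v: v > 320,
}

# A few rows taken from the question's example data.
sample = [
    ['12319825', '39274', {'pH': 8.1}],
    ['12319825', '39638', {'pH': 7.87, 'Escherichia coli': 25.0}],
    ['12319825', '41101', {'pH': 8.36, 'Escherichia coli': 560.0}],
]

result = flag_exceedances(sample, tests)
```

A location that never reports a given pollutant simply has no key for it in the result.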