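Producing both the maximum and minimum in a single MapReduce job
I have just started using the MRJob library to write MapReduce programs in Python.
One example demonstrated in a video tutorial finds the maximum temperature by location_id. Writing a second program that finds the minimum temperature by location_id is just as straightforward.
I was wondering whether there is a way to produce both the maximum and minimum temperature by location_id in a single MapReduce program. Here is my attempt: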
from mrjob.job import MRJob
'''Sample Data
ITE00100554,18000101,TMAX,-75,,,E,
ITE00100554,18000101,TMIN,-148,,,E,
GM000010962,18000101,PRCP,0,,,E,
EZE00100082,18000101,TMAX,-86,,,E,
EZE00100082,18000101,TMIN,-135,,,E,
ITE00100554,18000102,TMAX,-60,,I,E,
ITE00100554,18000102,TMIN,-125,,,E,
GM000010962,18000102,PRCP,0,,,E,
EZE00100082,18000102,TMAX,-44,,,E,
Output I am expecting to see:
ITE00100554 32.3 20.2
EZE00100082 34.4 19.6
'''
class MaxMinTemperature(MRJob):
    def mapper(self, _, line):
        location, datetime, measure, temperature, w, x, y, z = line.split(',')
        temperature = float(temperature)/10
        if measure == 'TMAX' or measure == 'TMIN':
            yield location, temperature
    def reducer(self, location, temperatures):
        yield location, max(temperatures), min(temperatures)
if __name__ == '__main__':
    MaxMinTemperature.run()
I get the following error:
File "MaxMinTemperature.py", line 12, in reducer
yield location, max(temperatures), min(temperatures)
ValueError: min() arg is an empty sequence
Is this possible?
Thanks for your help.
Shiv
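For context, the error most likely comes from the fact that the values passed to a reducer arrive as a one-shot iterator: max() consumes all of it, so min() has nothing left to read. Below is a minimal sketch of one possible workaround, not necessarily the solution referred to in the comments. The helper name max_min and the standalone reducer function are hypothetical stand-ins used for illustration (the real reducer would be a method on the MRJob subclass), and the temperatures are the ITE00100554 sample values divided by 10.

```python
def max_min(values):
    """Return (maximum, minimum) of an iterable in a single pass."""
    highest = lowest = None
    for v in values:
        if highest is None or v > highest:
            highest = v
        if lowest is None or v < lowest:
            lowest = v
    return highest, lowest


def reducer(location, temperatures):
    # Hypothetical stand-in for the MRJob reducer method (no `self`).
    # It yields a single key/value pair, with both results packed in the value.
    yield location, max_min(temperatures)


# Reproduce the problem: a generator can only be consumed once.
temps = (t for t in [-7.5, -14.8, -6.0, -12.5])
first = max(temps)      # consumes the whole generator
raised = False
try:
    min(temps)          # nothing is left to iterate over
except ValueError:
    raised = True

# Single-pass version applied to the same sample values.
result = list(reducer("ITE00100554", (t for t in [-7.5, -14.8, -6.0, -12.5])))
```

In the job itself, the same idea would amount to replacing the body of reducer with a single yield of location and max_min(temperatures). Materializing the iterator with list(temperatures) before calling max() and min() would also work, at the cost of holding all values for a key in memory. Note that the reducer yields a two-element key/value pair rather than three separate items, since that is the shape mrjob expects.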
Thank you @AleksandrBorisov. I understand what your solution is doing!