I would try to stick to existing NumPy/SciPy functions, as they are extremely fast and well optimized (numpy.hypot):
df['wspeed'] = np.hypot(df.latwind, df.lonwind)
Timing against a 300K-row DF:
In [47]: df = pd.concat([df] * 10**5, ignore_index=True)
In [48]: df.shape
Out[48]: (300000, 2)
In [49]: %paste
def wind_speed(u, v):
    return np.sqrt(u ** 2 + v ** 2)
## -- End pasted text --
In [50]: %timeit list(map(wind_speed, df['lonwind'], df['latwind']))
1 loop, best of 3: 922 ms per loop
In [51]: %timeit np.hypot(df.latwind, df.lonwind)
100 loops, best of 3: 4.08 ms per loop
Conclusion: the vectorized approach is roughly 230 times faster (922 ms vs. 4.08 ms).
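If you want to rerun the comparison outside IPython, something like the following should work. It is only a rough sketch: it assumes the three-row frame from the demo as input, and `wind_speed_scalar`, `t_loop`, `t_vec` and `results_match` are names I made up for illustration. Absolute timings will differ on your machine.

```python
import time

import numpy as np
import pandas as pd

# Assumption: the three-row demo frame, repeated to 300,000 rows
df = pd.DataFrame({"latwind": [4, 5, 6], "lonwind": [1, 2, 3]})
df = pd.concat([df] * 10**5, ignore_index=True)


def wind_speed_scalar(u, v):
    # called once per row with scalar arguments
    return np.sqrt(u ** 2 + v ** 2)


# element-by-element version
start = time.perf_counter()
looped = list(map(wind_speed_scalar, df["lonwind"], df["latwind"]))
t_loop = time.perf_counter() - start

# vectorized version: a single call on whole columns
start = time.perf_counter()
vectorized = np.hypot(df["latwind"], df["lonwind"])
t_vec = time.perf_counter() - start

# both approaches should produce the same values
results_match = np.allclose(looped, vectorized.to_numpy())

print(f"map: {t_loop:.4f} s, np.hypot: {t_vec:.4f} s, same result: {results_match}")
```

A single `perf_counter` measurement is much noisier than `%timeit`, so treat the printed numbers as a ballpark only.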
If you write your own function, try to use vectorized math (working with vectors/columns instead of scalars):
def wind_speed(u, v):
    # using vectorized approach - column math instead of scalar math
    return np.sqrt(u * u + v * v)
df['wspeed'] = wind_speed(df['lonwind'] , df['latwind'])
Demo:
In [39]: df['wspeed'] = wind_speed(df['lonwind'] , df['latwind'])
In [40]: df
Out[40]:
latwind lonwind wspeed
0 4 1 4.123106
1 5 2 5.385165
2 6 3 6.708204
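To reproduce the demo from scratch, here is a self-contained sketch. The input frame is reconstructed from the output shown above, and `agrees_with_hypot` is just an illustrative name for a sanity check against np.hypot.

```python
import numpy as np
import pandas as pd

# input values reconstructed from the demo output
df = pd.DataFrame({"latwind": [4, 5, 6], "lonwind": [1, 2, 3]})


def wind_speed(u, v):
    # u and v are whole columns (Series), so this runs once, not per row
    return np.sqrt(u * u + v * v)


df["wspeed"] = wind_speed(df["lonwind"], df["latwind"])

# sanity check: the hand-written version should match np.hypot
agrees_with_hypot = np.allclose(df["wspeed"], np.hypot(df["latwind"], df["lonwind"]))

print(df)
print(agrees_with_hypot)
```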
The same vectorized approach works for the celsius() function:
def celsius(T):
    # using vectorized function: np.round()
    return np.round(T - 273, 1)
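A quick sketch of how that could be applied to a column. The column names `temp_K` and `temp_C` and the sample temperatures are made up for illustration; the function itself is the one above, with its 273 offset kept as is.

```python
import numpy as np
import pandas as pd


def celsius(T):
    # np.round() works on a whole column at once
    return np.round(T - 273, 1)


# hypothetical temperatures in Kelvin
temps = pd.DataFrame({"temp_K": [273.0, 293.4, 310.26]})

# one call on the whole column, no map() or apply() needed
temps["temp_C"] = celsius(temps["temp_K"])

print(temps)
```

The function body never loops over rows; subtraction and np.round() both operate on the entire Series.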
But does the 'map' function change? – Hugo