2013-12-19

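Convert 'yyyy-mm-dd hh:mm:ss' date strings to integers (pandas, Python)

I need to convert the difference between two strings in the format yyyy-mm-dd hh:mm:ss (each representing a datetime) to an integer. Since I want to do this on a DataFrame object (built with pandas), I need a built-in function that does something like this: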

data['difference'] = somefunc(data['date1'],data['date2']) 

Does such a function exist? If I write my own function, how do I apply it to DataFrame columns?

Thanks in advance!

Answers


This requires numpy >= 1.7. The session below is from pandas 0.13 (to be released soon). See the docs here.

In [3]: df = DataFrame(dict(A = Timestamp('20130101'), B = Timestamp('20130101')+ pd.to_timedelta(list(range(5)),unit='D'))) 

In [4]: df 
Out[4]: 
                    A                   B
0 2013-01-01 00:00:00 2013-01-01 00:00:00
1 2013-01-01 00:00:00 2013-01-02 00:00:00
2 2013-01-01 00:00:00 2013-01-03 00:00:00
3 2013-01-01 00:00:00 2013-01-04 00:00:00
4 2013-01-01 00:00:00 2013-01-05 00:00:00

[5 rows x 2 columns] 

In [5]: df.dtypes 
Out[5]: 
A    datetime64[ns]
B    datetime64[ns]
dtype: object 

In [6]: df['C'] = df['B']-df['A'] 

In [7]: df 
Out[7]: 
                    A                   B                C
0 2013-01-01 00:00:00 2013-01-01 00:00:00         00:00:00
1 2013-01-01 00:00:00 2013-01-02 00:00:00 1 days, 00:00:00
2 2013-01-01 00:00:00 2013-01-03 00:00:00 2 days, 00:00:00
3 2013-01-01 00:00:00 2013-01-04 00:00:00 3 days, 00:00:00
4 2013-01-01 00:00:00 2013-01-05 00:00:00 4 days, 00:00:00

[5 rows x 3 columns] 

In [8]: df.dtypes 
Out[8]: 
A     datetime64[ns]
B     datetime64[ns]
C    timedelta64[ns]
dtype: object 

In [9]: df['C'].astype('timedelta64[s]') 
Out[9]: 
0         0
1     86400
2    172800
3    259200
4    345600
Name: C, dtype: float64 
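
The question starts from strings rather than Timestamps, so a minimal sketch of the full path from strings to integer seconds might look like the following. It assumes a more recent pandas than the 0.13 discussed here (one where the `.dt` accessor is available). The column names `date1`, `date2` and `difference` follow the question; the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data in the 'yyyy-mm-dd hh:mm:ss' format from the question.
data = pd.DataFrame({
    "date1": ["2013-01-01 00:00:00", "2013-01-01 12:00:00"],
    "date2": ["2013-01-02 00:00:00", "2013-01-01 12:00:30"],
})

# Parse the strings into datetime64 columns.
fmt = "%Y-%m-%d %H:%M:%S"
start = pd.to_datetime(data["date1"], format=fmt)
end = pd.to_datetime(data["date2"], format=fmt)

# Subtracting two datetime columns gives a timedelta column;
# total_seconds() returns floats, which are cast to integers here.
data["difference"] = (end - start).dt.total_seconds().astype("int64")

print(data["difference"].tolist())
```

Casting to `int64` truncates any fractional seconds, which does not matter for inputs with whole-second resolution like these.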

In 0.12, you can do this instead:

In [1]: df = DataFrame(dict(A = Timestamp('20130101'), B = [Timestamp('20130101')+timedelta(days=i) for i in range(5) ])) 

In [2]: df['C'] = df['B']-df['A'] 

In [3]: Series(df['C'].values/np.timedelta64(1,'s')) 
Out[3]: 
0         0
1     86400
2    172800
3    259200
4    345600
dtype: float64 
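
As for the second part of the question (applying a hand-written function to DataFrame columns), one possible sketch uses `DataFrame.apply` with `axis=1`, which calls the function once per row. The helper `seconds_between` and the sample data are hypothetical, introduced only for illustration. This is typically much slower than the column-wise subtraction shown above, so it is probably worth using only when no vectorized operation fits.

```python
from datetime import datetime

import pandas as pd

FMT = "%Y-%m-%d %H:%M:%S"


def seconds_between(first, second):
    """Hypothetical helper: whole seconds elapsed from `first` to `second`."""
    delta = datetime.strptime(second, FMT) - datetime.strptime(first, FMT)
    return int(delta.total_seconds())


# Hypothetical sample data using the column names from the question.
data = pd.DataFrame({
    "date1": ["2013-01-01 00:00:00", "2013-01-01 12:00:00"],
    "date2": ["2013-01-02 00:00:00", "2013-01-01 12:00:30"],
})

# axis=1 passes each row to the lambda, which forwards the two string cells.
data["difference"] = data.apply(
    lambda row: seconds_between(row["date1"], row["date2"]), axis=1
)

print(data["difference"].tolist())
```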