2013-07-20 115 views
2

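Prevent pandas read_csv from truncating full timestamps

I am using pandas 0.11 on Mac OS X. I am trying to import a CSV file with pandas' read_csv, where one of the columns is a full timestamp with values like these: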

fullts 
1374087067.357464 
1374087067.256206 
1374087067.158231 
1374087067.074162 

I am interested in the time differences between consecutive timestamps, so I import the column with an explicit dtype:

data = read_csv(fn, dtype={'fullts': float64}) 

However, pandas seems to truncate the numbers to their integer part:

data.fullts.head(4) 

which yields:

1374087067 
1374087067 
1374087067 
1374087067 

Any suggestions?

Thanks!

Added: I tried the suggestion to use pd.to_datetime and got this error:

--------------------------------------------------------------------------- 
TypeError         Traceback (most recent call last) 
<ipython-input-8-37ed0da45608> in <module>() 
---> 1 pd.to_datetime(sd1.fullts) 

/Users/user/anaconda/lib/python2.7/site-packages/pandas-0.11.0-py2.7-macosx-10.5-x86_64.egg/pandas/tseries/tools.pyc in to_datetime(arg, errors, dayfirst, utc, box, format) 
    102   values = arg.values 
    103   if not com.is_datetime64_dtype(values): 
--> 104    values = _convert_f(values) 
    105   return Series(values, index=arg.index, name=arg.name) 
    106  elif isinstance(arg, (np.ndarray, list)): 

/Users/user/anaconda/lib/python2.7/site-packages/pandas-0.11.0-py2.7-macosx-10.5-x86_64.egg/pandas/tseries/tools.pyc in _convert_f(arg) 
    84    else: 
    85     result = tslib.array_to_datetime(arg, raise_=errors == 'raise', 
---> 86             utc=utc, dayfirst=dayfirst) 
    87    if com.is_datetime64_dtype(result) and box: 
    88     result = DatetimeIndex(result, tz='utc' if utc else None) 
/Users/user/anaconda/lib/python2.7/site-packages/pandas-0.11.0-py2.7-macosx-10.5-x86_64.egg/pandas/tslib.so in pandas.tslib.array_to_datetime (pandas/tslib.c:15411)() 

TypeError: object of type 'float' has no len() 
+1

It works fine on my computer. Which pandas version are you using? – waitingkuo

+0

I am using pandas 0.11 with Anaconda Python on Mac OS X Lion – Fra

Answers

2

You don't need to specify the dtype when reading from the CSV (it should default to float64).
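
A minimal sketch of that default, assuming a recent pandas on Python 3 rather than the 0.11 / Python 2.7 setup from the question. The in-memory `csv_text` stands in for the asker's file and is not part of the original post. It suggests that the fractional seconds are still stored even when the printed column hides them:

```python
import io

import pandas as pd

# Stand-in for the asker's CSV file (assumption: a single "fullts" column).
csv_text = (
    "fullts\n"
    "1374087067.357464\n"
    "1374087067.256206\n"
    "1374087067.158231\n"
    "1374087067.074162\n"
)

# No dtype argument: the column is inferred as float64.
df = pd.read_csv(io.StringIO(csv_text))
print(df["fullts"].dtype)

# Differences between consecutive rows, in seconds. The fractional part
# survives even though the default repr may only show 1.374087e+09.
deltas = df["fullts"].diff()
print(deltas)
```

A float64 at this magnitude resolves to roughly a few tenths of a microsecond, so the differences are approximate at the last printed digit.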

In pandas 0.12 you can use the unit argument of to_datetime to convert integers or floats (epoch times) into pandas Timestamps, and so convert the column:

In [11]: df 
Out[11]: 
     fullts 
0 1.374087e+09 
1 1.374087e+09 
2 1.374087e+09 
3 1.374087e+09 

In [12]: pd.to_datetime(df.fullts) # default unit is ns 
Out[12]: 
0 1970-01-01 00:00:01.374087067 
1 1970-01-01 00:00:01.374087067 
2 1970-01-01 00:00:01.374087067 
3 1970-01-01 00:00:01.374087067 
Name: fullts, dtype: datetime64[ns] 

In [13]: pd.to_datetime(df.fullts, unit='s') 
Out[13]: 
0 2013-07-17 18:51:07.357464 
1 2013-07-17 18:51:07.256206 
2 2013-07-17 18:51:07.158231 
3 2013-07-17 18:51:07.074162 
Name: fullts, dtype: datetime64[ns] 

As the docstring states:

unit : unit of the arg (D,s,ms,us,ns) denote the unit in epoch
              (e.g. a unix timestamp), which is an integer/float number

+0

Hey Andy, thanks for your answer. Do you think the problem could be that I have pandas 0.11 installed? – Fra

+0

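@Fra In 0.11 this doesn't accept float values (but I think it does accept integers, so either multiply to get an integer number of nanoseconds or wait for 0.12 to come out :) ) –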
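
A rough sketch of the "multiply to get integer nanoseconds" workaround mentioned in this comment. It is written against a recent pandas, so whether it behaves the same on 0.11 is an assumption; the variable names are illustrative:

```python
import pandas as pd

# Float epoch seconds, as read from the CSV.
seconds = pd.Series([1374087067.357464, 1374087067.256206], name="fullts")

# Scale to nanoseconds and store as int64. The values (~1.37e18) fit in
# int64, but float rounding limits accuracy to a few hundred nanoseconds.
nanos = (seconds * 1e9).round().astype("int64")

# Integers are interpreted as nanoseconds since the epoch by default.
ts = pd.to_datetime(nanos)
print(ts)
```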

+0

Pandas 0.12 is out, and it works great. Thanks – Fra