df.set_index() does not work with future dates in a column of datetime objects.

d = {'one':[datetime.datetime(3000, 6, 1, 0, 0), datetime.datetime(2016, 6, 1, 0, 0), datetime.datetime(2016, 7, 1, 0, 0), datetime.datetime(2016, 6, 1, 0, 0),], 'two':[1,2,3,4,5,6,7,8,9,10,11,12,13,14]} 

df = pd.DataFrame(d) 
print df 
df = df.set_index(['one']) 
print df 



ERROR: At 

df = df.set_index(['one']) 

ValueError: Unable to convert [datetime.datetime(3000, 6, 1, 0, 0) datetime.datetime(2016, 6, 1, 0, 0) datetime.datetime(2016, 7, 1, 0, 0) datetime.datetime(2016, 6, 1, 0, 0) datetime.datetime(2016, 7, 1, 0, 0) datetime.datetime(2016, 5, 1, 0, 0) datetime.datetime(2016, 5, 1, 0, 0) datetime.datetime(2016, 5, 1, 0, 0) datetime.datetime(2016, 5, 1, 0, 0) datetime.datetime(2016, 5, 1, 0, 0) datetime.datetime(2016, 5, 1, 0, 0) datetime.datetime(2016, 6, 1, 0, 0) datetime.datetime(2016, 2, 1, 0, 0) datetime.datetime(2016, 5, 1, 0, 0)] to datetime dtype

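However, it works fine when all the dates fall within the 2000s. set_index() only fails when the column contains a date far in the future.

I'm not sure what the problem is here. Can anyone help me?

Thanks in advance.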

Venkat

Source

2016-07-14 Venkat Reddy

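The code raised different exceptions for me (a SyntaxError, a ValueError: arrays must all be same length, and a pandas.tslib.OutOfBoundsDatetime: Out of bounds error), but I think the last one, OutOfBoundsDatetime, refers to the same problem you are seeing.

When a DataFrame is built from data containing date-like objects, the dates are converted to a NumPy dtype. For example,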

import datetime as DT 
import pandas as pd 

df = pd.DataFrame({'one':[DT.datetime(2000, 6, 1, 0, 0), DT.datetime(2016, 6, 1, 0, 0), DT.datetime(2016, 7, 1, 0, 0), DT.datetime(2016, 6, 1, 0, 0),], 'two':[1,2,3,4]}) 

print(df.info()) 
# <class 'pandas.core.frame.DataFrame'> 
# RangeIndex: 4 entries, 0 to 3 
# Data columns (total 2 columns): 
# one 4 non-null datetime64[ns] # <-- Notice the dtype 
# two 4 non-null int64 
# dtypes: datetime64[ns](1), int64(1) 
# memory usage: 144.0 bytes

Currently, datetime64[ns] is the only NumPy datetime64 data type supported by pandas. The range of dates this data type can represent is [1678 AD, 2262 AD]. So if a datetime.datetime object refers to a date outside this range, an exception is raised.
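To see which dates would trigger the problem, the bounds can be read from pd.Timestamp.min and pd.Timestamp.max. The following is only a minimal sketch: the helper name in_ns_range is made up for illustration, and it compares calendar dates rather than exact nanosecond limits, so it is slightly conservative on the two boundary days.

```python
import datetime as DT
import pandas as pd

# Bounds of the nanosecond-resolution Timestamp
# (roughly 1677-09-21 to 2262-04-11).
lo, hi = pd.Timestamp.min, pd.Timestamp.max
print(lo, hi)

def in_ns_range(d):
    """Hypothetical helper: True if the date `d` lies strictly between
    the boundary days of the datetime64[ns] range."""
    # Compare (year, month, day) tuples so that no out-of-range
    # Timestamp ever has to be constructed.
    return (lo.year, lo.month, lo.day) < (d.year, d.month, d.day) < (hi.year, hi.month, hi.day)

dates = [DT.datetime(3000, 6, 1), DT.datetime(2016, 6, 1)]
for d in dates:
    print(d, in_ns_range(d))
```

Here the year-3000 date is reported as out of range, which matches the value that breaks set_index() in the question.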

Source

2016-07-14 16:39:29 unutbu

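As mentioned in the pandas documentation, pandas Timestamp objects can only go up to the year 2262. However, the documentation also describes a way around this limitation.

The idea is that if you do not need the nanosecond resolution of the datetime64 dtype, you can use a PeriodIndex to achieve the desired result.

In your case, it looks like you might want something along the lines of: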

s = pd.Series([30000601, 20160601, 20160701, 20160501]) 
def conv(x): 
    return pd.Period(year = x // 10000, month = x//100 % 100, day = x%100, freq='D') 
span = pd.PeriodIndex(s.apply(conv)) 
df.index = span
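If the dates are already datetime.datetime objects, as in the question, the same idea might be sketched without the integer encoding. This assumes daily resolution is enough; the names dates and periods are illustrative, and the 'two' values are shortened to four rows so the lengths match.

```python
import datetime as DT
import pandas as pd

dates = [DT.datetime(3000, 6, 1), DT.datetime(2016, 6, 1),
         DT.datetime(2016, 7, 1), DT.datetime(2016, 6, 1)]

# Build daily Periods from year/month/day, so the values are never
# converted to datetime64[ns] and the year 3000 is not a problem.
periods = pd.PeriodIndex(
    [pd.Period(year=d.year, month=d.month, day=d.day, freq='D') for d in dates]
)

# Use the PeriodIndex directly as the index instead of calling set_index()
# on a datetime column.
df = pd.DataFrame({'two': [1, 2, 3, 4]}, index=periods)
print(df)
```

The trade-off is that the index holds Period objects rather than Timestamps, so code that expects a DatetimeIndex may need adjusting.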

Source

2016-07-14 16:53:17

I finally got it working.

s = pd.Series([30000601, 20160601, 20160701, 20160501]) 
def conv(x): 
    return pd.Period(year = x // 10000, month = x//100 % 100, day = x%100,  freq='D') 
span = pd.PeriodIndex(s.apply(conv)) 
df.index = span

Thanks for your help.

Source

2017-03-24 15:28:42
