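You can write a function that converts your date string format, then apply it to the column to convert it to datetimes. The function can return either timezone-aware or naive timestamps.
Code: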
import datetime as dt
import pytz

def convert_to_datetime(tz=None):
    """ Convert our custom timezone representation to a datetime

    Timestamp looks like: 2012-05-02 01:00:00-05:00

    :param tz: None, returns UTC relative Naive
               True, returns timezone aware timestamp in UTC
               <tz>, returns timezone aware timestamp in given timezone
    :return: returns a processing function that can be passed to apply()
    """
    def func(datetime_string):
        # first 19 characters are the wall-clock time, the rest is the offset
        time = datetime_string[:19]
        tz_str = datetime_string[19:]

        # parse the timezone offset ('-05:00') to signed minutes (-300)
        tz_offset = int(
            tz_str[0] + str(int(tz_str[1:3]) * 60 + int(tz_str[4:])))

        # subtract the offset to get a naive datetime in UTC
        result = dt.datetime.strptime(time, '%Y-%m-%d %H:%M:%S') - \
            dt.timedelta(minutes=tz_offset)

        if tz is not None:
            # mark the value as UTC, then convert if a timezone was given
            result = result.replace(tzinfo=pytz.UTC)
            if tz is not True:
                result = result.astimezone(tz)

        return result

    return func
Test code:
import pandas as pd

df = pd.DataFrame([
    '2012-05-02 01:00:00-05:00',
    '2012-05-02 03:00:00-05:00'],
    columns=['timestamp'])

df['zulu_no_tz'] = df.timestamp.apply(convert_to_datetime())
df['utc_tz'] = df.timestamp.apply(convert_to_datetime(tz=True))
df['local_tz'] = df.timestamp.apply(convert_to_datetime(
    tz=pytz.timezone('US/Central')))

print(df)
Test results:
                   timestamp          zulu_no_tz                    utc_tz  \
0  2012-05-02 01:00:00-05:00 2012-05-02 06:00:00 2012-05-02 06:00:00+00:00
1  2012-05-02 03:00:00-05:00 2012-05-02 08:00:00 2012-05-02 08:00:00+00:00

                    local_tz
0  2012-05-02 01:00:00-05:00
1  2012-05-02 03:00:00-05:00
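To see why the naive column reads 06:00 for the first row, the offset arithmetic can be traced by hand. The following is only a small sketch using the standard library; the variable names are made up for illustration and mirror the slicing described above.

```python
import datetime as dt

stamp = '2012-05-02 01:00:00-05:00'

# Split into the wall-clock part and the offset part.
time_part, tz_part = stamp[:19], stamp[19:]  # '2012-05-02 01:00:00', '-05:00'

# Signed offset in minutes: '-' + str(5 * 60 + 0) gives '-300'.
offset_minutes = int(tz_part[0] + str(int(tz_part[1:3]) * 60 + int(tz_part[4:])))

# Subtracting a negative offset moves the local time forward to UTC.
utc_naive = (dt.datetime.strptime(time_part, '%Y-%m-%d %H:%M:%S')
             - dt.timedelta(minutes=offset_minutes))

print(offset_minutes, utc_naive)
```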
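Using dateutil:
If you have access to dateutil, you can use its parsing code instead. This is a drop-in replacement for the func above, and it handles the date format well.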
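import dateutil.parser

# replaces the inner func; tz still comes from the enclosing convert_to_datetime()
def func(datetime_string):
    # parse the string, offset included, and normalize to UTC
    result = dateutil.parser.parse(datetime_string).astimezone(pytz.UTC)
    if tz is None:
        # drop tzinfo to get a naive UTC timestamp
        result = result.replace(tzinfo=None)
    elif tz is not True:
        result = result.astimezone(tz)
    return result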
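Because that func relies on tz from the enclosing function, it may help to see it wrapped into a complete factory. This is only a sketch: the name convert_with_dateutil is hypothetical, and it uses datetime.timezone.utc in place of pytz.UTC so that python-dateutil is the only third-party import.

```python
import datetime as dt

import dateutil.parser  # third-party: python-dateutil


def convert_with_dateutil(tz=None):
    """Hypothetical factory with the same tz contract as convert_to_datetime."""
    def func(datetime_string):
        # Parse the string (offset included) and normalize to UTC.
        result = dateutil.parser.parse(datetime_string).astimezone(dt.timezone.utc)
        if tz is None:
            # Drop tzinfo to get a naive UTC timestamp.
            result = result.replace(tzinfo=None)
        elif tz is not True:
            # Convert to the requested timezone.
            result = result.astimezone(tz)
        return result
    return func


naive_utc = convert_with_dateutil()('2012-05-02 01:00:00-05:00')
aware_utc = convert_with_dateutil(tz=True)('2012-05-02 01:00:00-05:00')
print(naive_utc, aware_utc)
```

The returned function can be passed to apply() in the same way as before, for example df.timestamp.apply(convert_with_dateutil(tz=True)).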
You can also pass dateutil.parser.parse directly to apply(), as in:
import dateutil.parser

df.timestamp.apply(dateutil.parser.parse)
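I am not a big fan of this style, because it produces timestamps with a fixed-offset timezone, which means they are not daylight-saving-time aware. I personally prefer either a DST-aware timezone or plain UTC.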
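A short illustration of the difference, as a sketch only: it uses the standard-library zoneinfo module (Python 3.9+) rather than pytz, and assumes the system time zone database is available and that 'America/Chicago' is the zone behind US/Central.

```python
import datetime as dt
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# A parsed value that carries only a fixed -05:00 offset.
fixed = dt.datetime.fromisoformat('2012-05-02 01:00:00-05:00')

# The same instant attached to a real zone, which knows the DST rules.
aware = fixed.astimezone(ZoneInfo('America/Chicago'))

# Move both 200 days ahead, to 2012-11-18, after DST has ended.
delta = dt.timedelta(days=200)
fixed_later = fixed + delta
aware_later = aware + delta

print(fixed_later.utcoffset())  # the fixed offset is still five hours behind UTC
print(aware_later.utcoffset())  # the zone has switched to six hours behind UTC
```

The fixed-offset value keeps reporting -05:00 in winter, while the zone-aware value follows the switch back to standard time.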