2016-07-15 90 views

Setting time zones

I have a pandas DataFrame whose data changes frequently. It looks like this:

  date name time  timezone 
0 2016-08-01 aaa 0900  Asia/Tokyo 
1 2016-08-04 bbb 1200 Europe/Berlin 
2 2016-08-05 ccc 1400 Europe/London 

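The date, time and timezone columns describe a delivery date at a location that is often abroad, and name is the name of the client company.

The plan is to take this data and create a datetime_local column that carries the time zone given in the DataFrame's timezone column. I then want to add a datetime_london column that holds the same date and time, but expressed in London time.

I have got most of the way there, but when I call tz_localize I end up with ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(), which suggests I am not handling the column with the time zones correctly.

Any suggestions on how to proceed?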

import pandas as pd

mydf = pd.DataFrame(data={'date':['2016-08-01','2016-08-04','2016-08-05'],
          'time':['0900','1200','1400'],
          'timezone':['Asia/Tokyo','Europe/Berlin','Europe/London'],
          'name':['aaa','bbb','ccc']}
)
print(mydf)
# Combine the date and time strings, then parse them into naive datetimes
mydf["datetime"] = mydf["date"].map(str) + " " + mydf["time"]
mydf.datetime = pd.to_datetime(mydf.datetime)
mydf.index = mydf.datetime
print(mydf)
mydf["datetime_local"] = mydf.datetime
# This is the call that fails: tz_localize expects one time zone, not a Series of them
mydf.datetime_local.tz_localize(mydf.timezone)

Answers

import pandas as pd

def convert_to_local_time(row):
    # Parse the row's naive datetime string and attach that row's own time zone
    return pd.to_datetime(row.datetime).tz_localize(row.timezone)

def convert_to_london_time(row):
    # Express the already time-zone-aware local time in London time
    return pd.to_datetime(row.datetime_local).tz_convert('Europe/London')

mydf = pd.DataFrame(data={'date':['2016-08-01','2016-08-04','2016-08-05'],
          'time':['0900','1200','1400'],
          'timezone':['Asia/Tokyo','Europe/Berlin','Europe/London'],
          'name':['aaa','bbb','ccc']}
)
print(mydf)

Output:

  date name time  timezone 
0 2016-08-01 aaa 0900  Asia/Tokyo 
1 2016-08-04 bbb 1200 Europe/Berlin 
2 2016-08-05 ccc 1400 Europe/London 

Adding datetime_local

mydf["datetime"] = mydf["date"].map(str) + " " + mydf["time"] 
mydf['datetime_local'] = mydf.apply(convert_to_local_time, axis=1) 
print(mydf) 

Output:

  date name time  timezone   datetime \ 
0 2016-08-01 aaa 0900  Asia/Tokyo 2016-08-01 0900 
1 2016-08-04 bbb 1200 Europe/Berlin 2016-08-04 1200 
2 2016-08-05 ccc 1400 Europe/London 2016-08-05 1400 

       datetime_local 
0 2016-08-01 09:00:00+09:00 
1 2016-08-04 12:00:00+02:00 
2 2016-08-05 14:00:00+01:00 
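
To make the per-row step concrete, here is a minimal sketch for a single value, using the first row of the example data. It assumes an explicit format string for parsing (the answer above lets pandas infer it). tz_localize attaches a zone to a naive timestamp without changing the wall-clock time, while tz_convert, used in the next step, re-expresses the same instant in another zone.

```python
import pandas as pd

# Parse one naive timestamp; the explicit format is an assumption of this sketch
naive = pd.to_datetime("2016-08-01 0900", format="%Y-%m-%d %H%M")

# Attach a single time zone: the wall-clock time stays 09:00
local = naive.tz_localize("Asia/Tokyo")

# Re-express the same instant in London time: the wall-clock time changes
london = local.tz_convert("Europe/London")

print(local)
print(london)
```

Both values refer to the same instant, so they compare equal even though they print differently.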

Adding datetime_london

mydf['datetime_london'] = mydf.apply(convert_to_london_time, axis=1) 
print('After adding datetime_london:') 
print(mydf) 

Output:

  date name time  timezone   datetime \ 
0 2016-08-01 aaa 0900  Asia/Tokyo 2016-08-01 0900 
1 2016-08-04 bbb 1200 Europe/Berlin 2016-08-04 1200 
2 2016-08-05 ccc 1400 Europe/London 2016-08-05 1400 

       datetime_local   datetime_london 
0 2016-08-01 09:00:00+09:00 2016-08-01 01:00:00+01:00 
1 2016-08-04 12:00:00+02:00 2016-08-04 11:00:00+01:00 
2 2016-08-05 14:00:00+01:00 2016-08-05 14:00:00+01:00 
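
If the frame is large, calling apply row by row may be slow. One possible alternative, sketched here and not part of the answer above, is to localize all rows that share a time zone in one vectorized call and then stitch the pieces together. The names naive and parts are illustrative only.

```python
import pandas as pd

mydf = pd.DataFrame({
    "date": ["2016-08-01", "2016-08-04", "2016-08-05"],
    "time": ["0900", "1200", "1400"],
    "timezone": ["Asia/Tokyo", "Europe/Berlin", "Europe/London"],
    "name": ["aaa", "bbb", "ccc"],
})

# Naive datetimes for every row (explicit format is an assumption of this sketch)
naive = pd.to_datetime(mydf["date"] + " " + mydf["time"], format="%Y-%m-%d %H%M")

# For each distinct zone, localize its rows in one call and convert to London time
parts = [
    naive[mydf["timezone"] == tz].dt.tz_localize(tz).dt.tz_convert("Europe/London")
    for tz in mydf["timezone"].unique()
]

# Every piece now shares one zone, so the result is a single tz-aware column;
# the assignment aligns the values back to their rows by index
mydf["datetime_london"] = pd.concat(parts)
print(mydf[["name", "timezone", "datetime_london"]])
```

This works for datetime_london because every value ends up in the same zone. A datetime_local column with a different zone per row still has to hold individual Timestamp objects.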

Try this:

In [12]: mydf.apply(lambda x: x.datetime_local.tz_localize(x.timezone), axis=1) 
Out[12]: 
datetime 
2016-08-01 09:00:00 2016-08-01 09:00:00+09:00 
2016-08-04 12:00:00 2016-08-04 12:00:00+02:00 
2016-08-05 14:00:00 2016-08-05 14:00:00+01:00 
dtype: object
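
Note the dtype: object in the output above: a column whose values carry different time zones holds individual Timestamp objects and is not a native datetime column. As a hedged sketch (the Series below is built by hand to mirror the output, and the variable names are illustrative), normalizing the values to UTC first yields a proper time-zone-aware column that can then be converted to London time:

```python
import pandas as pd

# Hand-built stand-in for the mixed-time-zone result shown above
local = pd.Series([
    pd.Timestamp("2016-08-01 09:00", tz="Asia/Tokyo"),
    pd.Timestamp("2016-08-04 12:00", tz="Europe/Berlin"),
    pd.Timestamp("2016-08-05 14:00", tz="Europe/London"),
])
print(local.dtype)

# utc=True converts every tz-aware value to UTC, giving one uniform dtype,
# after which the .dt accessor can convert the whole column at once
london = pd.to_datetime(local, utc=True).dt.tz_convert("Europe/London")
print(london)
```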