You can simply use to_datetime:
print (df)
DateTime
0 3/1/2016 12:15:00 AM
1 3/1/2016 12:30:00 AM
2 3/1/2016 12:45:00 AM
3 3/1/2016 1:00:00 AM
4 3/1/2016 1:15:00 AM
5 3/1/2016 1:30:00 AM
6 3/1/2016 1:45:00 AM
7 3/1/2016 2:00:00 AM
8 3/1/2016 2:15:00 PM <- value changed to PM for better testing
df.DateTime = pd.to_datetime(df.DateTime)
print (df)
DateTime
0 2016-03-01 00:15:00
1 2016-03-01 00:30:00
2 2016-03-01 00:45:00
3 2016-03-01 01:00:00
4 2016-03-01 01:15:00
5 2016-03-01 01:30:00
6 2016-03-01 01:45:00
7 2016-03-01 02:00:00
8 2016-03-01 14:15:00
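For reference, here is a minimal self-contained sketch of the same conversion. The three sample rows are copied from the data above. The explicit format string is my assumption (month first, 12-hour clock with AM/PM), not something the question states; passing it avoids relying on format inference.

```python
import pandas as pd

# Small sample in the same layout as the question's column.
df = pd.DataFrame({
    "DateTime": [
        "3/1/2016 12:15:00 AM",
        "3/1/2016 1:00:00 AM",
        "3/1/2016 2:15:00 PM",
    ]
})

# Assumed format: month/day/year, 12-hour clock, AM/PM marker.
df["DateTime"] = pd.to_datetime(df["DateTime"], format="%m/%d/%Y %I:%M:%S %p")

print(df)
print(df["DateTime"].dt.hour.tolist())
```

If the real data mixes several layouts, dropping the format argument and letting to_datetime infer it, as in the answer, may be the more practical choice.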
EDIT:
In that case you need the parameter errors='coerce' to replace the problematic values with NaT:
print (df)
DateTime
0 3/1/2016 28:15:00 AM <- wrong date
1 3/1/2016 12:30:00 AM
2 3/1/2016 12:45:00 AM
3 3/1/2016 1:00:00 AM
4 3/1/2016 1:15:00 AM
5 3/1/2016 1:30:00 AM
6 3/1/2016 1:45:00 AM
7 3/1/2016 2:00:00 AM
8 3/1/2016 2:15:00 PM
df.DateTime = pd.to_datetime(df.DateTime, errors='coerce')
print (df)
DateTime
0 NaT
1 2016-03-01 00:30:00
2 2016-03-01 00:45:00
3 2016-03-01 01:00:00
4 2016-03-01 01:15:00
5 2016-03-01 01:30:00
6 2016-03-01 01:45:00
7 2016-03-01 02:00:00
8 2016-03-01 14:15:00
To check which values are problematic, use boolean indexing (on the original string column, before it is converted):
print (df[pd.to_datetime(df.DateTime, errors='coerce').isnull()])
DateTime
0 3/1/2016 28:15:00 AM
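The check can also be sketched as a standalone snippet that keeps the raw strings next to the parsed values, so the bad rows stay inspectable after conversion. The names raw, parsed and bad are only illustrative, and the explicit format string is again my assumption about the data.

```python
import pandas as pd

# Raw strings; the first row has an impossible hour (28).
raw = pd.Series(
    ["3/1/2016 28:15:00 AM", "3/1/2016 12:30:00 AM", "3/1/2016 2:15:00 PM"],
    name="DateTime",
)

# errors='coerce' turns anything that cannot be parsed into NaT.
parsed = pd.to_datetime(raw, format="%m/%d/%Y %I:%M:%S %p", errors="coerce")

# Boolean indexing on the raw strings: keep the rows that failed to parse.
bad = raw[parsed.isna()]

print(bad)
print(parsed)
```

Keeping raw and parsed as separate objects is a design choice here: once the column is overwritten with the coerced result, the original text of the bad rows is gone.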
Thanks, I tried that, but I get this error: ValueError: Unknown string format. – nish
Please check the edited answer. – jezrael
Checked. It works, thank you :) – nish