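Keep the latest values and drop older rows (pandas)

I have a DataFrame that contains both new and old values. I want to drop all the old values while keeping the new ones.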

ID Name  Time Comment 
0  Foo 12:17:37 Rand 
1  Foo 12:17:37 Rand1 
2  Foo 08:20:00 Rand2 
3  Foo 08:20:00 Rand3 
4  Bar 09:01:00 Rand4 
5  Bar 09:01:00 Rand5 
6  Bar 08:50:50 Rand6 
7  Bar 08:50:00 Rand7

So the result should look like this:

ID Name  Time Comment 
0  Foo 12:17:37 Rand 
1  Foo 12:17:37 Rand1 
4  Bar 09:01:00 Rand4 
5  Bar 09:01:00 Rand5

I tried the code below, but it drops one new and one old value.

df[~df[['Time', 'Comment']].duplicated(keep='first')]

Can anyone provide a correct solution?

Source

2017-01-10 germanfox

I think you can use this solution with to_timedelta, if you need to filter by the maximum value of the Time column:

df.Time = pd.to_timedelta(df.Time) 
df = df[df.Time == df.Time.max()] 
print (df) 
   ID Name     Time Comment
0   0  Foo 12:17:37    Rand
1   1  Foo 12:17:37   Rand1
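
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The construction of `df` is an assumption based on the sample table in the question, and `latest` is just an illustrative name. Note that the filter uses the maximum of the whole column, so only the rows with the single largest time remain.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": list(range(8)),
    "Name": ["Foo"] * 4 + ["Bar"] * 4,
    "Time": ["12:17:37", "12:17:37", "08:20:00", "08:20:00",
             "09:01:00", "09:01:00", "08:50:50", "08:50:00"],
    "Comment": ["Rand", "Rand1", "Rand2", "Rand3",
                "Rand4", "Rand5", "Rand6", "Rand7"],
})

# Parse the HH:MM:SS strings so they compare as durations, not as text.
df["Time"] = pd.to_timedelta(df["Time"])

# Keep only the rows whose Time equals the maximum of the whole column.
latest = df[df["Time"] == df["Time"].max()]
print(latest)
```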

EDIT: The solution is similar, just add groupby:

# Parentheses let the method chain span several lines.
df = (df.groupby('Name', sort=False)
        .apply(lambda x: x[x.Time == x.Time.max()])
        .reset_index(drop=True))
print (df) 
   ID Name     Time Comment
0   0  Foo 12:17:37    Rand
1   1  Foo 12:17:37   Rand1
2   4  Bar 09:01:00   Rand4
3   5  Bar 09:01:00   Rand5
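
As a possible variant (not part of the answer above), the same per-group filter can be sketched with `groupby(...).transform('max')`, which avoids `apply` and keeps the original index. The sample `df` is rebuilt from the question's table, and `group_max` and `latest` are illustrative names.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": list(range(8)),
    "Name": ["Foo"] * 4 + ["Bar"] * 4,
    "Time": ["12:17:37", "12:17:37", "08:20:00", "08:20:00",
             "09:01:00", "09:01:00", "08:50:50", "08:50:00"],
    "Comment": ["Rand", "Rand1", "Rand2", "Rand3",
                "Rand4", "Rand5", "Rand6", "Rand7"],
})
df["Time"] = pd.to_timedelta(df["Time"])

# Broadcast each group's maximum Time back onto every row of that group.
group_max = df.groupby("Name")["Time"].transform("max")

# Keep the rows that sit at their own group's maximum.
latest = df[df["Time"] == group_max]
print(latest)
```

Because the boolean mask is aligned with `df`, the surviving rows keep their original index labels (0, 1, 4, 5); call `reset_index(drop=True)` if a fresh index is preferred.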

Source

2017-01-10 08:42:39 jezrael

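Can you edit the question, since comments don't format well? – jezrael

If the solution doesn't work, please try creating a [Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) with the desired output. – jezrael

Will do. By the way, this works, but it's not what I'm looking for. Let me update the question. – germanfox

You can merge each group's maximum back into the original DataFrame: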

df['Time'] = pd.to_timedelta(df['Time']) 

In [35]: pd.merge(df, df.groupby('Name', as_index=False)['Time'].max(), on=['Name','Time']) 
Out[35]: 
   ID Name     Time Comment
0   0  Foo 12:17:37    Rand
1   1  Foo 12:17:37   Rand1
2   4  Bar 09:01:00   Rand4
3   5  Bar 09:01:00   Rand5

Explanation:

In [36]: df.groupby('Name', as_index=False)['Time'].max() 
Out[36]: 
  Name     Time
0  Bar 09:01:00
1  Foo 12:17:37
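
The two steps can be put together in a runnable sketch. The sample `df` is an assumption rebuilt from the question's table, and `per_group_max` and `latest` are illustrative names. Since `pd.merge` defaults to an inner join, only rows whose (Name, Time) pair appears in the per-group maxima survive.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": list(range(8)),
    "Name": ["Foo"] * 4 + ["Bar"] * 4,
    "Time": ["12:17:37", "12:17:37", "08:20:00", "08:20:00",
             "09:01:00", "09:01:00", "08:50:50", "08:50:00"],
    "Comment": ["Rand", "Rand1", "Rand2", "Rand3",
                "Rand4", "Rand5", "Rand6", "Rand7"],
})
df["Time"] = pd.to_timedelta(df["Time"])

# Step 1: one row per Name holding that group's latest Time.
per_group_max = df.groupby("Name", as_index=False)["Time"].max()

# Step 2: inner join on both columns keeps only rows at their group's maximum.
latest = pd.merge(df, per_group_max, on=["Name", "Time"])
print(latest)
```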

Source

2017-01-10 08:59:51 MaxU
