2017-01-02

Fast data filling based on timestamps in R

Happy New Year, everyone.

Now to my question:

I have two datasets.

Dataset 1

    Time   Name   Value
    6/1/2016 9:39 ABCD IS Equity 11.01 
    6/1/2016 9:44 ABCD IS Equity 11.05 
    6/1/2016 9:46 ABCD IS Equity 11.01 
    6/1/2016 9:58 ABCD IS Equity 11.01 
    6/1/2016 10:10 ABCD IS Equity 11.01 
    6/1/2016 10:13 ABCD IS Equity 11.01 
    6/1/2016 10:33 ABCD IS Equity 11.02 
    6/1/2016 10:42 ABCD IS Equity 11.02 
    6/1/2016 10:52 ABCD IS Equity 11.02 
    6/1/2016 10:56 ABCD IS Equity 11.06 
    6/1/2016 11:14 ABCD IS Equity 11.02 
    6/1/2016 11:25 ABCD IS Equity 11.03 
    6/1/2016 11:26 ABCD IS Equity 11.03 
    6/1/2016 11:29 ABCD IS Equity 11.03 
    6/1/2016 11:30 ABCD IS Equity 11.03 
    6/1/2016 11:40 ABCD IS Equity 11.03 
    6/1/2016 11:40 ABCD IS Equity 11.01 
    6/1/2016 11:44 ABCD IS Equity 11.01 
    6/1/2016 12:04 ABCD IS Equity 11.01 

Dataset 2

Time2   Name2   Value2 
6/1/2016 9:42 ABCD IS Equity 123 
6/1/2016 9:45 ABCD IS Equity 124 
6/1/2016 9:45 ABCD IS Equity 125 
6/1/2016 10:00 ABCD IS Equity 126 
6/1/2016 10:14 ABCD IS Equity 127 
6/1/2016 10:14 ABCD IS Equity 128 
6/1/2016 10:14 ABCD IS Equity 129 
6/1/2016 10:41 ABCD IS Equity 130 
6/1/2016 10:45 ABCD IS Equity 131 
6/1/2016 10:56 ABCD IS Equity 132 
6/1/2016 10:58 ABCD IS Equity 133 
6/1/2016 11:26 ABCD IS Equity 134 
6/1/2016 11:27 ABCD IS Equity 135 
6/1/2016 11:30 ABCD IS Equity 136 
6/1/2016 11:32 ABCD IS Equity 137 
6/1/2016 11:40 ABCD IS Equity 138 
6/1/2016 11:42 ABCD IS Equity 139 
6/1/2016 11:45 ABCD IS Equity 140 
6/1/2016 12:05 ABCD IS Equity 141 

Now I want to create a column New in Dataset 1, filled with values of Value2 from Dataset 2, based on the condition Dataset2$Time2 > Dataset1$Time for each row of Dataset 1.

Below is the sample output, where New is taken from Value2:

Time   Name   Value New 
6/1/2016 9:39 ABCD IS Equity 11.01 123 
6/1/2016 9:44 ABCD IS Equity 11.05 124 
6/1/2016 9:46 ABCD IS Equity 11.01 126 
6/1/2016 9:58 ABCD IS Equity 11.01 126 
6/1/2016 10:10 ABCD IS Equity 11.01 127 
6/1/2016 10:13 ABCD IS Equity 11.01 127 
6/1/2016 10:33 ABCD IS Equity 11.02 130 
6/1/2016 10:42 ABCD IS Equity 11.02 131 
6/1/2016 10:52 ABCD IS Equity 11.02 132 
6/1/2016 10:56 ABCD IS Equity 11.06 133 
6/1/2016 11:14 ABCD IS Equity 11.02 134 
6/1/2016 11:25 ABCD IS Equity 11.03 134 
6/1/2016 11:26 ABCD IS Equity 11.03 135 
6/1/2016 11:29 ABCD IS Equity 11.03 136 
6/1/2016 11:30 ABCD IS Equity 11.03 137 
6/1/2016 11:40 ABCD IS Equity 11.03 139 
6/1/2016 11:40 ABCD IS Equity 11.01 139 
6/1/2016 11:44 ABCD IS Equity 11.01 140 
6/1/2016 12:04 ABCD IS Equity 11.01 141 

Depending on the matching condition, different rows of Dataset 1 may be filled with the same value.

Solution I have tried:

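I tried a simple for loop over 1:nrow(Dataset1) to match each row against Dataset2. But I have a large dataset and this takes a very long time. I am looking for a faster way, one that avoids the for loop.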
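For reference, the row-by-row loop described above might look roughly like the sketch below. The data frames df1 and df2 are small stand-ins for Dataset 1 and Dataset 2 (a few rows copied from the tables above), and the timestamp format string is an assumption about how the Time columns are stored. Each iteration scans all of Dataset 2, so the cost grows with nrow(df1) * nrow(df2), which is why it becomes slow on large data.

```r
# Small stand-ins for Dataset 1 and Dataset 2 (a few rows only).
fmt <- "%m/%d/%Y %H:%M"  # assumed timestamp format
df1 <- data.frame(
  Time  = as.POSIXct(c("6/1/2016 9:39", "6/1/2016 9:44", "6/1/2016 9:46"),
                     format = fmt, tz = "UTC"),
  Name  = "ABCD IS Equity",
  Value = c(11.01, 11.05, 11.01)
)
df2 <- data.frame(
  Time2  = as.POSIXct(c("6/1/2016 9:42", "6/1/2016 9:45", "6/1/2016 9:45",
                        "6/1/2016 10:00"), format = fmt, tz = "UTC"),
  Name2  = "ABCD IS Equity",
  Value2 = c(123, 124, 125, 126)
)

# Row-by-row version: for each row of df1, take the first Value2 whose
# Time2 is strictly later than that row's Time (NA if there is none).
# "First" means earliest here only because df2 is already sorted by Time2.
df1$New <- NA_real_
for (i in seq_len(nrow(df1))) {
  later <- which(df2$Time2 > df1$Time[i] & df2$Name2 == df1$Name[i])
  if (length(later) > 0) df1$New[i] <- df2$Value2[later[1]]
}
df1$New
```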

Any help or suggestions would be greatly appreciated.

+0

We can use 'data.table', i.e. 'setDT(DF1)[df2, Value := Value2, on = .(Name, Time2 > Time1)]' – akrun

+0

Say I have another common column named 'Zone'. Would that be '.(Name, Zone, Time2 > Time1)'? – Zico

+0

Yes, you can do that – akrun

Answers

1

One possible option is findInterval from base R

df1$New <- df2$Value2[findInterval(df1$Time, df2$Time2) + 1]

Note: we assume that 'Time' and 'Time2' are of class POSIXct
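A quick check of why adding 1 works: for a sorted vector, findInterval(x, vec) returns how many elements of vec are less than or equal to x, so adding 1 gives the position of the first element strictly greater than x. Below is a minimal sketch on a few rows from the tables above. The variable names (time1, time2, value2) are made up for illustration, the timestamp format is an assumption, and the 12:30 entry is an invented row that shows what happens when no later observation exists.

```r
fmt <- "%m/%d/%Y %H:%M"  # assumed timestamp format
to_time <- function(x) as.POSIXct(x, format = fmt, tz = "UTC")

time1  <- to_time(c("6/1/2016 9:44", "6/1/2016 9:46", "6/1/2016 10:56",
                    "6/1/2016 12:30"))   # last entry is made up
time2  <- to_time(c("6/1/2016 9:42", "6/1/2016 9:45", "6/1/2016 9:45",
                    "6/1/2016 10:00", "6/1/2016 10:56", "6/1/2016 10:58"))
value2 <- c(123, 124, 125, 126, 132, 133)

# Number of time2 entries <= each time1 entry (time2 must be sorted).
n_le <- findInterval(as.numeric(time1), as.numeric(time2))

# Position n_le + 1 is the first time2 strictly after time1; an index past
# the end of value2 yields NA, meaning there is no later observation.
new <- value2[n_le + 1]
new
```

Because ties count as "less than or equal", a Time of 10:56 skips the 10:56 entry in Dataset 2 and picks up the 10:58 one, which matches the strict condition Time2 > Time in the question.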

+1

The 'timestamp' needs to be sorted before applying 'findInterval'. Works fine. Truly a great solution. Thanks. – Zico
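Following up on the sorting point, and on the earlier question about extra key columns such as Zone: one way to handle both in base R is to sort Dataset 2 within each key group and apply findInterval group by group. This is only a sketch, and it uses base R rather than the data.table join from the comments. The helper name fill_next and its arguments are hypothetical, the example data is made up, and the time columns are assumed to be POSIXct (or numeric).

```r
# Hypothetical helper: for every row of df1, look up the first Value2 in df2
# (within the same key group) whose Time2 is strictly later than Time.
# keys1/keys2 name the matching key columns in df1 and df2, in the same order,
# e.g. keys1 = c("Name", "Zone") if both tables carry a Zone column.
fill_next <- function(df1, df2, keys1 = "Name", keys2 = "Name2") {
  g1 <- as.character(interaction(df1[keys1], drop = TRUE))
  g2 <- as.character(interaction(df2[keys2], drop = TRUE))
  df1$New <- rep(NA_real_, nrow(df1))
  for (g in unique(g1)) {                 # loops over groups, not rows
    rows1 <- which(g1 == g)
    sub2  <- df2[g2 == g, , drop = FALSE]
    # findInterval requires its second argument to be sorted.
    sub2  <- sub2[order(sub2$Time2), , drop = FALSE]
    idx   <- findInterval(as.numeric(df1$Time[rows1]),
                          as.numeric(sub2$Time2)) + 1L
    df1$New[rows1] <- sub2$Value2[idx]    # out-of-range index gives NA
  }
  df1
}

# Made-up example with two names and an unsorted df2.
t0  <- as.POSIXct("2016-06-01 09:00", tz = "UTC")
df1 <- data.frame(Time = t0 + 60 * c(39, 44, 40),
                  Name = c("A", "A", "B"))
df2 <- data.frame(Time2  = t0 + 60 * c(45, 42, 50),
                  Name2  = c("A", "A", "B"),
                  Value2 = c(124, 123, 900))
fill_next(df1, df2)$New
```

The loop here runs once per key group rather than once per row, so the per-row work stays inside the vectorised findInterval call.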