2017-11-11 138 views

Find data before a specific time in an xts object in R

I have two xts data sets, an order book and market data, which look like the following:

Order book:

Time     |  Price 
------------------------------------- 
2017-01-02 10:00:02 |  5.00 
2017-01-02 10:00:05 |  6.00 
2017-01-02 10:00:13 |  5.00 
2017-01-02 10:00:16 |  4.00 
2017-01-02 10:00:24 |  2.00 

Market data:

Time     |  Ask Price 
--------------------------------------- 
2017-01-02 10:00:01 |  4.00 
2017-01-02 10:00:02 |  3.00 
2017-01-02 10:00:27 |  1.00 
2017-01-02 10:00:56 |  2.00 
2017-01-02 10:00:57 |  1.00 

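Now, for each observation in the order book, I want to find the market data observation that comes strictly before the time of the order. For example, from the two data sets above, if I look at observation 3 in the order book (10:00:13), the market data observation strictly before it is at index 2 of the market data (i.e. time 10:00:02).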

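There are two conditions I have to respect. First, as mentioned above, the market data observation must be strictly before the order observation. Second, the two observations must occur on the same day. I have written two different functions to try to do this, but they give different results, so I am fairly sure I have made a mistake somewhere. If anyone could help me out, I would really appreciate it! Thanks in advance.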

Answer

library(data.table)
library(lubridate)
################
# Recreate data
################
# Note: the date in the last two order rows was changed to 2017-02-02
# so that the same-day condition can be demonstrated
s <- "2017-01-02 10:00:02 5.00
2017-01-02 10:00:05 6.00
2017-01-02 10:00:13 5.00
2017-02-02 10:00:16 4.00
2017-02-02 10:00:24 2.00"
s1 <- "2017-01-02 10:00:01 4.00
2017-01-02 10:00:02 3.00
2017-01-02 10:00:27 1.00
2017-01-02 10:00:56 2.00
2017-01-02 10:00:57 1.00"
# Splitting on whitespace puts the date and the time of day in two separate columns
O <- read.table(text = s, header = FALSE, stringsAsFactors = FALSE,
                col.names = c("time", "m", "price"))
M <- read.table(text = s1, header = FALSE, stringsAsFactors = FALSE,
                col.names = c("time", "m", "ask_price"))
setDT(O); setDT(M)
O[, time := paste(time, m)]; M[, time := paste(time, m)]  # combine date and time of day
O[, m := NULL]; M[, m := NULL]                            # drop the now-redundant time-of-day column
O[, time := lubridate::ymd_hms(time)]; M[, time := lubridate::ymd_hms(time)]  # parse as POSIXct

################ 
# Analysis 
################# 
setkey(O, time); setkey(M, time)
# Rolling join: for each order in O, take the most recent market row in M
# at or before the order time (an identical timestamp counts as a match)
M[O, roll = TRUE]
#time ask_price price 
#1: 2017-01-02 10:00:02   3  5 
#2: 2017-01-02 10:00:05   3  6 
#3: 2017-01-02 10:00:13   3  5 
#4: 2017-02-02 10:00:16   1  4 
#5: 2017-02-02 10:00:24   1  2 

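# Same rolling join, but a market row is carried forward for at most 24 hours;
# orders with no market row in that window get NA
# (a 24-hour window is not quite the same as "same calendar day")
twenty_four_hours <- 60 * 60 * 24
res <- M[O, roll = twenty_four_hours]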
#time ask_price price 
#1: 2017-01-02 10:00:02   3  5 
#2: 2017-01-02 10:00:05   3  6 
#3: 2017-01-02 10:00:13   3  5 
#4: 2017-02-02 10:00:16  NA  4 
#5: 2017-02-02 10:00:24  NA  2 
res[!is.na(ask_price), ] # drop the orders that have no matching market row
# time ask_price price 
#1: 2017-01-02 10:00:02   3  5 
#2: 2017-01-02 10:00:05   3  6 
#3: 2017-01-02 10:00:13   3  5
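
Note that the rolling joins above also accept a market row whose timestamp is equal to the order's (the 10:00:02 order is paired with the 10:00:02 quote), and a 24-hour window can reach back into the previous calendar day. If the rule from the question has to hold exactly (latest market observation strictly before the order, on the same date), one possible sketch in base R uses findInterval() with left.open = TRUE. The helper name match_prior_same_day and the choice of UTC are assumptions made for illustration only; with xts objects the timestamps would presumably be taken from index().

```r
# Hypothetical helper (not part of any package): for each order time, return
# the position of the latest market time strictly before it, or NA when no
# such market observation exists on the same calendar day.
match_prior_same_day <- function(order_time, market_time, tz = "UTC") {
  ot <- as.numeric(order_time)
  mt <- as.numeric(market_time)
  stopifnot(!is.unsorted(mt))            # market times must be ascending

  # With left.open = TRUE the intervals are (m[i], m[i + 1]], so the result
  # counts the market times that are strictly less than each order time.
  idx <- findInterval(ot, mt, left.open = TRUE)
  idx[idx == 0L] <- NA_integer_          # no earlier market observation at all

  # The latest earlier observation is the only candidate: if it falls on a
  # different date, every older observation does too.
  same_day <- as.Date(market_time[idx], tz = tz) == as.Date(order_time, tz = tz)
  idx[which(!same_day)] <- NA_integer_
  idx
}

# Same data as in the answer above (last two orders moved to 2017-02-02)
order_time <- as.POSIXct(c("2017-01-02 10:00:02", "2017-01-02 10:00:05",
                           "2017-01-02 10:00:13", "2017-02-02 10:00:16",
                           "2017-02-02 10:00:24"), tz = "UTC")
order_price <- c(5, 6, 5, 4, 2)
market_time <- as.POSIXct(c("2017-01-02 10:00:01", "2017-01-02 10:00:02",
                            "2017-01-02 10:00:27", "2017-01-02 10:00:56",
                            "2017-01-02 10:00:57"), tz = "UTC")
market_ask <- c(4, 3, 1, 2, 1)

idx <- match_prior_same_day(order_time, market_time)   # 1 2 2 NA NA
result <- data.frame(order_time  = order_time,
                     price       = order_price,
                     market_time = market_time[idx],
                     ask_price   = market_ask[idx])
print(result)
```

Here the 10:00:02 order is paired with the 10:00:01 quote rather than the 10:00:02 one, and the two orders on 2017-02-02 get NA because the only earlier quotes are from 2017-01-02.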