2016-08-17
1

data.table: referencing a column by name

I have the following data.tables:

Comparison <- data.table(code  = c("AAA", "BBB"),
                         elem1 = c(1, 2),
                         elem2 = c(4, 4))

DT <- data.table(A = c("AAA", "AAA", "AAA", "AAA"),
                 B = c("BBB", "BBB", "BBB", "BBB"),
                 C = c(1, 2, 3, 4))

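Now I want to add a new column to DT, based on comparing a column of Comparison with a column of DT. The following command produces the desired output: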

DT[, newCol := {ifelse(abs(C - Comparison[code == "AAA", elem2]) == 0, "0", "1")}] 

Output: 

    A B C newCol 
1: AAA BBB 1  1 
2: AAA BBB 2  1 
3: AAA BBB 3  1 
4: AAA BBB 4  0 

However, if instead of hard-coding the value of column A I use the column itself, like this:

DT[, newCol := {ifelse(abs(C - Comparison[code == A, elem2]) > 0, "0", "1")}] 

it throws the following error, which I don't know how to avoid:

Error in `[.data.table`(Comparison, code == A, elem2) : 
    RHS of == is length 4 which is not 1 or nrow (2). For robustness, no recycling is allowed (other than of length 1 RHS). Consider %in% instead. 

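It seems to me that the operation is not vectorized over the elements of column A of DT inside Comparison, and I don't quite understand why, since the elements of column C are handled correctly (i.e., it uses the elements of C one by one, but not the elements of A). How can I do this comparison?

Any help would be greatly appreciated.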

Answers

1

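If you read the error message, it says "Consider %in% instead."

Indeed, replacing == with %in% makes the call run. Note that this only works here because every row of DT has A equal to "AAA", so the subset returns a single elem2 value; %in% filters Comparison and does not look up a value per row of DT. Instead of using merge, we can use a join with on, which updates DT by reference: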

DT[Comparison, newCol := as.integer(C != elem2), on = c("A" = "code")] 
DT 
#  A B C newCol 
#1: AAA BBB 1  1 
#2: AAA BBB 2  1 
#3: AAA BBB 3  1 
#4: AAA BBB 4  0 
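
To see that the join really looks up elem2 per row, here is a minimal sketch on made-up data. The tables DT2 and Comparison2 are hypothetical and not part of the question: they mix both codes, give them different elem2 values, and add a code with no match. The i. prefix is used so that elem2 is explicitly taken from the table in i.

```r
library(data.table)

# Hypothetical data (not from the question): mixed codes, different elem2 values
Comparison2 <- data.table(code = c("AAA", "BBB"), elem2 = c(4, 2))
DT2 <- data.table(A = c("AAA", "BBB", "AAA", "CCC"), C = c(4, 4, 1, 2))

# Update join: rows of DT2 are matched to Comparison2 on A == code,
# and newCol is assigned by reference for the matched rows only
DT2[Comparison2, on = c(A = "code"), newCol := as.integer(C != i.elem2)]
DT2
```

Rows whose A has no counterpart in Comparison2 (here "CCC") are left as NA, which may or may not be what you want.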
1

One solution is to merge the two tables.

require(data.table) 

Comparison <- data.table(code  = c("AAA", "BBB"),
                         elem1 = c(1, 2),
                         elem2 = c(4, 4))
Comparison

DT <- data.table(A = c("AAA", "AAA", "AAA", "AAA"),
                 B = c("BBB", "BBB", "BBB", "BBB"),
                 C = c(1, 2, 3, 4))
DT 

tmp <- merge(DT, Comparison, by.x = "A", by.y = "code") 
tmp[, newCol := as.character(as.integer(C != elem2))] 
tmp 
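
Unlike the update join, merge returns a new table and leaves DT untouched. By default it also drops unmatched rows and sorts by the key. A possible variant, sketched here on hypothetical data (DT2 and Comparison2 are illustrative names), keeps every row of the left table in its original order via all.x = TRUE and sort = FALSE:

```r
library(data.table)

# Hypothetical data (not from the question)
Comparison2 <- data.table(code = c("AAA", "BBB"), elem2 = c(4, 2))
DT2 <- data.table(A = c("BBB", "CCC", "AAA"), C = c(2, 5, 1))

# Left join that keeps unmatched rows of DT2 and preserves their order
tmp2 <- merge(DT2, Comparison2, by.x = "A", by.y = "code",
              all.x = TRUE, sort = FALSE)
tmp2[, newCol := as.character(as.integer(C != elem2))]
tmp2
```

The unmatched "CCC" row gets NA in both elem2 and newCol.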
0

DT[, newCol := {ifelse(abs(C - Comparison[code %in% A, elem2]) == 0, "0", "1")}] 

DT 

#  A B C newCol 
#1: AAA BBB 1  1 
#2: AAA BBB 2  1 
#3: AAA BBB 3  1 
#4: AAA BBB 4  0
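
A caveat on the %in% version: it gives the expected result here because every row of DT has the same A, so the subset of Comparison returns a single value. With mixed codes, %in% only filters Comparison and the result is not aligned to the rows of DT. If you want to stay close to the original ifelse, one option is a lookup with base R's match(). The sketch below uses hypothetical data, and the helper column ref is an illustrative name:

```r
library(data.table)

# Hypothetical data (not from the question): mixed codes, different elem2 values
Comparison2 <- data.table(code = c("AAA", "BBB"), elem2 = c(4, 2))
DT2 <- data.table(A = c("BBB", "AAA", "BBB", "AAA"), C = c(4, 4, 2, 1))

# %in% filters Comparison2; the values come back in Comparison2's row order,
# not one per row of DT2
filtered <- Comparison2[code %in% DT2$A, elem2]

# match() returns, for each row of DT2, the row number of its code in Comparison2
DT2[, ref := Comparison2$elem2[match(A, Comparison2$code)]]
DT2[, newCol := ifelse(C == ref, "0", "1")]
DT2
```

Here filtered has length 2 while DT2 has 4 rows, whereas ref holds one looked-up value per row.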