2014-10-29

Unique join with non-unique keys. I have data tables where the key is not unique:

> dput(sv) 
structure(list(kwd = c("a", "a", "b", "b", "c"), pixel = c(1, 
2, 1, 2, 2), kpN = c(2L, 2L, 2L, 1L, 1L)), row.names = c(NA, 
-5L), class = c("data.table", "data.frame"), .Names = c("kwd", 
"pixel", "kpN"), .internal.selfref = <pointer: 0x7fc4aa800778>, sorted = "kwd") 
> dput(kwd) 
structure(list(kwd = c("a", "b", "c", "z"), kwdN = c(3L, 2L, 
1L, 1L)), row.names = c(NA, -4L), class = c("data.table", "data.frame" 
), .Names = c("kwd", "kwdN"), .internal.selfref = <pointer: 0x7fc4aa800778>, sorted = "kwd") 

Why do I get this error:

> sv[kwd,kwdN:=kwdN] 
Starting bmerge ...done in 0 secs 
Error in vecseq(f__, len__, if (allow.cartesian || notjoin) NULL else as.integer(max(nrow(x), : 
    Join results in 6 rows; more than 5 = max(nrow(x),nrow(i)). Check for duplicate key values in i, each of which join to the same group in x over and over again. If that's ok, try including `j` and dropping `by` (by-without-by) so that j runs for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and datatable-help for advice. 
Calls: [ -> [.data.table -> vecseq 

I expect something like this (note the keys):

   kwd pixel kpN kwdN
1:   a     1   2    3
2:   a     2   2    3
3:   b     1   2    2
4:   b     2   1    2
5:   c     2   1    1

Moreover, I am fairly sure this used to work.

What changed in data.table 1.9.4?

How do I get what I want? (`kwd[sv]` seems to work; is that the new way?)

Try `sv[kwd, kwdN := i.kwdN]` – akrun 2014-10-29 16:01:23

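The `allow.cartesian` error should not pop up here. This has been fixed in 1.9.5. Check point 8 under the bug fixes for 1.9.5 [here](https://github.com/Rdatatable/data.table/blob/master/README.md). When `i` has duplicates, then, as the error message already says, you should use `allow.cartesian = TRUE`. – Arun 2014-10-29 16:03:47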

@Arun: I have 1.9.4 – sds 2014-10-29 16:10:16

Answer

Just so that this is marked as answered:

The `allow.cartesian` feature was implemented following this post from @Roland. See also this for further explanation.

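Cases where `allow.cartesian` is not necessary (and which therefore should not raise the error) are:

  • When `i` has no duplicates, #742. This was not checked correctly before. Fixed in 1.9.5 (the current development version).

  • When `j` uses `:=`, #800. The number of rows can never exceed that of `x`. Fixed in 1.9.5 (the current development version).

  • When the operation is a not-join (or anti-join), #698. Again, the number of rows will never exceed that of `x`. Fixed in 1.9.4.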
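
The three cases above can be sketched roughly as follows. This is only an illustration, assuming a data.table version that already contains the fixes mentioned (1.9.6 or later). The tables `x`, `i_unique` and `i_dup` are hypothetical and not taken from the question.

```r
library(data.table)

# Hypothetical tables for illustration only (not from the question)
x        <- data.table(k = c("a", "a", "b"), v = 1:3, key = "k")
i_unique <- data.table(k = c("a", "b", "z"), w = c(10L, 20L, 30L), key = "k")
i_dup    <- data.table(k = c("a", "a"), w = c(1L, 2L), key = "k")

# Case 1 (#742): i has no duplicate keys, so the join should not need
# allow.cartesian even though the result has more rows than x or i
r1 <- x[i_unique]

# Case 2 (#800): := updates x by reference, so the row count of x cannot grow
x2 <- copy(x)
x2[i_dup, w := i.w]

# Case 3 (#698): a not-join only removes rows from x
r3 <- x[!i_dup]

# When i does have duplicate keys and a plain join is intended,
# allow.cartesian = TRUE states that the larger result is expected
r4 <- x[i_dup, allow.cartesian = TRUE]
```

`copy(x)` is used before the `:=` call so that the original `x` stays unchanged for the other cases.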

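In summary, the `allow.cartesian` error should occur only where it is necessary. The fixes made in 1.9.5 will be available once 1.9.6 is released on CRAN (which should be soon).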
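
For the tables in the question, a minimal sketch of both approaches might look like this. It assumes a data.table version that includes the 1.9.5 fixes; on 1.9.4 the `:=` form may still raise the error described above. The tables are rebuilt from the `dput()` output, and `res` is just an illustrative name.

```r
library(data.table)

# Tables rebuilt from the dput() output in the question, both keyed on kwd
sv  <- data.table(kwd   = c("a", "a", "b", "b", "c"),
                  pixel = c(1, 2, 1, 2, 2),
                  kpN   = c(2L, 2L, 2L, 1L, 1L),
                  key   = "kwd")
kwd <- data.table(kwd  = c("a", "b", "c", "z"),
                  kwdN = c(3L, 2L, 1L, 1L),
                  key  = "kwd")

# Update join: add kwdN to sv by reference.
# The i. prefix refers to the column of the table passed as i (here kwd).
sv[kwd, kwdN := i.kwdN]

# Alternative: look up every row of sv in kwd; this returns a new table
res <- kwd[sv]
```

The update join keeps the column order asked for in the question, whereas `kwd[sv]` puts the columns of `kwd` first.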