Merging data frames of different lengths

I would like to add the variables from dat2:

  concreteness familiarity typicality 
amoeba   3.60  1.30  1.71 
bacterium   3.82  3.48  2.13 
leech    5.71  1.83  4.50

to dat1:

ID variable value 
1 1 amoeba  0 
2 2 amoeba  0 
3 3 amoeba NA 
251 1 bacterium  0 
252 2 bacterium  0 
253 3 bacterium  0 
501 1  leech  1 
502 2  leech  1 
503 3  leech  0

giving the following output:

X ID variable value concreteness familiarity typicality 
1 1 1 amoeba  0   3.60  1.30  1.71 
2 2 2 amoeba  0   3.60  1.30  1.71 
3 3 3 amoeba NA   3.60  1.30  1.71 
4 251 1 bacterium  0   3.82  3.48  2.13 
5 252 2 bacterium  0   3.82  3.48  2.13 
6 253 3 bacterium  0   3.82  3.48  2.13 
7 501 1  leech  1   5.71  1.83  4.50 
8 502 2  leech  1   5.71  1.83  4.50 
9 503 3  leech  0   5.71  1.83  4.50

As you can see, the information from dat2 has to be repeated across multiple rows of dat1.

This is my failed attempt:

dat3 <- merge(dat1, dat2, by=intersect(dat1$variable(dat1), dat2$row.names(dat2)))

Giving the following error:

Error in as.vector(y) : attempt to apply non-function

Please find a reproducible example here:

dat1:

structure(list(ID = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), variable = structure(c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), .Label = c("amoeba", "bacterium", 
"leech", "centipede", "lizard", "tapeworm", "head lice", "maggot", 
"ant", "moth", "mosquito", "earthworm", "caterpillar", "scorpion", 
"snail", "spider", "grasshopper", "dust mite", "tarantula", "termite", 
"bat", "wasp", "silkworm"), class = "factor"), value = c(0L, 
0L, NA, 0L, 0L, 0L, 1L, 1L, 0L)), .Names = c("ID", "variable", 
"value"), row.names = c(1L, 2L, 3L, 251L, 252L, 253L, 501L, 502L, 
503L), class = "data.frame")

dat2:

structure(list(concreteness = c(3.6, 3.82, 5.71), familiarity = c(1.3, 
3.48, 1.83), typicality = c(1.71, 2.13, 4.5)), .Names = c("concreteness", 
"familiarity", "typicality"), row.names = c("amoeba", "bacterium", 
"leech"), class = "data.frame")

2012-12-31 Marloes

You can add a join variable to dat2 and then use merge:

dat2$variable <- rownames(dat2) 
merge(dat1, dat2) 
    variable ID value concreteness familiarity typicality 
1 amoeba 1  0   3.60  1.30  1.71 
2 amoeba 2  0   3.60  1.30  1.71 
3 amoeba 3 NA   3.60  1.30  1.71 
4 bacterium 1  0   3.82  3.48  2.13 
5 bacterium 2  0   3.82  3.48  2.13 
6 bacterium 3  0   3.82  3.48  2.13 
7  leech 1  1   5.71  1.83  4.50 
8  leech 2  1   5.71  1.83  4.50 
9  leech 3  0   5.71  1.83  4.50

2012-12-31 14:12:12 agstudy

This answer works with the sample data shown, but if there are any unmatched rows in 'dat1', it will drop them.

@G.Grothendieck Good catch! You need to add all.x = TRUE. – agstudy
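
The point raised in these comments can be illustrated with a minimal sketch. The data frames `d1` and `d2` below are made-up stand-ins, not the data from the question, and "centipede" is deliberately missing from `d2`.

```r
# Hypothetical stand-in data: "centipede" has no matching row in d2
d1 <- data.frame(ID = c(1L, 2L, 1L),
                 variable = c("amoeba", "amoeba", "centipede"),
                 value = c(0L, 1L, 0L))
d2 <- data.frame(concreteness = c(3.60, 3.82),
                 row.names = c("amoeba", "bacterium"))
d2$variable <- rownames(d2)

# The default merge is an inner join, so the centipede row is dropped
inner <- merge(d1, d2)

# all.x = TRUE keeps every row of d1 and fills the d2 columns with NA
left <- merge(d1, d2, all.x = TRUE)

nrow(inner)  # 2
nrow(left)   # 3
```

With the sample data from the question every value of `variable` has a match, which is why both calls give the same result there.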

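There is nothing wrong with @agstudy's answer, but you can do it without actually modifying dat2 by creating an anonymous temporary. Adding X works similarly: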

> merge(cbind(dat1, X=rownames(dat1)), cbind(dat2, variable=rownames(dat2))) 
    variable ID value X concreteness familiarity typicality 
1 amoeba 1  0 1   3.60  1.30  1.71 
2 amoeba 2  0 2   3.60  1.30  1.71 
3 amoeba 3 NA 3   3.60  1.30  1.71 
4 bacterium 1  0 251   3.82  3.48  2.13 
5 bacterium 2  0 252   3.82  3.48  2.13 
6 bacterium 3  0 253   3.82  3.48  2.13 
7  leech 1  1 501   5.71  1.83  4.50 
8  leech 2  1 502   5.71  1.83  4.50 
9  leech 3  0 503   5.71  1.83  4.50
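
As a related sketch, not taken from the answer above: `merge()` re-sorts the rows and discards the original row names, which is why the extra X column is needed. A lookup with `match()` keeps the row order and row names of the first data frame instead. The data frames `d1` and `d2` are hypothetical stand-ins for dat1 and dat2.

```r
# Hypothetical stand-in data with non-default row names on d1
d1 <- data.frame(ID = c(1L, 2L, 1L),
                 variable = c("leech", "leech", "amoeba"),
                 value = c(1L, 0L, NA),
                 row.names = c("501", "502", "1"))
d2 <- data.frame(concreteness = c(3.60, 5.71),
                 familiarity = c(1.30, 1.83),
                 row.names = c("amoeba", "leech"))

# Position of each d1$variable among the row names of d2 (NA if absent)
idx <- match(d1$variable, rownames(d2))

# Pull the matching d2 rows; resetting the row names avoids
# de-duplicated names such as "leech.1" leaking into the result
lookup <- d2[idx, , drop = FALSE]
rownames(lookup) <- rownames(d1)

# Bind column-wise; rows stay in d1's order with d1's row names
res <- cbind(d1, lookup)
res
```

Values of `variable` that are absent from `d2` would get NA in the added columns, so this behaves like a left join.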

2012-12-31 14:26:10

Try this:

merge(dat1, dat2, by.x = 2, by.y = 0, all.x = TRUE)

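This assumes that if there are rows in dat1 that are unmatched, then the dat2 columns in the result should be filled with NA, and that if there are unmatched rows in dat2, they should be disregarded. For example: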

dat2a <- dat2 
rownames(dat2a)[3] <- "elephant" 
# the above still works: 
merge(dat1, dat2a, by.x = 2, by.y = 0, all.x = TRUE)
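
The positional arguments `by.x = 2, by.y = 0` used above can also be written by name, which may be easier to read. The following is a small sketch on made-up stand-in data (`d1` and `d2a` are hypothetical, not the question's data frames); in `merge()`, the number 0 or the name "row.names" refers to the row names.

```r
# Hypothetical stand-in data; "leech" has no match because the third
# row of the lookup table is named "elephant"
d1 <- data.frame(ID = c(1L, 2L, 1L, 2L),
                 variable = c("amoeba", "amoeba", "leech", "leech"),
                 value = c(0L, NA, 1L, 0L))
d2a <- data.frame(concreteness = c(3.60, 3.82, 5.71),
                  row.names = c("amoeba", "bacterium", "elephant"))

# by.x = 2 is the second column of d1; by.y = 0 means the row names of d2a
by_position <- merge(d1, d2a, by.x = 2, by.y = 0, all.x = TRUE)

# The same left join with the keys spelled out by name
by_name <- merge(d1, d2a, by.x = "variable", by.y = "row.names",
                 all.x = TRUE)

# Both calls should give the same data frame
identical(by_position, by_name)

# The unmatched "leech" rows are kept, with NA in the d2a column
by_position[by_position$variable == "leech", ]
```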

The above is known as a left join in SQL and can be done like this in sqldf (ignore the warnings):

library(sqldf) 
sqldf("select * 
     from dat1 left join dat2 
     on dat1.variable = dat2.row_names", 
     row.names = TRUE)

2012-12-31 15:33:42
