2014-12-05 40 views
1

我不知道我是否能够正确解释它,但我想实现的非常简单。在一个data.frame中查找值并从其他列转移值

这是第一个data.frame。对我来说,重要的值在第一列“V1”

> dput(Data1) 
structure(list(V1 = c(10L, 5L, 3L, 9L, 1L, 2L, 6L, 4L, 8L, 7L 
), V2 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "NA", class = "factor"), 
    V3 = c(18L, 17L, 13L, 20L, 15L, 12L, 16L, 11L, 14L, 19L)), .Names = c("V1", 
"V2", "V3"), row.names = c(NA, -10L), class = "data.frame") 

二data.frame:

> dput(Data2) 
structure(list(Names = c(9L, 10L, 6L, 4L, 2L, 7L, 5L, 3L, 1L, 
8L), Herat = c(30L, 29L, 21L, 25L, 24L, 22L, 28L, 27L, 23L, 26L 
), Grobpel = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L), .Label = "NA", class = "factor"), Hassynch = c(19L, 12L, 
15L, 20L, 11L, 13L, 14L, 16L, 18L, 17L)), .Names = c("Names", 
"Herat", "Grobpel", "Hassynch"), row.names = c(NA, -10L), class = "data.frame" 

) 

从第一data.frame值可以发现第1列,我想复制来自4列(Hassynch)的值并将其放在第一个data.frame的第二列中。

如何以最快的方式做到这一点?

+1

您可以补考重复的例子:http://stackoverflow.com/a/5963610/1412059。但是,我怀疑你想'合并'。 – Roland 2014-12-05 15:30:41

+0

你能告诉我如何用合并函数做到这一点吗? – 2014-12-05 15:38:00

+1

不是没有一个可重复的例子。 – Roland 2014-12-05 15:38:37

回答

1
library(dplyr) 
left_join(Data1, Data2, by=c("V1"="Names")) 
# V1 V2 V3 Herat Grobpel Hassynch 
# 1 10 NA 18 29  NA  12 
# 2 5 NA 17 28  NA  14 
# 3 3 NA 13 27  NA  16 
# 4 9 NA 20 30  NA  19 
# 5 1 NA 15 23  NA  18 
# 6 2 NA 12 24  NA  11 
# 7 6 NA 16 21  NA  15 
# 8 4 NA 11 25  NA  20 
# 9 8 NA 14 26  NA  17 
# 10 7 NA 19 22  NA  13 

# if you don't want V2 and V3, you could 
left_join(Data1, Data2, by=c("V1"="Names")) %>% 
    select(-V2, -V3) 
# V1 Herat Grobpel Hassynch 
# 1 10 29  NA  12 
# 2 5 28  NA  14 
# 3 3 27  NA  16 
# 4 9 30  NA  19 
# 5 1 23  NA  18 
# 6 2 24  NA  11 
# 7 6 21  NA  15 
# 8 4 25  NA  20 
# 9 8 26  NA  17 
# 10 7 22  NA  13 
0

这是一个玩具的例子,我前一段时间来说明merge。来自dplyr的left_join也不错,而data.table几乎肯定有另一种选择。

您可以对参考数据框进行子集化,使其仅包含关键变量和值变量,以便最终不会产生无法管理的数据框。

id<-as.numeric((1:5)) 
m<-c("a","a","a","","") 
n<-c("","","b","b","b") 
dfm<-data.frame(cbind(id,m)) 
head(dfm) 
    id m 
1 1 a 
2 2 a 
3 3 a 
4 4 
5 5 
dfn<-data.frame(cbind(id,n)) 
head(dfn) 
    id n 
1 1 
2 2 
3 3 b 
4 4 b 
5 5 b 

dfm$id<-as.numeric(dfm$id) 
dfn$id<-as.numeric(dfn$id) 

dfm<-subset(dfm,id<4) 
head(dfm) 
    id m 
1 1 a 
2 2 a 
3 3 a 

dfn<-subset(dfn,id!=1 & id!=2) 
head(dfn) 
    id n 
3 3 b 
4 4 b 
5 5 b 

df.all<-merge(dfm,dfn,by="id",all=TRUE) 
head(df.all) 
    id m n 
1 1 a <NA> 
2 2 a <NA> 
3 3 a b 
4 4 <NA> b 
5 5 <NA> b 

df.all.m<-merge(dfm,dfn,by="id",all.x=TRUE) 
head(df.al.lm) 
    id m n 
1 1 a <NA> 
2 2 a <NA> 
3 3 a b 

df.all.n<-merge(dfm,dfn,by="id",all.y=TRUE) 
head(df.all.n) 
    id m n 
1 3 a b 
2 4 <NA> b 
3 5 <NA> b 
相关问题