我有一个实际的问题时,我有两个(或更多)数据帧和要指派的唯一标识号内的每个每个匹配的观察和跨两个数据集如:
如何将相同的唯一ID分配给r中两个数据框之间的匹配观察值?
#1. Create dataframe df1:
a1 <- c(1, 1, 1, 1, 2, 2, 2, 2, 1, 1)
b1 <- c(1, 5, 3, 2, 3, 4, 5, 1, 5, 2)
c1 <- c("white", "red", "black", "white", "red",
"white", "black", "silver", "red", "green")
df1 <- data.frame(a1, b1, c1)
df1
a1 b1 c1
1 1 1 white
2 1 5 red
3 1 3 black
4 1 2 white
5 2 3 red
6 2 4 white
7 2 5 black
8 2 1 silver
9 1 5 red
10 1 2 green
#2. Create dataframe df2:
a2 <- c(2, 2, 1, 1, 2, 2, 2, 2, 2, 2)
b2 <- c(3, 1, 3, 2, 1, 3, 4, 5, 3, 5)
c2 <- c("black", "blue", "black", "white", "silver",
"green", "green", "red", "blue", "white")
df2 <- data.frame(a2, b2, c2)
df2
a2 b2 c2
1 2 3 black
2 2 1 blue
3 1 3 black
4 1 2 white
5 2 1 silver
6 2 3 green
7 2 4 green
8 2 5 red
9 2 3 blue
10 2 5 white
#3. Assign unique IDs to each observation in df1:
library(data.table)
df1.2 <- data.table(df1, key="a1,b1,c1")
df1.2[, id:=.GRP, by=key(df1.2)]
df1.2 <- as.data.frame(df1.2)
df1.2
a1 b1 c1 id
1 1 1 white 1
2 1 2 green 2
3 1 2 white 3
4 1 3 black 4
5 1 5 red 5
6 1 5 red 5
7 2 1 silver 6
8 2 3 red 7
9 2 4 white 8
10 2 5 black 9
#4. The problematic part!! Assign identical unique IDs to matching observations of df2 as compared to df1.2
#and assign other unique IDs to all other non-matching obs of df2.
#Name the resulting dataframe as df2.2
#My expected result will ideally look as follows:
df2.2
a2 b2 c2 id
1 2 3 black 10
2 2 1 blue 11
3 1 3 black 4
4 1 2 white 3
5 2 1 silver 6
6 2 3 green 12
7 2 4 green 13
8 2 5 red 14
9 2 3 blue 15
10 2 5 white 16
任何帮助的如何获得df2.2将非常感激。谢谢。
谢谢。这其实很聪明! – simtim
它很有效!请注意这是*非常重要*这里'paste'有一个空格作为默认分隔符。如果有人说'粘贴(...,sep =“”)'那么哈希将产生冲突。 (“12”,“3”,“黑色”)会得到与(“1”,“23”,“black”)相同的id。请注意分隔符! (为了方便起见,在最后的结果中还添加了'ungroup()'以删除'rowwise()'分组)。 – akhmed