一种可能的解决方案。我不完全确定你想要对字符串中的'x'做什么,我已经将它们保存在连接键中,但通过将\\1\\2
更改为\\1
,您只保留第一个字母。
a <- data.frame(
Name = paste0(".", c("tony", "tom", "foo", "bar", "foobar"), ".x.rds"),
Numbers = rnorm(5)
)
b <- data.frame(
Name = paste0(c("tony", "tom", "bar", "foobar", "company"), ".x"),
ChaR = LETTERS[11:15]
)
# String consists of 'point letter1 point letter2 point rds'; replace by
# 'letter1 letter2'
a$Name_stand <- gsub("^\\.([a-z]+)\\.([a-z]+)\\.rds$", "\\1\\2", a$Name)
# String consists of 'letter1 point letter2'; replace by 'letter1 letter2'
b$Name_stand <- gsub("^([a-z]+)\\.([a-z]+)$", "\\1\\2", b$Name)
result <- merge(a, b, all = TRUE, by = "Name_stand")
输出:
#> result
# Name_stand Name.x Numbers Name.y ChaR
#1 barx .bar.x.rds 1.38072696 bar.x M
#2 companyx <NA> NA company.x O
#3 foobarx .foobar.x.rds -1.53076596 foobar.x N
#4 foox .foo.x.rds 1.40829287 <NA> <NA>
#5 tomx .tom.x.rds -0.01204651 tom.x L
#6 tonyx .tony.x.rds 0.34159406 tony.x K
另一个,也许一定程度上更坚固(到琴弦的变体如“tom.rds”并且将仍然被链接“汤姆”;这当然也可以成为一个缺点)/
# Remove the rds from a$Name
a$Name_stand <- gsub("rds$" , "", a$Name)
# Remove all non alpha numeric characters from the strings
a$Name_stand <- gsub("[^[:alnum:]]", "", a$Name_stand)
b$Name_stand <- gsub("[^[:alnum:]]", "", b$Name)
result2 <- merge(a, b, all = TRUE, by = "Name_stand")
不是我downvote。您想要加入两个数据框的实际情况是什么? –
我想加入列名中的值。其实列名是固定名称,而.a.x.rds等于a。 – runjumpfly
因此,可以肯定地说'.a.x.rds'包含'a.x'就足够了。 –