Duplicate columns when using merge

When I try to merge some codes with their descriptions, I end up with two columns of the same name. I start with this table. Table name: test

ID  State 
1  5 
2  2 
3  5

and want to merge it with this one. Table name: statecode

StateID State 
5  Mass 
2  NY

to produce a table like this:

ID State 
1  Mass 
2  NY 
3  Mass

However, I get a table like this:

ID State State 
1  5  Mass 
2  2  NY 
3  5  Mass

I use this merge command:

test = merge(x = test, y = statecode, by.x = "State", by.y = "StateID", all.x = T)

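Is there a better function than merge to use in this case? Perhaps one that just replaces the state codes with the state names?

Thank you very much for your help!

Source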

2015-05-26 Christopher Yee

'by.y = "StateID"' should be 'by.y = "Code"' if the code column in the second dataset holds the state IDs. – user227710

Thanks for your comment, but that was a typo on my part, sorry. I have fixed it in the original question, though! –

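You do have to say which column to drop, but dplyr lets you express that concisely.

Sample data generated from yours (but with corrected column names):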

test <- read.table(text = 
"ID StateID 
1  5 
2  2 
3  5", header = TRUE) 

statecode <- read.table(text = 
" 
StateID  State 
5   Mass 
2   NY", header = TRUE)

Using dplyr:

library(dplyr) 
test %>% left_join(statecode, by = "StateID") %>% select(-StateID) 
  ID State
1  1  Mass
2  2    NY
3  3  Mass
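If the column names cannot be changed, one alternative is to skip the join and treat the description table as a lookup. The following is a minimal base R sketch, assuming the tables are laid out exactly as in the question (the code column in test is called State, and statecode has StateID and State); the intermediate name lookup is introduced here only for illustration.

```r
# Tables as shown in the question.
test <- data.frame(ID = 1:3, State = c(5, 2, 5))
statecode <- data.frame(StateID = c(5, 2), State = c("Mass", "NY"),
                        stringsAsFactors = FALSE)

# Named character vector: names are the codes, values are the descriptions.
lookup <- setNames(statecode$State, statecode$StateID)

# Index the lookup by code (as character) and overwrite the code column.
# Codes missing from the lookup would come back as NA.
test$State <- unname(lookup[as.character(test$State)])
test
```

Because nothing is merged, the row order of test is left untouched and no second State column is created.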

Source

2015-05-26 13:41:45

Is there a way to correct the column names? I don't think I can change State to StateID, sorry! –

I also have code-description tables for several other variables. I don't know whether I can recode them all. –

This is similar to 'a1 <- merge(test, statecode, by = "StateID", all.x = TRUE)' followed by 'a1[, -1]' in place of 'select(-StateID)'. Why do we need a 'dplyr'-specific solution? – user227710
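The base R version this comment describes can be sketched as follows, using the corrected sample data from the answer above. One difference worth noting: merge() sorts its result by the key column, so the rows do not come back in ID order unless they are re-sorted afterwards. The re-sorting step is an addition here, not part of the comment.

```r
# Sample data with the corrected column names (key column is StateID in both).
test <- data.frame(ID = 1:3, StateID = c(5, 2, 5))
statecode <- data.frame(StateID = c(5, 2), State = c("Mass", "NY"),
                        stringsAsFactors = FALSE)

# Left join on the shared key; the key becomes the first column of the result.
a1 <- merge(test, statecode, by = "StateID", all.x = TRUE)

# Drop the key column, which leaves ID and State.
a1 <- a1[, -1]

# merge() sorted the rows by StateID; put them back in ID order.
a1 <- a1[order(a1$ID), ]
rownames(a1) <- NULL
a1
```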

Another approach, with base R:

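# Merge on the code column, drop it, and restore the original ID order.
Pmerge <- function(df1, df2) {
    res <- suppressWarnings(merge(df1, df2, by.x = "State", by.y = "StateID", all.x = TRUE)[, -1])
    # R >= 3.5.0 renames the clashing description column to State.y
    names(res)[names(res) == "State.y"] <- "State"
    newdf <- res[order(res$ID), ]
    row.names(newdf) <- seq_len(nrow(newdf))
    newdf
}

Pmerge(test, statecode)
  ID State
1  1  Mass
2  2    NY
3  3  Mass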
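Since the asker mentions having code-description tables for several variables, the same idea can be made generic by passing the column names as arguments. This is only a sketch: decode_column is a hypothetical helper, it uses match() instead of merge(), and the Dept column and deptcode table are made-up illustration data that do not appear in the question.

```r
# Hypothetical helper: replace the codes in df[[col]] with the labels
# found in lookup[[label_col]], matched through lookup[[code_col]].
decode_column <- function(df, col, lookup, code_col, label_col) {
  # Position of each code in the lookup table (NA if the code is absent).
  idx <- match(df[[col]], lookup[[code_col]])
  df[[col]] <- lookup[[label_col]][idx]
  df
}

# State data as in the question; Dept and deptcode are invented for the demo.
test <- data.frame(ID = 1:3, State = c(5, 2, 5), Dept = c(10, 20, 10))
statecode <- data.frame(StateID = c(5, 2), State = c("Mass", "NY"),
                        stringsAsFactors = FALSE)
deptcode <- data.frame(Code = c(10, 20), Name = c("Sales", "Ops"),
                       stringsAsFactors = FALSE)

# Decode one column per call; row order and column names stay as they were.
decoded <- decode_column(test, "State", statecode, "StateID", "State")
decoded <- decode_column(decoded, "Dept", deptcode, "Code", "Name")
decoded
```

Each call overwrites one coded column in place, so no duplicate columns arise and nothing needs to be dropped or re-sorted afterwards.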

Source

2015-05-26 14:09:47
