R: split values into columns

I have this data frame:

d    f   
"first tweet" A 
"second tweet" B 
"thrid tweet" C

and I would like to obtain this:

d    A  B  C   
"first tweet" 1  0  0 
"second tweet" 0  1  0 
"thrid tweet" 0  0  1

Thanks!

Source

2014-07-07 JohnCoene

Is there a real reason you need this? R tends to take care of converting factors into dummy-variable style coding for you. – MrFlick

Here are a few options to consider:

model.matrix

cbind(mydf, model.matrix(~ 0 + f, data = mydf)) 
#    d f fA fB fC 
# 1 first tweet A 1 0 0 
# 2 second tweet B 0 1 0 
# 3 thrid tweet C 0 0 1
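The `model.matrix` output above names its columns `fA`, `fB`, `fC`, while the question asks for plain `A`, `B`, `C`. One way to close that gap is sketched below. It assumes the data frame is called `mydf`, as in the answer, and that stripping the leading variable name from the column names is acceptable. The names `dummies` and `result` are only illustrative.

```r
# Example data matching the question (the name `mydf` follows the answer).
mydf <- data.frame(
  d = c("first tweet", "second tweet", "thrid tweet"),
  f = factor(c("A", "B", "C")),
  stringsAsFactors = FALSE
)

# One indicator column per level of f; "~ 0 + f" drops the intercept
# so that every level gets its own column.
dummies <- model.matrix(~ 0 + f, data = mydf)

# model.matrix prefixes each level with the variable name ("fA", "fB", "fC");
# strip that leading "f" to get plain "A", "B", "C".
colnames(dummies) <- sub("^f", "", colnames(dummies))

# Keep only d plus the indicator columns.
result <- cbind(mydf["d"], dummies)
print(result)
```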

table

cbind(mydf, as.data.frame.matrix(table(sequence(nrow(mydf)), mydf$f))) 
#    d f A B C 
# 1 first tweet A 1 0 0 
# 2 second tweet B 0 1 0 
# 3 thrid tweet C 0 0 1

dcast from "reshape2"

library(reshape2) 
dcast(mydf, d ~ f, value.var="f", fun.aggregate=length) 
#    d A B C 
# 1 first tweet 1 0 0 
# 2 second tweet 0 1 0 
# 3 thrid tweet 0 0 1

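Note that there is a difference between the first two options and the third. If column "d" contains duplicated values, the third option will collapse (and tabulate) them into one row per distinct value, whereas the first two options split the values row by row.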
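To make that difference concrete, here is a small base-R sketch using hypothetical data in which "first tweet" appears twice. It does not call `dcast` itself. Instead, `table(mydf$d, mydf$f)` stands in for the collapsing behaviour, on the assumption that counting per distinct `d` is comparable to `fun.aggregate = length`.

```r
# Hypothetical data in which "first tweet" appears twice in d.
mydf <- data.frame(
  d = c("first tweet", "first tweet", "second tweet"),
  f = c("A", "B", "C"),
  stringsAsFactors = FALSE
)

# Row-wise: one output row per input row, as in the first two options.
by_row <- as.data.frame.matrix(table(seq_len(nrow(mydf)), mydf$f))
by_row <- cbind(mydf["d"], by_row)

# Collapsed: one output row per distinct d, with counts per level of f.
by_d <- as.data.frame.matrix(table(mydf$d, mydf$f))

print(by_row)  # three rows, one per input row
print(by_d)    # two rows; "first tweet" is counted under both A and B
```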

Source

2014-07-07 04:04:28 A5C1D2H2I1M1N2O1R2T1

Another possibility:

library(qdap) 
mtabulate(split(dat[[2]], dat[[1]])) 

##    A B C 
## first tweet 1 0 0 
## second tweet 0 1 0 
## thrid tweet 0 0 1

Source

2014-07-07 04:13:17

A very simple table looks like it might do the trick.

> d <- data.frame(d = c("first tweet", "second tweet", "third tweet"), 
        f = c("A", "B", "C")) 
> tab <- table(d) 
> data.frame(d = rownames(tab), tab[,1:3], row.names = NULL) 
#    d A B C 
# 1 first tweet 1 0 0 
# 2 second tweet 0 1 0 
# 3 third tweet 0 0 1

Source

2014-07-07 04:22:56

Hi @Richard, is it actually possible to turn the table into a data.frame? – useR

@useR, that is what my answer does: 'as.data.frame.matrix'. – A5C1D2H2I1M1N2O1R2T1

@AnandaMahto Thanks!! – useR
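For readers following the exchange above, a minimal sketch of why `as.data.frame.matrix` matters here: calling plain `as.data.frame()` on a two-way table gives a long layout with a `Freq` column, while `as.data.frame.matrix()` keeps the wide layout. The variable names `long` and `wide` are only illustrative.

```r
# Same example data as in the answer above.
d <- data.frame(
  d = c("first tweet", "second tweet", "third tweet"),
  f = c("A", "B", "C")
)
tab <- table(d)

# as.data.frame() on a table gives long format:
# one row per (d, f) combination plus a Freq column.
long <- as.data.frame(tab)

# as.data.frame.matrix() keeps the wide layout:
# one row per value of d, one column per value of f.
wide <- as.data.frame.matrix(tab)

print(long)
print(wide)
```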
