R: Converting binary categorical variables to long data format

mydata <- structure(list(id = 1:10, cafe = c(0, 1, 0, 0, 1, 1, 0, 0, 1, 
1), playground = c(1, 1, 1, 1, 1, 1, 0, 1, 1, 0), classroom = c(0, 
0, 0, 0, 0, 1, 1, 1, 1, 1), gender = structure(c(2L, 2L, 2L, 
2L, 2L, 2L, 1L, 2L, 1L, 2L), .Label = c("Female", "Male"), class = "factor"), 
    job = structure(c(2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L), .Label = c("Student", 
    "Teacher"), class = "factor")), .Names = c("id", "cafe", 
"playground", "classroom", "gender", "job"), row.names = c(NA, 
-10L), class = "data.frame") 

> mydata
   id cafe playground classroom gender     job
1   1    0          1         0   Male Teacher
2   2    1          1         0   Male Student
3   3    0          1         0   Male Teacher
4   4    0          1         0   Male Student
5   5    1          1         0   Male Teacher
6   6    1          1         1   Male Teacher
7   7    0          0         1 Female Teacher
8   8    0          1         1   Male Teacher
9   9    1          1         1 Female Teacher
10 10    1          0         1   Male Student

I would like the long-format dataset to look like this:

id   response gender     job
 1 playground   Male Teacher
 2       cafe   Male Student
 2 playground   Male Student
 3 playground   Male Teacher
...

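Essentially, the response column should name each of the cafe, playground, and classroom columns that has a value of 1, with one row per id and matching column. I have looked at several examples here and here, but they do not deal with binary data columns.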

Source

2017-06-04 Adrian

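We can do this with the tidyverse: gather the three binary columns into key/value pairs, keep only the rows where the value is 1, and select the columns we need.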

library(tidyverse) 
mydata %>% 
    gather(response, value, cafe:classroom) %>% 
    filter(value==1) %>% 
    select(id, response, gender, job)
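
For readers on a newer tidyr (1.0.0 or later, an assumption about the installed version), the same idea can be sketched with pivot_longer(), which supersedes gather(). The `toy` data frame below is a hypothetical stand-in containing only the first three rows of mydata, so the sketch runs on its own.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in: the first three rows of the question's mydata.
toy <- data.frame(
  id = 1:3,
  cafe = c(0, 1, 0),
  playground = c(1, 1, 1),
  classroom = c(0, 0, 0),
  gender = factor(c("Male", "Male", "Male")),
  job = factor(c("Teacher", "Student", "Teacher"))
)

long <- toy %>%
  # Stack the three binary columns into name/value pairs.
  pivot_longer(cafe:classroom, names_to = "response", values_to = "value") %>%
  # Keep only the places that were flagged with 1.
  filter(value == 1) %>%
  select(id, response, gender, job)

print(long)
```

Unlike gather(), pivot_longer() keeps the rows of each id together, so the result already follows the id order shown in the question without an extra sort.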

Source

2017-06-04 19:06:25 akrun

This can be done with the melt(data, ...) function from the reshape package.

library(reshape)

First, we specify the variables that we want to keep as columns.

id <- c("id", "gender", "job")

Then we convert the data from wide to long format and keep only the rows that contain a 1.

df <- melt(mydata, id=id) 
df <- df[df[,5]==1,-5]  # column 5 is "value": keep rows equal to 1, then drop that column

Next, we order the data by id.

df <- df[order(df[,"id"]),]

Finally, we rename the "variable" column to "response" and reorder the columns.

colnames(df)[4] <- "response" 
df <- df[,c(1,4,2,3)] 

## id response gender job 
## 1 playground Male Teacher 
## 2  cafe Male Student 
## 2 playground Male Student 
## 3 playground Male Teacher 
## ... 
## ... 
## 9 classroom Female Teacher 
## 10  cafe Male Student 
## 10 classroom Male Student
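
If installing a reshaping package is not an option, the same steps (stack, filter on 1, order by id) can be sketched in base R. This is only an illustration under stated assumptions, not part of the answer above: `toy` is a hypothetical stand-in for the first three rows of mydata, and `places` is a name introduced here for the binary columns.

```r
# Hypothetical stand-in: the first three rows of the question's mydata.
toy <- data.frame(
  id = 1:3,
  cafe = c(0, 1, 0),
  playground = c(1, 1, 1),
  classroom = c(0, 0, 0),
  gender = factor(c("Male", "Male", "Male")),
  job = factor(c("Teacher", "Student", "Teacher"))
)

# The binary columns to turn into values of "response".
places <- c("cafe", "playground", "classroom")

# For each binary column, build one piece holding the rows where it equals 1.
pieces <- lapply(places, function(p) {
  keep <- toy[[p]] == 1
  data.frame(
    id = toy$id[keep],
    response = rep(p, sum(keep)),  # zero-length when no row matches
    gender = toy$gender[keep],
    job = toy$job[keep],
    stringsAsFactors = FALSE
  )
})

# Stack the pieces, then order by id (ties keep their original order).
long <- do.call(rbind, pieces)
long <- long[order(long$id), ]
rownames(long) <- NULL

print(long)
```

Building each piece with data.frame() rather than assigning a new column avoids an error when a place has no rows equal to 1, as happens with classroom in this small sample.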

Source

2017-06-04 19:02:59 robbertjan94
