2014-10-04

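Unlist a data frame list column while preserving information from another column

I have a data frame that contains a character column, col1, and a list column, col2.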

myVector <- c("A","B","C","D") 

myList <- list() 
myList[[1]] <- c(1, 4, 6, 7) 
myList[[2]] <- c(2, 7, 3) 
myList[[3]] <- c(5, 5, 3, 9, 6) 
myList[[4]] <- c(7, 9) 

myDataFrame <- data.frame(row = c(1,2,3,4)) 

myDataFrame$col1 <- myVector 
myDataFrame$col2 <- myList 

myDataFrame 
#   row col1          col2
# 1   1    A    1, 4, 6, 7
# 2   2    B       2, 7, 3
# 3   3    C 5, 5, 3, 9, 6
# 4   4    D          7, 9

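I would like to unlist col2 while keeping, for every element of the vectors in the list, the corresponding value stored in col1. To put it in common data frame reshaping terms: the "wide" list column should be converted to "long" format.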

In the end, I want two vectors whose length equals length(unlist(myDataFrame$col2)). In code:

# unlist myList 
unlist.col2 <- unlist(myDataFrame$col2) 
unlist.col2 
# [1] 1 4 6 7 2 7 3 5 5 3 9 6 7 9 

# unlist myVector to obtain 
# unlist.col1 <- ??? 
# unlist.col1 
# [1] A A A A B B B C C C C C D D 

I can't think of a straightforward way to get it.

Answers


The idea here is to first get the length of each list element with sapply, and then use rep to repeat each value of col1 that many times.

l1 <- sapply(myDataFrame$col2, length) 
unlist.col1 <- rep(myDataFrame$col1, l1)
unlist.col1
#[1] "A" "A" "A" "A" "B" "B" "B" "C" "C" "C" "C" "C" "D" "D" 

Or, as suggested by @Ananda Mahto, the same can be done with vapply:

with(myDataFrame, rep(col1, vapply(col2, length, 1L))) 
#[1] "A" "A" "A" "A" "B" "B" "B" "C" "C" "C" "C" "C" "D" "D"

You can use "data.table" to expand the entire data.frame and then extract the column of interest.

library(data.table) 
## expand the entire data.frame (uncomment to see) 
# as.data.table(myDataFrame)[, unlist(col2), by = list(row, col1)] 

## expand and select the column of interest: 
as.data.table(myDataFrame)[, unlist(col2), by = list(row, col1)]$col1 
# [1] "A" "A" "A" "A" "B" "B" "B" "C" "C" "C" "C" "C" "D" "D" 

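In recent versions of R (3.2.0 and later), you can use the lengths function instead of the sapply(list, length) approach. The lengths function is considerably faster.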

with(myDataFrame, rep(col1, lengths(col2))) 
# [1] "A" "A" "A" "A" "B" "B" "B" "C" "C" "C" "C" "C" "D" "D" 
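
If you want a long data frame rather than two separate vectors, the same rep/lengths idea extends to indexing whole rows. This is only a minimal base R sketch, assuming the example data from the question; the names n, idx and longDataFrame are made up for illustration.

```r
# Rebuild the example data from the question.
myDataFrame <- data.frame(row = c(1, 2, 3, 4))
myDataFrame$col1 <- c("A", "B", "C", "D")
myDataFrame$col2 <- list(c(1, 4, 6, 7), c(2, 7, 3), c(5, 5, 3, 9, 6), c(7, 9))

# Number of elements in each list entry (lengths() needs R >= 3.2.0).
n <- lengths(myDataFrame$col2)

# Repeat each row index once per element of its list entry.
idx <- rep(seq_len(nrow(myDataFrame)), n)

# Combine the repeated columns with the flattened list values.
longDataFrame <- data.frame(
  row  = myDataFrame$row[idx],
  col1 = myDataFrame$col1[idx],
  col2 = unlist(myDataFrame$col2),
  stringsAsFactors = FALSE  # keep col1 as character on R < 4.0 too
)
```

Indexing by idx rather than calling rep on each column separately keeps every non-list column aligned with the unlisted values, which matters once there are more than two columns to carry along.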

You can also use unnest from the tidyr package:

library(tidyr) 
unnest(myDataFrame, col2) 

#      row  col1  col2
#    (dbl) (chr) (dbl)
# 1      1     A     1
# 2      1     A     4
# 3      1     A     6
# 4      1     A     7
# 5      2     B     2
# 6      2     B     7
# 7      2     B     3
# 8      3     C     5
# 9      3     C     5
# 10     3     C     3
# 11     3     C     9
# 12     3     C     6
# 13     4     D     7
# 14     4     D     9