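Converting variables stored in a list into a list of character vectors in R

I have a subset of data from a very large dataset. I split this subset into a list of data frames so that each case/id is a separate element of the list, and each element is named with its case/id. I then removed every variable from each data frame except one, called "state". It is currently a factor with 7 levels.

I am trying to turn this list of "state" elements into a list of character vectors. The element below is the first element of the list; it includes the row numbers, which come from the larger original dataset.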

[[1]] 
     state 
104246 active 
104247 rest 
104248 active 
104249 active 
. 
. 
. 
104315 active 
104316 active 
104317 rest 
104318 rest

I am simply trying to turn this into a character vector that looks like this:

[1] "active" "rest" "active" "active" ........... "active" "active" "rest" "rest"

This seems like it should be simple. I tried something like this (where "temp" is the name of the list):

as.vector(as.matrix(temp))

This returns the following:

  [,1] 
    id1 List,1 
    id2 List,1 
    id3 List,1 
    id4 List,1

When I look at each element of this result, they still appear to be lists.

I also tried converting directly to character:

as.vector(as.character(temp))

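But this comes back in an undesirable format (although I suppose I could hack it by converting the factor level numbers into words; note that in the large dataset the factor "state" has 7 levels):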

[1] "list(state = c(1, 4, 1, 1, 1, 1, 1, 4, 4, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 4, 1, 6, 1, 4, 4, 1, 1, 1, 4,  1, 1, 1, 6, 4, 1, 1, 1, 1, 1, 4, 4, 1, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 4, 4, 4, 1, 1, 1, 1, 4, 4, 1, 1, 1, 1,  1, 1, 1, 4, 4))"

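I also tried converting the variable "state", which is a factor, to a character variable beforehand, but that didn't help.

Below is data for a reproducible example. In this example the list "temp" contains only two elements: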
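The integers in the output above are the factor's internal codes, which index into its levels. As a minimal sketch (the `state` factor below is a made-up stand-in with the same seven levels as the example data, not the real dataset), mapping the codes back through `levels()` gives the same labels as calling `as.character()` on the factor directly:

```r
# Hypothetical factor using the seven level names from the example data
state <- factor(c("active", "rest", "stop", "active"),
                levels = c("active", "active2", "active3", "rest",
                           "rest2", "stop", "stop2"))

codes <- as.integer(state)                  # internal integer codes
labels_from_codes <- levels(state)[codes]   # look the codes up in the levels

# Both routes give the same character vector
same <- identical(labels_from_codes, as.character(state))
print(labels_from_codes)
```

So the "hack" of translating level numbers into words is what `as.character()` already does when it is applied to the factor itself rather than to the enclosing list.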

temp<-list(structure(list(state = structure(c(1L, 4L, 1L, 1L, 1L, 1L, 
              1L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 4L, 1L, 
              6L, 1L, 4L, 4L, 1L, 1L, 1L, 4L, 1L, 1L, 1L, 6L, 4L, 1L, 1L, 1L, 
              1L, 1L, 4L, 4L, 1L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
              4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 4L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 
              1L, 4L, 4L), .Label = c("active", "active2", "active3", "rest", "rest2", 
                    "stop", "stop2"), class = "factor")), .Names = "state", row.names = 104246:104318, class = "data.frame"), 
     structure(list(state = structure(c(1L, 4L, 4L, 4L, 1L, 1L, 
              1L, 4L, 4L, 4L, 4L, 1L, 4L, 4L, 4L, 1L, 1L, 6L, 4L, 1L, 4L, 
              4L, 4L, 1L, 4L, 1L, 1L, 1L), .Label = c("active", "active2", 
                        "active3", "rest", "rest2", "stop", "stop2"), class = "factor")), .Names = "state", row.names = 950:977, class = "data.frame")) 



str(temp)

Source

2014-07-16 jalapic

L = lapply(temp, function(x) as.character(unlist(x)))

The vectors are then just L[[1]] or L[[2]].

Source
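As a small illustration of this approach, the sketch below applies the same `lapply()` call to a made-up two-element list. The name `temp_small` and the ids `id1` and `id2` are assumptions for the example only; they show that the list names are carried through, so each case can still be looked up by id.

```r
# Hypothetical stand-in for the list of one-column data frames
temp_small <- list(
  id1 = data.frame(state = factor(c("active", "rest", "active"))),
  id2 = data.frame(state = factor(c("rest", "stop")))
)

# unlist() collapses each data frame to its single factor column;
# as.character() then replaces the factor with its labels
L <- lapply(temp_small, function(x) as.character(unlist(x)))

str(L)
L[["id1"]]
```

lapply() keeps the names of its input, so the result is a named list of plain character vectors.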

2014-07-16 03:49:42 Vlo

Try this code:

as.vector(unlist(temp[[1]]))

Source

2014-07-16 02:50:50

This might be a good opportunity to use rapply:

x <- rapply(temp, as.character, how = "replace") 
str(x) 
# List of 2 
# $ :List of 1 
# ..$ state: chr [1:73] "active" "rest" "active" "active" ... 
# $ :List of 1 
# ..$ state: chr [1:28] "active" "rest" "rest" "rest" ...

If you want to flatten it further, you can then use unlist(..., recursive = FALSE).

str(unlist(rapply(temp, as.character, how = "replace"), recursive=FALSE)) 
# List of 2 
# $ state: chr [1:73] "active" "rest" "active" "active" ... 
# $ state: chr [1:28] "active" "rest" "rest" "rest" ...
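After flattening, every element is labelled after the column ("state") rather than the case. If the original list is named by case/id, as the question describes, one option is to put those names back with `setNames()`. This is only a sketch on a made-up list; `temp_small`, `id1` and `id2` are hypothetical names, not part of the original data.

```r
# Hypothetical named list of one-column data frames
temp_small <- list(
  id1 = data.frame(state = factor(c("active", "rest", "active"))),
  id2 = data.frame(state = factor(c("rest", "stop")))
)

# Convert every factor to character, then flatten one level
flat <- unlist(rapply(temp_small, as.character, how = "replace"),
               recursive = FALSE)

# Restore the case ids as the element names
flat <- setNames(flat, names(temp_small))
str(flat)
```

The same result could be reached by passing `use.names = FALSE` to `unlist()` first; `setNames()` simply overwrites whatever names are there.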

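This second approach gives you the same result as @Vlo's approach, but it should be more efficient because it calls unlist only once. To see how much of a difference that can make, here is a benchmark on a larger list: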

x <- replicate(1000, temp) ## A larger list 

## Vlo's approach 
fun1 <- function() { 
    lapply(x, function(y) as.character(unlist(y, use.names = FALSE))) 
} 

## My approach 
fun2 <- function() { 
    unlist(rapply(x, as.character, how = "replace"), 
     recursive=FALSE, use.names=FALSE) 
} 

## Benchmarking 
library(microbenchmark) 
microbenchmark(fun1(), fun2(), times = 50) 
# Unit: milliseconds
#    expr       min        lq    median        uq       max neval
#  fun1() 435.84992 475.17146 497.63325 533.68488 1570.6814    50
#  fun2()  50.90449  55.79023  63.85908  70.78956  111.0357    50

## Comparison of results 
all.equal(fun1(), fun2(), check.attributes=FALSE) 
# [1] TRUE

Source

2014-07-16 04:23:38 A5C1D2H2I1M1N2O1R2T1
