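Looping over a list of data frames to split them (incorrect number of dimensions error)

I have a very large dataset that I have split into 50 chunks, so the files look like: file1, file2, file3, ..., file50 (all data frames).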

file_total <- c(file1,...,file50)

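I know this combines them into a list, but I can't use rbind because the full dataset is huge, and the plyr library just takes forever to run.

Within each file, I have to split the data by one factor, call it "id", and then write each id subset to a .csv file.

My code so far is: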

d_split <- split(file1, file1[1]) 

library(plyr)
id <- unlist(lapply(d_split,"[",1,1)) # this returns the unique id 

for (j in seq_along(id)) 
{ 
    write.csv(d_split[[j]], file=paste(id[j], "csv", sep=".")) 
}

This works!

But when I try to wrap it in another for loop, it doesn't work:

for (i in file_total) 
{ 
    d_split <- split(i, i[1]) 
    id <- unlist(lapply(d_split,"[",1,1)) 
    for (j in seq_along(id)) 
    { 
     write.csv(d_split[[j]], file=paste(id[j], "csv", sep=".")) 
    } 
}

It returns the following error message:

Error in FUN(X[[1L]], ...) : incorrect number of dimensions

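I could do this manually by copying and pasting the code for each of the 50 files, but I was wondering whether anyone could fix my code so that it handles everything in one run.

Source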

2012-08-25 user1489597

Are 'file1', 'file2', etc. each a data frame? –

The problem comes from how you combine the data. Instead of combining the data frames with c, make them a list:

file_total <- list(file1,...,file50)

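At that point, looping with i in file_total will iterate over the data frames, which is what you want.

As a note: using c on data frames (which I assume file1 and file2 are) actually turns them into a list of vectors, not a list of data frames. For example: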
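Putting the fix together with the loop from the question, a minimal sketch might look like the following. The two small data frames are hypothetical stand-ins for file1 ... file50, the first column is assumed to hold the id, and the output goes to a temporary directory purely for illustration.

```r
# Hypothetical stand-ins for file1 ... file50; the first column is the id
file1 <- data.frame(id = c("a", "a", "b"), value = 1:3)
file2 <- data.frame(id = c("c", "d", "d"), value = 4:6)

# list() keeps each data frame intact; c() would flatten them into columns
file_total <- list(file1, file2)

# Assumption: write into a temporary directory rather than the working directory
out_dir <- tempdir()

for (df in file_total) {
  # Split on the first column; the names of the result are the id values
  d_split <- split(df, df[[1]])
  for (id in names(d_split)) {
    write.csv(d_split[[id]],
              file = file.path(out_dir, paste(id, "csv", sep = ".")))
  }
}
```

One thing to watch: if the same id can appear in more than one of the 50 chunks, a later chunk would overwrite the .csv written by an earlier one, so the file name may need to include the chunk index as well.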

file1 = data.frame(x=1:20) 
file2 = data.frame(y=20:40) 
file_total = c(file1, file2) 
# file_total will be: 
# $x 
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 
# 
# $y 
# [1] 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

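So iterating over the result actually iterates over the individual columns as vectors. Using list to combine them, however, lets you iterate over the data frames themselves: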

> list(file1, file2) 
[[1]] 
    x 
1 1 
2 2 
3 3 
4 4 
5 5 
6 6 
7 7 
8 8 
9 9 
10 10 
11 11 
12 12 
13 13 
14 14 
15 15 
16 16 
17 17 
18 18 
19 19 
20 20 

[[2]] 
    y 
1 20 
2 21 
3 22 
4 23 
5 24 
6 25 
7 26 
8 27 
9 28 
10 29 
11 30 
12 31 
13 32 
14 33 
15 34 
16 35 
17 36 
18 37 
19 38 
20 39 
21 40
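To connect this back to the error in the question, here is a small sketch (reusing file1 and file2 from the example above) that checks what each combination contains and reproduces the failure. The variable names combined_c, combined_list, v_split and msg are only illustrative.

```r
file1 <- data.frame(x = 1:20)
file2 <- data.frame(y = 20:40)

combined_c    <- c(file1, file2)     # list of two integer vectors
combined_list <- list(file1, file2)  # list of two data frames

sapply(combined_c, is.data.frame)     # both FALSE
sapply(combined_list, is.data.frame)  # both TRUE

# With c(), each loop element is a vector, so split() returns a list of
# vectors, and indexing a vector with two subscripts raises the error
v_split <- split(combined_c[[1]], combined_c[[1]][1])
msg <- tryCatch(v_split[[1]][1, 1],
                error = function(e) conditionMessage(e))
print(msg)
```

The lapply(d_split, "[", 1, 1) call in the question performs that same two-subscript indexing on each element, which is why the error only shows up once the elements are vectors instead of data frames.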

Source

2012-08-25 01:18:09

Sweet, thanks! – user1489597
