2015-11-02 82 views

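Merge multiple files into one data frame and assign the file names as column names

I have a list of files, and I wrote a function that processes each file and returns two columns ("name" and "value").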

file_list <- list.files(pattern = "\\.txt$")
sample_name <- sub(".*?lvl.(.*?)\\.txt$", "\\1", file_list)

for (i in seq_along(file_list)) {
  x <- cleanMyData(file_list[i])  # this function returns a two-column data frame
  # Then I want to merge all of these processed data frames into one, joining the "value" columns on the "name" column.
  # At the same time, I want each value column to be named after its file. I have already processed the file names into sample_name.
}

To be clearer, here is an example of my processed data:

file: apple.txt 
name value 
A  12 
B  13 
C  14 

file: pear.txt 
name value 
A  15 
B  14 
C  20 
D  21 

Desired output:

Apple Pear 
A 12 15 
B 13 14 
C 14 20 
You could simply cbind() the two data frames, but that assumes the rows line up exactly. Another option is to merge() the two data frames on the 'name' column.
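A minimal sketch of the difference between the two approaches, using the example tables typed in by hand. The objects `apple`, `pear`, `cbind_result` and `merged` are illustrative names, not part of the original code.

```r
apple <- data.frame(name = c("A", "B", "C"), value = c(12, 13, 14))
pear  <- data.frame(name = c("A", "B", "C", "D"), value = c(15, 14, 20, 21))

# cbind() pairs rows by position, so it fails here: the row counts differ (3 vs 4)
cbind_result <- try(cbind(apple, pear), silent = TRUE)
inherits(cbind_result, "try-error")

# merge() pairs rows by the key column; the default inner join drops "D",
# which only appears in pear
merged <- merge(apple, pear, by = "name", suffixes = c(".apple", ".pear"))
merged
```

Because merge() matches on the key rather than on row position, it still works when the files list the names in different orders or do not share all of them.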

Answers


You could try

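fns <- c("apple.txt", "pear.txt")
# Read each file, naming its value column after the file (without the extension)
dfs <- lapply(seq_along(fns), function(i) {
  read.table(fns[i],
             header = TRUE,
             col.names = c("name", tools::file_path_sans_ext(fns[i])))
})
# Inner-join all data frames on the shared "name" column
(df <- Reduce(function(x, y) merge(x, y, all = FALSE), dfs))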
# name  apple  pear 
# 1 A  12  15 
# 2 B  13  14 
# 3 C  14  20 
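The same idea can probably be adapted to the loop in the question. The sketch below is only an assumption about how that might look: `cleanMyData()` here is a hypothetical stand-in for the asker's function (assumed to return a data frame with columns "name" and "value"), the example files are written to a temporary directory so the code runs anywhere, and `sample_name` is derived from the file name rather than from the asker's regular expression.

```r
# Hypothetical stand-in for the asker's cleanMyData(); assumed to return
# a data frame with columns "name" and "value"
cleanMyData <- function(path) read.table(path, header = TRUE)

# Write the two example files to a temporary directory
tmp <- tempfile("merge_demo")
dir.create(tmp)
writeLines(c("name value", "A 12", "B 13", "C 14"), file.path(tmp, "apple.txt"))
writeLines(c("name value", "A 15", "B 14", "C 20", "D 21"), file.path(tmp, "pear.txt"))

file_list   <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)
sample_name <- tools::file_path_sans_ext(basename(file_list))

# Process each file and rename its "value" column to the sample name
processed <- Map(function(f, s) {
  x <- cleanMyData(f)
  names(x)[names(x) == "value"] <- s
  x
}, file_list, sample_name)

# Inner-join everything on "name"
merged <- Reduce(function(x, y) merge(x, y, by = "name"), processed)
merged
```

Renaming the value column before merging avoids the automatic `.x`/`.y` suffixes that merge() adds when both inputs have a column called "value".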

To capitalize the first character, you could use something like

sub("\\b(\\w)", "\\U\\1", fns, perl=TRUE) 

(see ?sub)

To get rid of the name column, you could use subset(df, select = -name)
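Putting the last two steps together, one way to get close to the desired layout might be the following. The data frame is typed in by hand to match the merged result, and moving "name" into the row names is an assumption about what the asker wants, not something stated in the answer.

```r
df <- data.frame(name = c("A", "B", "C"),
                 apple = c(12, 13, 14),
                 pear = c(15, 14, 20))

# Capitalize the first letter of each value column name
names(df)[-1] <- sub("\\b(\\w)", "\\U\\1", names(df)[-1], perl = TRUE)

# Keep the names as row labels, then drop the "name" column
rownames(df) <- df$name
out <- subset(df, select = -name)
out
```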
