Merging .csv files with R

I have 3 files with 3 variables each: date, ID and price. I want to merge them by date, so if my current files look like this:

date  ID Price 
01/01/10 A 1 
01/02/10 A 1.02 
01/03/10 A 0.99 
... 
...

I would like to get a merged file that looks like the following, for IDs A, B and C (Pr stands for price):

date  Pr.A Pr.B Pr.C  
01/01/10 1  NA NA 
01/02/10 1.02 1.2 NA 
01/03/10 0.99 1.3 1 
01/04/10 NA  1.23 2 
01/05/10 NA  NA 3

Note that some dates have no price, so the value is NA in those cases.

My current approach works, but it feels a bit clumsy.

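setwd('~where you put the files') 
library(plyr) 
listnames = list.files(pattern='\\.csv$') # pattern is a regex: escape the dot, anchor the end 
pp1 = ldply(listnames,read.csv,header=TRUE) # stack all the files into one data.frame 

names(pp1)=c('date','ID','price') 
# %Y expects 4-digit years; use '%m/%d/%y' if the files hold 2-digit years as in the sample above 
pp1$date = as.Date(pp1$date,format='%m/%d/%Y') 

# Reshape the data frame so it is organized by date, one price column per ID 
pp1=reshape(pp1,timevar='ID',idvar='date',direction='wide')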

Can anyone think of a better way?

Source

2011-12-07 aatrujillob

See http://stackoverflow.com/questions/1562124/merge-many-data-frames-from-csv-files

One note: the linked file "a1.csv" contains several extra comma-separated rows with no data. I deleted them by hand rather than handling that in R code.
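For readers who would rather drop such empty rows in R than by hand, here is a minimal sketch. The CSV text is made up to imitate the situation described in the comment (the real "a1.csv" is not reproduced here), and the names csv_text, raw and clean are purely illustrative.

```r
# Made-up CSV text imitating a file with trailing ",," rows that hold no data.
csv_text <- "date,ID,Price
1/1/2010,A,1
1/2/2010,A,1.02
1/3/2010,A,0.99
,,
,,"

# Treat empty fields as NA so that the data-free rows become all-NA rows.
raw <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE,
                na.strings = c("", "NA"))

# Keep only rows that have at least one non-missing field.
clean <- raw[rowSums(!is.na(raw)) > 0, ]
print(clean)
```

With real files, the same filter could be applied inside the function passed to lapply or ldply, right after read.csv.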

I actually think what you did with 'reshape' here is a pretty good choice. – joran
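To make the reshape step easy to try without the original files, the following sketch runs it on a small in-memory data frame. The values are invented to match the example table in the question, and the final renaming from price.X to Pr.X is an added convenience, not part of the asker's code.

```r
# Toy long-format data standing in for the stacked CSV files (made-up values).
pp1 <- data.frame(
  date  = c("01/01/2010", "01/02/2010", "01/03/2010",
            "01/02/2010", "01/03/2010", "01/04/2010",
            "01/03/2010", "01/04/2010", "01/05/2010"),
  ID    = rep(c("A", "B", "C"), each = 3),
  price = c(1, 1.02, 0.99, 1.2, 1.3, 1.23, 1, 2, 3),
  stringsAsFactors = FALSE
)
pp1$date <- as.Date(pp1$date, format = "%m/%d/%Y")

# One row per date, one price column per ID; missing combinations become NA.
wide <- reshape(pp1, timevar = "ID", idvar = "date", direction = "wide")
wide <- wide[order(wide$date), ]

# reshape() names the new columns price.A, price.B, ...; shorten them to Pr.A, ...
names(wide) <- sub("^price\\.", "Pr.", names(wide))
rownames(wide) <- NULL
print(wide)
```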

This looks like a job for Reduce():

# Read the files in to a single list, removing unwanted second column from each. 
dataDir <- "example" 
fNames <- dir(dataDir) 
dataList <- lapply(file.path(dataDir, fNames), 
        function(X) {read.csv(X, header=TRUE)[-2]}) 

# Merge them     
out <- Reduce(function(x,y) merge(x,y, by=1, all=TRUE), dataList) 

# Construct column names 
names(out)[-1] <- paste("Pr.", toupper(sub("1.csv", "", fNames)), sep="") 
out 
#  date Pr.A Pr.B Pr.C 
# 1 1/1/2010 1.00 NA NA 
# 2 1/2/2010 1.02 1.20 NA 
# 3 1/3/2010 0.99 1.30 1 
# 4 1/4/2010 NA 1.23 2 
# 5 1/5/2010 NA NA 3

Actually, your approach looks just fine to me, but I can see preferring the simplicity and syntactic transparency of the call to Reduce.

Source
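The same Reduce/merge idea can be tried without any files on disk. The sketch below uses three small in-memory data frames with made-up values, and, unlike the code above, it renames each price column before merging (an illustrative variation) so the merged column names do not depend on the file names.

```r
# Three small per-ID data frames standing in for the CSV files (made-up values).
a <- data.frame(date = c("1/1/2010", "1/2/2010", "1/3/2010"),
                price = c(1, 1.02, 0.99), stringsAsFactors = FALSE)
b <- data.frame(date = c("1/2/2010", "1/3/2010", "1/4/2010"),
                price = c(1.2, 1.3, 1.23), stringsAsFactors = FALSE)
c3 <- data.frame(date = c("1/3/2010", "1/4/2010", "1/5/2010"),
                 price = c(1, 2, 3), stringsAsFactors = FALSE)
dataList <- list(A = a, B = b, C = c3)

# Rename each price column up front so the merged names are unambiguous.
dataList <- Map(function(df, id) setNames(df, c("date", paste0("Pr.", id))),
                dataList, names(dataList))

# Fold the list pairwise, i.e. merge(merge(A, B), C); all = TRUE keeps every
# date from either side (a full outer join), filling the gaps with NA.
out <- Reduce(function(x, y) merge(x, y, by = "date", all = TRUE), dataList)
print(out)
```

Note that the dates are merged as character strings here; converting them with as.Date first would be safer when the data span more than a few days, since character sorting is not chronological in general.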

2011-12-07 07:29:22

How does Reduce perform in terms of speed?

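@PaulHiemstra: My guess is not well (since it probably (??) creates a new data.frame for each merge operation). I'm not certain, but I'd say that if speed is a concern, I wouldn't recommend 'Reduce'.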

Interesting choice to use Reduce. I didn't know R had this kind of functional programming method built in. – LouisChiffre

I can't access the files because I'm behind a corporate firewall. Once you have built the data.frame, I would use the cast method.

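library(reshape) # cast() comes from the reshape package 
res = cast(pp1,date~ID,value="price",mean) # the column was renamed to lowercase 'price' in the question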
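For readers without the reshape package, a similar date-by-ID table of mean prices can be sketched in base R with tapply. This is a swapped-in technique rather than cast itself: it returns a matrix instead of a data frame. The data are made up and include one duplicated date/ID pair to show the averaging.

```r
# Made-up long-format data; ID "A" has two prices on 2010-01-02.
pp1 <- data.frame(
  date  = c("2010-01-01", "2010-01-02", "2010-01-02", "2010-01-02", "2010-01-03"),
  ID    = c("A", "A", "A", "B", "B"),
  price = c(1, 1.02, 0.98, 1.2, 1.3),
  stringsAsFactors = FALSE
)

# Cross-tabulate price by date (rows) and ID (columns), averaging duplicates.
# Date/ID combinations with no rows come out as NA.
res <- tapply(pp1$price, list(date = pp1$date, ID = pp1$ID), mean)
print(res)
```

The result can be turned into a data frame with as.data.frame(res) if a matrix is inconvenient; the dates then end up in the row names.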

Source

2011-12-07 12:26:17 LouisChiffre
