2011-05-12 66 views
0

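Table with country names

My goal is to create a table that gives an overview of the countries featured in my sample. The table should have only two rows: the first row has one column per region, and the second row lists the country names under their respective region.

To give you an example, this is what my data.frame XYZ looks like: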

    wvs5red2.s003names  wvs5red2.regiondummies
21  "Hong Kong"         Asian Tigers
45  "South Korea"       Asian Tigers
49  "Taiwan"            Asian Tigers
66  "China"             East Asia & Pacific
80  "Indonesia"         East Asia & Pacific
86  "Malaysia"          East Asia & Pacific

My goal is to get a table that looks something like this:

region     Asian Tigers                    East Asia & Pacific
countries  Hong Kong, South Korea, Taiwan  China, Indonesia, etc.

Do you have any idea how to get such a table? I have spent hours looking for something like this.

+3

Your data.frame hurts my eyes. Please follow the advice given here: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – 2011-05-12 13:46:43

+0

@Joris Yes, mine too, sorry, and thanks for the link. – Tobias 2011-05-12 15:24:23

Answers

2

First, create the data:

> country<-c("Hong Kong","Taiwan","China","Indonesia") 
> region<-rep(c("Asian Tigers","East Asia & Pacific"),each=2) 
> df<-data.frame(country=country,region=region) 

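Then split by the column region and collect all the countries in each group. We could use tapply, but I will use dlply from the package plyr, because it preserves the list names.

> library(plyr) 
> ll<-dlply(df,~region,function(d)paste(d$country,collapse=",")) 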
> ll 
$`Asian Tigers` 
[1] "Hong Kong,Taiwan" 

$`East Asia & Pacific` 
[1] "China,Indonesia" 

attr(,"split_type") 
[1] "data.frame" 
attr(,"split_labels") 
               region 
1        Asian Tigers 
2 East Asia & Pacific 

Now convert the list to a data.frame using do.call. Since we want nice column names, we need to pass the argument check.names=FALSE:

> ll$check.names <- FALSE 
> do.call("data.frame",ll) 
      Asian Tigers East Asia & Pacific 
1 Hong Kong,Taiwan     China,Indonesia 
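
For readers without plyr, the same one-row table can be sketched in base R. This is an illustrative alternative rather than the method used in this answer; the object names by_region, collapsed and wide are made up for the example, and the data is the same four-country sample as above.

```r
# Same sample data as above, with strings kept as character vectors
df <- data.frame(
  country = c("Hong Kong", "Taiwan", "China", "Indonesia"),
  region  = rep(c("Asian Tigers", "East Asia & Pacific"), each = 2),
  stringsAsFactors = FALSE
)

# split() returns a list named by region, so the names survive
by_region <- split(df$country, df$region)

# Collapse each group of countries into a single string
collapsed <- lapply(by_region, paste, collapse = ",")

# check.names = FALSE keeps the spaces and "&" in the column names
wide <- do.call(data.frame,
                c(collapsed, check.names = FALSE, stringsAsFactors = FALSE))
wide
```

Without check.names = FALSE, data.frame would rewrite the region names into syntactically valid ones such as Asian.Tigers.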
+1

'as.data.frame(ll,optional = TRUE)' may be better than 'do.call' in this case. – 2011-05-12 14:12:05

+1

@Joshua, yes, but then the OP would not be introduced to 'do.call' :) – mpiktas 2011-05-12 14:14:45

+0

Touché, good point. :) – 2011-05-12 14:17:16

3

Recreate the data:

dat <- data.frame(
    country = c("Hong Kong", "South Korea", "Taiwan", "China", "Indonesia", "Malaysia"), 
    region = c(rep("Asian Tigers", 3), rep("East Asia & Pacific", 3)) 
) 
dat 

      country              region 
1   Hong Kong        Asian Tigers 
2 South Korea        Asian Tigers 
3      Taiwan        Asian Tigers 
4       China East Asia & Pacific 
5   Indonesia East Asia & Pacific 
6    Malaysia East Asia & Pacific 

Use ddply from the package plyr in combination with paste to summarise the data:

library(plyr) 
ddply(dat, .(region), function(x)paste(x$country, collapse= ",")) 

               region                           V1 
1        Asian Tigers Hong Kong,South Korea,Taiwan 
2 East Asia & Pacific     China,Indonesia,Malaysia 
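
A similar long-format summary can also be sketched in base R with aggregate, which lets the result column carry a descriptive name instead of V1. This is an illustrative alternative, not part of the answer above; the names agg and countries are chosen only for this example.

```r
# Same sample data as above, with strings kept as character vectors
dat <- data.frame(
  country = c("Hong Kong", "South Korea", "Taiwan",
              "China", "Indonesia", "Malaysia"),
  region  = c(rep("Asian Tigers", 3), rep("East Asia & Pacific", 3)),
  stringsAsFactors = FALSE
)

# One row per region; collapse = ", " is passed through to paste()
agg <- aggregate(country ~ region, data = dat, FUN = paste, collapse = ", ")

# Give the collapsed column a descriptive name
names(agg)[2] <- "countries"
agg
```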
+0

Works perfectly too, thanks. (mpiktas's solution has the slight advantage of showing the regions as column names, and I can only mark one answer as best...) – Tobias 2011-05-12 15:30:41

4

The simple way is tapply:

XYZ <- structure(list(
    names = structure(c(2L, 5L, 6L, 1L, 3L, 4L), .Label = c("China", "Hong Kong", "Indonesia", "Malaysia", "South Korea", "Taiwan"), class = "factor"), 
    region = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("Asian Tigers", "East Asia & Pacific"), class = "factor")), 
    .Names = c("names", "region"), row.names = c(NA, -6L), class = "data.frame") 

tapply(XYZ$names, XYZ$region, paste, collapse=", ") 
#                     Asian Tigers              East Asia & Pacific 
# "Hong Kong, South Korea, Taiwan"     "China, Indonesia, Malaysia" 
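
If the two-row layout from the question is wanted, the tapply result can be reshaped into a small character matrix. The following is only a sketch under that assumption; the names res, v and tab are hypothetical, and the data is rebuilt with plain character columns so the block runs on its own.

```r
# Same countries and regions as XYZ above, as plain character columns
XYZ <- data.frame(
  names  = c("Hong Kong", "South Korea", "Taiwan",
             "China", "Indonesia", "Malaysia"),
  region = rep(c("Asian Tigers", "East Asia & Pacific"), each = 3),
  stringsAsFactors = FALSE
)

# tapply() returns a one-dimensional array named by region
res <- tapply(XYZ$names, XYZ$region, paste, collapse = ", ")

# Turn the array into an ordinary named character vector
v <- setNames(as.vector(res), names(res))

# Two rows: region labels on top, collapsed country names below
tab <- rbind(region = names(v), countries = v)
tab
```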
+0

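Also very nice, thanks. Same issue as with Andrie's solution: I can only vote for one best answer (and the array I get when implementing your solution does not work as well with toLatex from the memisc package as mpiktas's code does). – Tobias 2011-05-12 15:39:21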