2012-07-30 86 views

Reshaping a data frame: changing rows to columns

Suppose we have data that looks like this:

set.seed(7302012) 

county         <- rep(letters[1:4], each = 2)
state          <- rep(LETTERS[1], times = 8)
industry       <- rep(c("construction", "manufacturing"), 4)
employment     <- round(rnorm(8, 100, 50), 0)
establishments <- round(rnorm(8, 20, 5), 0)

data <- data.frame(state, county, industry, employment, establishments) 

  state county      industry employment establishments
1     A      a  construction        146             19
2     A      a manufacturing        110             20
3     A      b  construction        121             10
4     A      b manufacturing         90             27
5     A      c  construction        197             18
6     A      c manufacturing         73             29
7     A      d  construction         98             30
8     A      d manufacturing        102             19

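We would like to reshape this data frame so that each row represents one (state-)county rather than one county-industry, with columns construction.employment, construction.establishments, and the analogous columns for manufacturing. What is an efficient way to do this?

One way is to subset: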

construction <- data[data$industry == "construction", ] 
names(construction)[4:5] <- c("construction.employment", "construction.establishments") 

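and similarly for manufacturing, then do a merge. This isn't so bad with only two industries, but imagine there are 14; the process becomes tedious (though a for loop over the levels of industry would make it less so).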
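The subset-and-merge idea, generalized to any number of industries, could be sketched roughly as follows. This is only an illustration, not code from the question: the helper name `widen_by_industry` is hypothetical, the data values are copied from the printed table rather than regenerated from the seed, `lapply` stands in for the for loop, and the sketch assumes that `state` and `county` together identify one row within each industry.

```r
# Data copied from the table printed in the question.
data <- data.frame(
  state          = "A",
  county         = rep(letters[1:4], each = 2),
  industry       = rep(c("construction", "manufacturing"), 4),
  employment     = c(146, 110, 121, 90, 197, 73, 98, 102),
  establishments = c(19, 20, 10, 27, 18, 29, 30, 19)
)

# Hypothetical helper: take one subset per industry, prefix its value
# columns with the industry name, then merge all subsets on the id columns.
widen_by_industry <- function(df, id = c("state", "county"), key = "industry") {
  value_cols <- setdiff(names(df), c(id, key))
  pieces <- lapply(unique(as.character(df[[key]])), function(ind) {
    piece <- df[df[[key]] == ind, c(id, value_cols)]
    names(piece)[-seq_along(id)] <- paste(ind, value_cols, sep = ".")
    piece
  })
  # Merge the per-industry pieces one after another.
  Reduce(function(x, y) merge(x, y, by = id), pieces)
}

wide <- widen_by_industry(data)
print(wide)
```

With 14 industries the same call would produce 28 value columns, at the cost of 13 successive merges.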

Any other ideas?

Answers


This can be done with reshape in base R, if I understand your question correctly:

reshape(data, direction="wide", idvar=c("state", "county"), timevar="industry") 
#   state county employment.construction establishments.construction
# 1     A      a                     146                          19
# 3     A      b                     121                          10
# 5     A      c                     197                          18
# 7     A      d                      98                          30
#   employment.manufacturing establishments.manufacturing
# 1                      110                           20
# 3                       90                           27
# 5                       73                           29
# 7                      102                           19
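Note that reshape names the new columns measure.industry (employment.construction), whereas the question asked for industry.measure (construction.employment). One possible way to swap the two parts is a regular-expression rename, sketched below. The data values are copied from the question's printed table, and the pattern assumes the only measure columns are employment and establishments.

```r
# Data copied from the table printed in the question.
data <- data.frame(
  state          = "A",
  county         = rep(letters[1:4], each = 2),
  industry       = rep(c("construction", "manufacturing"), 4),
  employment     = c(146, 110, 121, 90, 197, 73, 98, 102),
  establishments = c(19, 20, 10, 27, 18, 29, 30, 19)
)

wide <- reshape(data, direction = "wide",
                idvar = c("state", "county"), timevar = "industry")

# Turn "employment.construction" into "construction.employment", etc.
# Names that do not match the pattern (state, county) are left unchanged.
names(wide) <- sub("^(employment|establishments)\\.(.*)$", "\\2.\\1", names(wide))
print(names(wide))
```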

And using the reshape package:

library(reshape) 
m <- reshape::melt(data) 
cast(m, state + county~...) 

Which yields:

> cast(m, state + county~...) 
  state county construction_employment construction_establishments manufacturing_employment manufacturing_establishments
1     A      a                     146                          19                      110                           20
2     A      b                     121                          10                       90                           27
3     A      c                     197                          18                       73                           29
4     A      d                      98                          30                      102                           19

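I personally use base reshape, so I probably should have shown this with reshape2 (Wickham), but I forgot there was a reshape2 package. It is slightly different: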

library(reshape2) 
m <- reshape2::melt(data) 
dcast(m, state + county~...) 
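In the casting formula, `...` stands for every variable not otherwise named in the formula, whereas a single `.` stands for no variable at all. A minimal sketch of that equivalence, assuming reshape2 is installed and using data values copied from the question's printed table: spelling out the remaining variables (`industry + variable`) should give the same result as `...`. The `id.vars` argument is passed explicitly here only so the sketch does not depend on melt's guess.

```r
library(reshape2)

# Data copied from the table printed in the question.
data <- data.frame(
  state          = "A",
  county         = rep(letters[1:4], each = 2),
  industry       = rep(c("construction", "manufacturing"), 4),
  employment     = c(146, 110, 121, 90, 197, 73, 98, 102),
  establishments = c(19, 20, 10, 27, 18, 29, 30, 19)
)

# Long format: one row per state, county, industry and measured variable.
m <- melt(data, id.vars = c("state", "county", "industry"))

# "..." expands to the variables not named elsewhere: industry and variable.
wide_dots     <- dcast(m, state + county ~ ...)
wide_explicit <- dcast(m, state + county ~ industry + variable)

print(wide_explicit)
```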

Ah, OK, I was using '.' instead of '...', which is why it wasn't working. Thanks! – Charlie 2012-07-30 17:06:41