既然你想在每个状态的基础上tapply
结果可能是你想要的。
为了说明,让我们产生一些任意的数据一起玩:
set.seed(349) # For replication
n <- 20000 # Sample size
gender <- sample(c('M', 'W'), size = n, replace = TRUE) # Random selection of gender
state <- c('AL','AK','AZ','AR','CA','CO','CT','DE','DC','FL','GA','HI',
'ID','IL','IN','IA','KS','KY','LA','ME','MD','MA','MI','MN',
'MS','MO','MT','NE','NV','NH','NJ','NM','NY','NC','ND','OH',
'OK','OR','PA','RI','SC','SD','TN','TX','UT','VT','VA','WA',
'WV','WI','WY') # All US states
state <- sample(state, size = n, replace = TRUE) # Random selection of the states
state_index <- tapply(state, state) # Just for the data generatino part ...
gender_index <- tapply(gender, gender)
# Generate salaries
salary <- runif(length(unique(state)))[state_index] # Make states different
salary <- salary + c(.02, -.02)[gender_index] # Make gender different
salary <- salary + log(50) + rnorm(n) # Add mean and error term
salary <- exp(salary) # The variable of interest
你问,薪金为每个州妇女总和与总工资的每个州的总和是什么:
salary_w <- tapply(salary[gender == 'W'], state[gender == 'W'], sum)
salary_total <- tapply(salary, state, sum)
或者如果它是在一个数据帧:
salary_w <- with(myData, tapply(salary[gender == 'W'], state[gender == 'W'], sum))
salary_total <- with(myData, tapply(salary, state, sum))
那么答案是:
> salary_w/salary_total
AK AL AR AZ CA CO CT DC
0.4667424 0.4877013 0.4554831 0.4959573 0.5382478 0.5544388 0.5398104 0.4750799
DE FL GA HI IA ID IL IN
0.4684846 0.5365707 0.5457726 0.4788805 0.5409347 0.4596598 0.4765021 0.4873932
KS KY LA MA MD ME MI MN
0.5228247 0.4955802 0.5604342 0.5249406 0.4890297 0.4939574 0.4882687 0.5611435
MO MS MT NC ND NE NH NJ
0.5090843 0.5342312 0.5492702 0.4928284 0.5180169 0.5696885 0.4519603 0.4673822
NM NV NY OH OK OR PA RI
0.4391634 0.4380065 0.5366625 0.5362918 0.5613301 0.4583937 0.5022793 0.4523672
SC SD TN TX UT VA VT WA
0.4862358 0.4895377 0.5048047 0.4443220 0.4881062 0.4880047 0.5338397 0.5136393
WI WV WY
0.4787588 0.5495602 0.5029816
相关问题[这里](http://stackoverflow.com/questions/4337170/calculating-subtotals-in-r) – Chase 2010-12-03 02:57:05