2012-04-30 91 views
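"Aggregating" non-numeric variables in reshape

I have a dataset in long format and want to convert it to wide format using reshape, or any preprocessing needed before reshape. The difficulty is that the "value" variable is non-numeric. Note that the raw data also contains legitimate duplicate records. The following code shows the layout of each dataset.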

id <- c(1, 1, 1, 1, 1, 1, 1)
month <- c("jan", "feb", "feb", "march", "april", "april", "april") 
stress <- c("mild", "mild", "high", "moderate", "mild", "high", "mild") 
Longdata <- data.frame(id, month, stress, stringsAsFactors = FALSE) 

This is the original long format:

> Longdata
  id month   stress
1  1   jan     mild
2  1   feb     mild
3  1   feb     high
4  1 march moderate
5  1 april     mild
6  1 april     high
7  1 april     mild

This is how I would like the data to be organized:

id <- c(1) 
jan <- c("mild") 
feb <- c("mild-high") 
march <- c("moderate") 
april <- c("mild-high-mild") 
widedata <- data.frame(id, jan, feb, march, april, stringsAsFactors = FALSE) 
> widedata
  id  jan       feb    march          april
1  1 mild mild-high moderate mild-high-mild

Answer

You can do this in two steps: first with aggregate, then with either base R's reshape or dcast from the "reshape2" package.

  1. The aggregation step:

    Mediumdata <- aggregate(stress ~ id + month, Longdata, paste, collapse="-") 
    Mediumdata 
    #   id month         stress
    # 1  1 april mild-high-mild
    # 2  1   feb      mild-high
    # 3  1   jan           mild
    # 4  1 march       moderate
    
  2. The reshaping step:

    # Using base R reshape 
    reshape(Mediumdata, direction="wide", idvar="id", timevar="month") 
    #   id   stress.april stress.feb stress.jan stress.march
    # 1  1 mild-high-mild  mild-high       mild     moderate
    
    # Using `dcast` from "reshape2"
    library(reshape2)
    dcast(Mediumdata, id ~ month, value.var="stress")
    #   id          april       feb  jan    march
    # 1  1 mild-high-mild mild-high mild moderate
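
Both results above list the months alphabetically, whereas the question asks for jan, feb, march, april. One possible way to get that order, sketched here in base R only, is to turn month into a factor with explicit levels before aggregating. The name month_levels and the final sub() renaming are assumptions made for this illustration and are not part of the original answer.

```r
# Rebuild the example data from the question.
Longdata <- data.frame(
  id     = rep(1, 7),
  month  = c("jan", "feb", "feb", "march", "april", "april", "april"),
  stress = c("mild", "mild", "high", "moderate", "mild", "high", "mild"),
  stringsAsFactors = FALSE
)

# Assumption: the desired column order is the calendar order of the months
# that occur in the data (month_levels is a name made up for this sketch).
month_levels <- c("jan", "feb", "march", "april")
Longdata$month <- factor(Longdata$month, levels = month_levels)

# Step 1: collapse duplicate id/month records into one "-"-separated string.
# Groups are returned in factor-level order rather than alphabetical order.
Mediumdata <- aggregate(stress ~ id + month, data = Longdata,
                        FUN = paste, collapse = "-")

# Step 2: reshape to wide; columns follow the order in which the months
# appear in Mediumdata.
Widedata <- reshape(Mediumdata, direction = "wide",
                    idvar = "id", timevar = "month")

# Optional: drop the "stress." prefix so the names match the requested layout.
names(Widedata) <- sub("^stress\\.", "", names(Widedata))
Widedata
```

With the factor in place, the wide columns should come out as id, jan, feb, march, april. Within each cell, the values keep the order in which the rows appear in Longdata, so sort the long data first if a different order is needed.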