2017-05-09 43 views
1

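I'm trying to reformat a data frame that has four columns. One of the columns (dem_profile_description) holds about 20 distinct values, and I'd like to turn those values into columns. I downloaded the reshape package. How do I use reshape in R to turn column values into columns?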

The first few rows of my data frame look like this:

dem_profile_field dem_profile_description dem_profile_data Community 
dpsf0010042  Female 10 to 14 years(1) 4    Gnar 
dpsf0010043  Female 15 to 19 years(2) 20    Yoke 
dpsf0010044  Female 20 to 24 years(3) 22    Law 
dpsf0010045  Female 25 to 29 years(4) 23    Law 
dpsf0010046  Female 30 to 34 years(5) 24    Ark 
dpsf0010047  Female 35 to 39 years(6) 30    Riverland 

I want this:

dem_profile_field Community (1) (2) (3) (4) (5) (6) 
dpsf0010042  Gnar  4 
dpsf0010043  Yoke   20  
dpsf0010044  Law     5 5 
dpsf0010046  Ark      24 
dpsf0010047  Riverland      30 

My code is this:

library(reshape2) 
census3 <- dcast(census2, "dem_profile_field" + "Community" ~ 
"dem_profile_description", value.var = "dem_profile_data")    

But I end up with this:

dem_profile_field Community dem_profile_description 
1     Community  2 
+0

What happens if you leave out all the double quotes in the formula?

+0

@42- I get this error: Error in value.var %in% names(data) : object 'dem_profile_data' not found

+0

Why are there two 5s in line 4?

Answers

2

You basically had it. You just need to leave the quotes out of the formula in the dcast call (you still need them for value.var):

census3 <- dcast(census2, dem_profile_field + Community ~ 
        dem_profile_description, value.var = "dem_profile_data") 
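
For anyone who wants to try this without the full census data, here is a minimal sketch. It assumes a three-row census2 built from the sample rows in the question (the real data frame is larger). The formula uses bare column names, while value.var takes a quoted string.

```r
library(reshape2)

# Assumption: a small stand-in for census2, copied from the question's sample rows
census2 <- data.frame(
  dem_profile_field = c("dpsf0010042", "dpsf0010043", "dpsf0010044"),
  dem_profile_description = c("Female 10 to 14 years(1)",
                              "Female 15 to 19 years(2)",
                              "Female 20 to 24 years(3)"),
  dem_profile_data = c(4, 20, 22),
  Community = c("Gnar", "Yoke", "Law"),
  stringsAsFactors = FALSE
)

# Bare names in the formula; the value column is passed as a string
census3 <- dcast(census2,
                 dem_profile_field + Community ~ dem_profile_description,
                 value.var = "dem_profile_data")

# One row per field/Community pair, one column per description
names(census3)
```

Each cell holds either the matching value or NA. If a field/Community/description combination occurs more than once, dcast falls back to counting rows (and prints a message saying so) unless you pass fun.aggregate.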

To get the column names you want, you can also do:

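library(stringr)  # provides str_extract()
# Rename the cast columns to just their parenthesized code, e.g. "(1)"
names_to_replace <- grepl("(\\(.*\\))", names(census3))
names(census3)[names_to_replace] <- str_extract(names(census3)[names_to_replace], "\\(.*\\)")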
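
If you would rather not load stringr just for this, the same renaming can be sketched in base R. The column names below are hypothetical stand-ins for the dcast output, and the pattern assumes each code sits in parentheses at the very end of the name.

```r
# Hypothetical column names, shaped like the dcast output above
cols <- c("dem_profile_field", "Community",
          "Female 10 to 14 years(1)", "Female 15 to 19 years(2)")

# Only touch names that contain a parenthesized part
has_code <- grepl("\\(.*\\)", cols)

# Keep just the trailing "(...)" group of each matching name
cols[has_code] <- sub("^.*(\\([^()]*\\))$", "\\1", cols[has_code])

cols
```

Note that this differs slightly from the str_extract version: it keeps only the last parenthesized group and leaves a name unchanged if that group is not at the end.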
+0

Thank you so much! It worked!!

+0

No problem. I just edited my code so you can get the names the way you wanted them.

0

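If you are just getting started with a package for reshaping data, you may want to look at tidyr. The syntax is more straightforward, and it fits in well with the other data manipulation packages in the 'tidyverse'.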

Your example would work like this:

library(tidyr) 

df <- data.frame(
  dem_profile_field = c("dpsf0010042", "dpsf0010043", "dpsf0010044",
                        "dpsf0010045", "dpsf0010046", "dpsf0010047"),
  dem_profile_description = c("Female 10 to 14 years(1)",
                              "Female 15 to 19 years(2)",
                              "Female 20 to 24 years(3)",
                              "Female 25 to 29 years(4)",
                              "Female 30 to 34 years(5)",
                              "Female 35 to 39 years(6)"),
  dem_profile_data = c(4, 20, 22, 23, 24, 30),
  Community = c("Gnar", "Yoke", "Law", "Law", "Ark", "Riverland"),
  stringsAsFactors = FALSE
)

df_transposed <- df %>% 
    spread(dem_profile_description, dem_profile_data) 
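
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(), which does the same reshaping with named arguments. A minimal sketch, assuming a recent tidyr and a shortened three-row copy of the sample data:

```r
library(tidyr)

# Assumption: a three-row subset of the sample data from the question
df_small <- data.frame(
  dem_profile_field = c("dpsf0010042", "dpsf0010043", "dpsf0010044"),
  dem_profile_description = c("Female 10 to 14 years(1)",
                              "Female 15 to 19 years(2)",
                              "Female 20 to 24 years(3)"),
  dem_profile_data = c(4, 20, 22),
  Community = c("Gnar", "Yoke", "Law"),
  stringsAsFactors = FALSE
)

# names_from supplies the new column names, values_from fills the cells
df_wide <- pivot_wider(df_small,
                       names_from = dem_profile_description,
                       values_from = dem_profile_data)

names(df_wide)
```

The remaining columns (dem_profile_field and Community) identify the rows, and combinations with no value are filled with NA.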