2016-12-19

Casting a data frame

My data looks like this:

dput(head(dat, 10))
structure(list(Label = c("Nuclear Blast", "Nuclear Blast", "Nuclear Blast", 
        "Nuclear Blast", "Nuclear Blast", "Nuclear Blast", "Nuclear Blast", 
        "Metal Blade Records", "Metal Blade Records", "Metal Blade Records" 
), Info = c("Germany", " +49 7162 9280-0 ", "active", " N/A ", 
     "1987", "\n\t\t\t\t\t\t\t\t\tAnstalt Records,\t\t\t\t\t\t\t\t\tArctic Serenades,\t\t\t\t\t\t\t\t\tCannibalised Serial Killer,\t\t\t\t\t\t\t\t\tDeathwish Office,\t\t\t\t\t\t\t\t\tEpica,\t\t\t\t\t\t\t\t\tGore Records,\t\t\t\t\t\t\t\t\tGrind Syndicate Media,\t\t\t\t\t\t\t\t\tNuclear Blast America,\t\t\t\t\t\t\t\t\tNuclear Blast Brasil,\t\t\t\t\t\t\t\t\tNuclear Blast Entertainment,\t\t\t\t\t\t\t\t\tRadiation Records,\t\t\t\t\t\t\t\t\tRevolution Entertainment\t\t\t\t\t  ", 
     "Yes", " 5737 Kanan Road #143\n\nAgoura Hills, California 91301 ", 
     "United States", " N/A ")), .Names = c("Label", "Info"), row.names = c(NA, 
                       10L), class = "data.frame") 

How can I reshape it so that it looks like the following?

Label     Var1   Var2   Var3  Var4 Var5 Var6    Var7 
1 Nuclear Blast  Germany  +49 7162 9280-0  active N/A 1987 Anstalt Records... Yes 
2 Metal Blade Records 5737 Kanan.. United States  N/A 

I know the number of rows per label is inconsistent, but I can clean that up afterwards in Excel or R.

Why does 'none' from row 8 appear in row 3, column 4 of the transformed data? – mt1022

I can't see the pattern. Please change the expected output to match your example. – Sotos

Answers

3

Try this:

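library(data.table) 
setDT(dat)  # convert dat to a data.table by reference

# number the rows within each Label: Var1, Var2, ...
dat[, Col := paste0('Var', 1:.N), by = 'Label'] 

# cast to wide format: one row per Label, one column per Var
# (dcast() dispatches to the data.table method once dat is a data.table)
dat = dcast(dat, Label ~ Col, value.var = 'Info') 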

Works on the sample data provided, but not on the original data. I have added a reproducible example with 'dput' to the original question. – torentino

Works like a charm, thank you! – torentino
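
One thing to watch with the data.table approach: dcast sorts the new column names, so with character names a label that has ten or more rows would come out as Var1, Var10, Var2, and so on. A minimal sketch of one way around that, assuming data.table 1.9.8 or later for rowid(): cast on an integer index and add the "Var" prefix afterwards. The toy data and the names toy, idx and wide are made up for illustration and are not from the question.

```r
library(data.table)

# Hypothetical toy data in the same long shape as the question's dat
toy <- data.table(
  Label = c("A", "A", "A", "B", "B"),
  Info  = c("a1", "a2", "a3", "b1", "b2")
)

# Integer position of each row within its Label (1, 2, 3, ...)
toy[, idx := rowid(Label)]

# Cast to wide; integer idx values are sorted numerically, not alphabetically
wide <- dcast(toy, Label ~ idx, value.var = "Info")

# Prefix the value columns so they read Var1, Var2, ...
setnames(wide, names(wide)[-1], paste0("Var", names(wide)[-1]))

print(wide)
```

Labels with fewer rows than the longest one are padded with NA in the trailing columns.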

1

Here is an option using dplyr/tidyr:

library(dplyr) 
library(tidyr) 
dat %>% 
    group_by(Label) %>% #group by Label 
    mutate(Col = paste0("Var", row_number())) %>% #create a sequence column 
    spread(Col, Info) #spread to wide format
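
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(), which creates the new columns in order of first appearance rather than sorting them. The following is a sketch of the same idea under that assumption, with an optional whitespace cleanup step added because the Info values in the question contain stray tabs and newlines. The toy data and the names toy and wide are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data with messy whitespace, shaped like the question's dat
toy <- tibble(
  Label = c("A", "A", "A", "B", "B"),
  Info  = c(" a1 ", "a2", "a\n\t3", "b1", " b2 ")
)

wide <- toy %>%
  # collapse runs of whitespace to one space, then trim both ends
  mutate(Info = trimws(gsub("\\s+", " ", Info))) %>%
  group_by(Label) %>%
  mutate(Col = paste0("Var", row_number())) %>%  # sequence within each Label
  ungroup() %>%
  pivot_wider(names_from = Col, values_from = Info)

print(wide)
```

Dropping the first mutate() leaves the values untouched if the cleanup is better done later.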