2015-09-21 53 views
2

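For example, I would like to build strings like the ones shown below, grouped by the year each person was first elected. In other words: concatenate column strings based on an integer column.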

DF:

Name   Party         FirstElected
Bob    Liberal       1985
Joe    Republican    1985
Sarah  Green         1980
Bill   Libertarian   1980
Tom    Conservative  1987

Goal:

Year  PeopleElected
1985  "Bob (Liberal); Joe (Republican)"
1980  "Sarah (Green); Bill (Libertarian)"
1987  "Tom (Conservative)"

I assume some combination of paste with apply/aggregate can do this... but I haven't had much luck so far.

I don't think regular expressions are needed for this. –

Answers

3

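We can use paste/sprintf to build the format, grouped by 'FirstElected'. We convert the 'data.frame' to a 'data.table' (setDT(df1)) and group by 'FirstElected'. Within each group, sprintf wraps 'Party' in parentheses and joins it to 'Name', and then paste with collapse='; ' combines the results into a single string.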

library(data.table)
setDT(df1)[, list(PeopleElected = paste(sprintf('%s (%s)', Name, Party),
                                        collapse = "; ")),
           by = FirstElected]
#   FirstElected                     PeopleElected
#1:         1985   Bob (Liberal); Joe (Republican)
#2:         1980 Sarah (Green); Bill (Libertarian)
#3:         1987                Tom (Conservative)
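
To make the two steps easier to see in isolation, here is a minimal base R sketch of what happens inside a single group. The vectors `name` and `party` and the result names `labels`, `joined` and `single` are illustrative only and are not part of the answer's code.

```r
# Illustrative vectors standing in for one group (FirstElected == 1985).
name  <- c("Bob", "Joe")
party <- c("Liberal", "Republican")

# sprintf is vectorised: it produces one formatted string per element.
labels <- sprintf("%s (%s)", name, party)

# paste with collapse joins the elements of a vector into a single string.
joined <- paste(labels, collapse = "; ")

# The same result in one call: sep glues the pieces of each element,
# collapse then joins the elements.
single <- paste(name, " (", party, ")", sep = "", collapse = "; ")

print(labels)
print(joined)
print(identical(joined, single))
```

Grouping with by = FirstElected simply repeats this per year.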

Or use a single paste call:

setDT(df1)[, list(PeopleElected=paste(Name, ' (', Party, ')', 
      sep='', collapse='; ')) , by=FirstElected] 

As helpful and educational as always, akrun. Thank you very much. – lnNoam

2

And a dplyr approach (since I don't speak data.table yet):

library(dplyr)

# Example data from the question
df1 <- data.frame(Name = c("Bob", "Joe", "Sarah", "Bill", "Tom"),
                  Party = c("Liberal", "Republican", "Green", "Libertarian",
                            "Conservative"),
                  FirstElected = c(1985, 1985, 1980, 1980, 1987))

# Build "Name (Party)" per row, then collapse the rows of each year
df1 %>%
    group_by(FirstElected) %>%
    summarise(PeopleElected = paste0(paste0(Name, " (", Party, ")"),
                                     collapse = "; "))

Source: local data frame [3 x 2] 

  FirstElected                     PeopleElected
         (dbl)                             (chr)
1         1980 Sarah (Green); Bill (Libertarian)
2         1985   Bob (Liberal); Joe (Republican)
3         1987                Tom (Conservative)
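
For comparison, the paste plus aggregate combination that the question guessed at can be sketched in base R without any packages. This is only one possible way to do it, not something proposed in the thread; the helper column `Label` and the result name `res` are made up for illustration.

```r
# Example data from the question
df1 <- data.frame(
  Name = c("Bob", "Joe", "Sarah", "Bill", "Tom"),
  Party = c("Liberal", "Republican", "Green", "Libertarian", "Conservative"),
  FirstElected = c(1985, 1985, 1980, 1980, 1987),
  stringsAsFactors = FALSE
)

# Build the "Name (Party)" label for every row first (hypothetical helper column).
df1$Label <- sprintf("%s (%s)", df1$Name, df1$Party)

# aggregate passes extra arguments (here collapse) on to FUN,
# so paste joins the labels within each FirstElected group.
res <- aggregate(Label ~ FirstElected, data = df1, FUN = paste, collapse = "; ")
names(res) <- c("Year", "PeopleElected")

print(res)
```

Note that aggregate returns the groups sorted by year, as the dplyr output above does, rather than in order of first appearance as in the data.table output.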