2017-02-02 43 views
2

Move certain data from the first column onto the last column of the row above

I have a data frame in the following format:

  c1 c2 c3
1  A  1  D
2  A  2  D
3  A  3  D
4  X  4  D
5  A  5  D
6  X  6  D
7  X  7  D
8  A  8  D

I need every row that has "X" in c1 to be merged into c3 of the row above it, like this:

  c1 c2      c3
1  A  1       D
2  A  2       D
3  A  3    DX4D
4  A  5 DX6DX7D
5  A  8       D

Any ideas?

+0

Are the 'c2' values unique as well? You could always use 'dplyr' to 'group_by' 'c1' and 'c2' and then paste the 'c3' column? Something along those lines? – Jenks

Answers

1

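Since you did not provide your data structure, it is unclear whether c3 is a factor or a string. Just in case, I convert it to a string before processing.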

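dat$c3 = as.character(dat$c3)   # make sure c3 is a string, not a factor
# walk from the last row upward, so deleting a row never shifts rows still to be visited
for(r in nrow(dat):2) { 
    if(dat[r,1] == "X") { 
        # append "X", c2 and c3 of this row to c3 of the row above, then drop this row
        dat[r-1,3] = paste(dat[r-1,3], "X", dat[r,2], dat[r,3], sep="") 
        dat = dat[-r,] 
    } 
} 
dat 
#   c1 c2      c3
# 1  A  1       D
# 2  A  2       D
# 3  A  3    DX4D
# 5  A  5 DX6DX7D
# 8  A  8       D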
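
The conversion to character matters when c3 happens to be a factor: assigning a string that is not an existing level stores NA instead of the pasted value. A minimal sketch, assuming a hypothetical two-row data frame (the names `dat_factor` and `dat_char` are illustrative and not from the question):

```r
# Hypothetical data: c3 is explicitly stored as a factor
# (read.table and data.frame did this by default before R 4.0).
dat_factor <- data.frame(c1 = c("A", "X"), c2 = 1:2, c3 = factor(c("D", "D")))

# "DX2D" is not an existing level, so the assignment yields NA (with a warning).
suppressWarnings(dat_factor[1, 3] <- "DX2D")
is.na(dat_factor$c3[1])
# [1] TRUE

# After converting to character, the same kind of assignment stores the string.
dat_char <- data.frame(c1 = c("A", "X"), c2 = 1:2, c3 = factor(c("D", "D")))
dat_char$c3 <- as.character(dat_char$c3)
dat_char[1, 3] <- paste(dat_char[1, 3], "X", dat_char[2, 2], dat_char[2, 3], sep = "")
dat_char$c3[1]
# [1] "DX2D"
```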
1
df <- read.table(text = " c1 c2 c3 
1 A 1 D 
2 A 2 D 
3 A 3 D 
4 X 4 D 
5 A 5 D 
6 X 6 D 
7 X 7 D 
8 A 8 D", stringsAsFactors = FALSE) 

desired_output <- read.table(text = " c1 c2 c3 
1 A 1 D 
2 A 2 D 
3 A 3 DX4D 
4 A 5 DX6DX7D 
5 A 8 D", stringsAsFactors = FALSE) 
rownames(desired_output) <- NULL 

library(dplyr) 
output <- 
df %>% 
    mutate(to_paste = ifelse(c1 == "X", paste0(c1, c2, c3), c3)) %>% 
    group_by(grp = cumsum(c1 == "A")) %>% 
    summarise(c1 = first(c1), c2 = first(c2), c3 = paste0(to_paste, collapse = "")) %>% 
    select(- grp) %>% 
    as.data.frame() 

identical(output, desired_output) 
# [1] TRUE 
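
The key step in the pipeline above is the grouping variable `cumsum(c1 == "A")`: every "A" row starts a new group, and the "X" rows that follow it inherit the same group number. The sketch below shows that idea in base R only, as an illustration rather than a replacement for the dplyr code; `merged` is a hypothetical name, and the vectors simply mirror the question's data.

```r
# Columns of the question's data frame, written out as plain vectors.
c1 <- c("A", "A", "A", "X", "A", "X", "X", "A")
c2 <- 1:8
c3 <- rep("D", 8)

# Each "A" increments the counter; "X" rows keep the counter of the "A" above.
grp <- cumsum(c1 == "A")
grp
# [1] 1 2 3 3 4 4 4 5

# "X" rows contribute c1, c2 and c3 pasted together; "A" rows contribute only c3.
to_paste <- ifelse(c1 == "X", paste0(c1, c2, c3), c3)

# Collapse the pieces within each group into one string.
merged <- unname(vapply(split(to_paste, grp), paste0, character(1), collapse = ""))
merged
# [1] "D"       "D"       "DX4D"    "DX6DX7D" "D"
```

This relies on the first row being an "A" row, which holds for the data shown in the question.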
1

Although this has already been answered, I would like to explain my step-by-step approach.

For this I am using different data:

# c1 c2 c3 
# A 1 D 
# X 2 D 
# A 3 D 
# X 4 D 
# A 5 D 
# X 6 D 
# X 7 D 
# X 8 D 

y = which(df1$c1=="X")  # indices of the rows that have "X" in c1
z = cumsum(c(0,diff(y))!=1) # run id: consecutive "X" rows share the same id

# for each run of consecutive rows, paste all the column values together
str <- sapply(unique(z), function(i) paste0(unlist(t(df1[y[z == i], ])),collapse = ""))

# rows occurring just before each run of X's (first index of each run, minus 1)
prev = y[!duplicated(z)]-1

# append the "pasted together" string to c3 of the rows just prior to the X's
df1$c3[prev] = paste(df1$c3[prev],str,sep="")

# subset to keep only the non-X rows
df1[df1$c1!="X",]

# c1 c2   c3 
#1: A 1  DX2D 
#2: A 3  DX4D 
#3: A 5 DX6DX7DX8D 
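
The answer shows `df1` only as a printed table, so the construction below is an assumption (character columns, no factors). With that in place, the intermediate vectors can be inspected directly, which makes it easier to see how runs of consecutive "X" rows are identified:

```r
# Assumed construction of df1; the answer does not show how it was created.
df1 <- data.frame(c1 = c("A", "X", "A", "X", "A", "X", "X", "X"),
                  c2 = 1:8,
                  c3 = rep("D", 8),
                  stringsAsFactors = FALSE)

# Positions of the "X" rows.
y <- which(df1$c1 == "X")
y
# [1] 2 4 6 7 8

# A gap other than 1 between neighbouring positions starts a new run.
z <- cumsum(c(0, diff(y)) != 1)
z
# [1] 1 2 3 3 3

# First "X" row of each run, minus one: the row that receives the pasted text.
y[!duplicated(z)] - 1
# [1] 1 3 5
```

Rows 6, 7 and 8 share run id 3, so all three are pasted onto row 5.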