Move certain data from the first column into the last column of the row above
I have a data frame in the following format:
c1 c2 c3
1 A 1 D
2 A 2 D
3 A 3 D
4 X 4 D
5 A 5 D
6 X 6 D
7 X 7 D
8 A 8 D
I need every row that has "X" in c1 merged into c3 of the row above it, like this:
c1 c2 c3
1 A 1 D
2 A 2 D
3 A 3 DX4D
4 A 5 DX6DX7D
5 A 8 D
Any ideas?
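Since you didn't provide the structure of your data, it's unclear whether c3 is a factor or a string. Just in case, I convert it to a string before processing.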
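# make sure c3 is character rather than factor before pasting into it
dat$c3 = as.character(dat$c3)
# walk from the last row upwards so that dropping a row never shifts rows still to be visited
for(r in nrow(dat):2) {
  if(dat[r,1] == "X") {
    # append "X", c2 and c3 of this row to c3 of the row above, then drop this row
    dat[r-1,3] = paste(dat[r-1,3], "X", dat[r,2], dat[r,3], sep="")
    dat = dat[-r,]
  }
}
dat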
c1 c2 c3
1 A 1 D
2 A 2 D
3 A 3 DX4D
5 A 5 DX6DX7D
8 A 8 D
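The same merge can also be written without a loop. The following is only a minimal base-R sketch, not part of the answer above: it rebuilds the question's data by hand, and the names `piece`, `grp` and `res` are made up for illustration. It assumes the first row is never an "X" row, since such a row would have no row above to merge into.

```r
# The question's data, rebuilt by hand (c3 kept as character).
dat <- data.frame(
  c1 = c("A", "A", "A", "X", "A", "X", "X", "A"),
  c2 = 1:8,
  c3 = "D",
  stringsAsFactors = FALSE
)

# Text each row contributes: "X" rows contribute c1, c2 and c3 pasted together,
# every other row contributes only its c3.
piece <- ifelse(dat$c1 == "X", paste0(dat$c1, dat$c2, dat$c3), dat$c3)

# Group id that increases at every non-"X" row, so each "X" row
# falls into the group of the nearest non-"X" row above it.
grp <- cumsum(dat$c1 != "X")

# Keep the non-"X" rows and replace c3 with the concatenated pieces of each group.
res <- dat[dat$c1 != "X", ]
res$c3 <- unname(vapply(split(piece, grp), paste0, character(1), collapse = ""))
rownames(res) <- NULL
res
```

Unlike the loop, this version does not depend on the order in which rows are removed, but it does rely on the assumption about the first row stated above.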
df <- read.table(text = " c1 c2 c3
1 A 1 D
2 A 2 D
3 A 3 D
4 X 4 D
5 A 5 D
6 X 6 D
7 X 7 D
8 A 8 D", stringsAsFactors = FALSE)
desired_output <- read.table(text = " c1 c2 c3
1 A 1 D
2 A 2 D
3 A 3 DX4D
4 A 5 DX6DX7D
5 A 8 D", stringsAsFactors = FALSE)
rownames(desired_output) <- NULL
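library(dplyr)
output <-
  df %>%
  # "X" rows contribute c1, c2 and c3 pasted together; other rows only their c3
  mutate(to_paste = ifelse(c1 == "X", paste0(c1, c2, c3), c3)) %>%
  # a new group starts at every "A" row, so "X" rows join the "A" row above them
  group_by(grp = cumsum(c1 == "A")) %>%
  summarise(c1 = first(c1), c2 = first(c2), c3 = paste0(to_paste, collapse = "")) %>%
  select(- grp) %>%
  as.data.frame()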
identical(output, desired_output)
# [1] TRUE
Although this has already been answered, I'd like to explain my step-by-step approach.
Here I am using different data:
# c1 c2 c3
# A 1 D
# X 2 D
# A 3 D
# X 4 D
# A 5 D
# X 6 D
# X 7 D
# X 8 D
y = which(df1$c1=="X") # rows that have "X" in c1
z = cumsum(c(0,diff(y))!=1) # run id: consecutive "X" rows share the same id
# for each run of consecutive rows, paste the data of all columns together
str <- sapply(unique(z), function(i) paste0(unlist(t(df1[y[z == i], ])),collapse = ""))
# the row just before the first "X" of each run
prev = y[!duplicated(z)]-1
# append the "pasted together" string to c3 of the rows just prior to the X's
df1$c3[prev] = paste(df1$c3[prev],str,sep="")
# subset to have only non-X's rows
df1[df1$c1!="X",]
# c1 c2 c3
#1: A 1 DX2D
#2: A 3 DX4D
#3: A 5 DX6DX7DX8D
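The run-labelling step in the code above can be checked in isolation. This is only a small sketch: `y` is typed in by hand to match the "X" rows of the example data, and `run_id` is a name chosen here for illustration.

```r
# Row positions of the "X" rows in the example data above.
y <- c(2, 4, 6, 7, 8)

# diff(y) is 1 exactly where an "X" row directly follows another "X" row,
# so every value other than 1 (including the leading 0) starts a new run.
run_id <- cumsum(c(0, diff(y)) != 1)
run_id

# Row positions grouped by run.
split(y, run_id)
```

Rows 6, 7 and 8 share one run id, which is why they end up pasted into a single string.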
Are the 'c2' values unique as well? Couldn't you use 'dplyr' to 'group_by' 'c1' and 'c2' and then paste the 'c3' column? Something along those lines? – Jenks