Separate comma-separated cells into new rows

Hi, I have a table with comma-separated columns, and I need to convert the comma-separated values into new rows. For example, the given table is

Name  Start  End 
A  1,2,3 4,5,6 
B   1,2  4,5 
C  1,2,3,4 6,7,8,9

I need to convert it to something like

Name Start End 
    A  1 4 
    A  2 5 
    A  3 6 
    B  1 4 
    B  2 5 
    C  1 6 
    C  2 7 
    C  3 8 
    C  4 9

I can do this with a VB script, but I need to do it in R. Can anyone help me solve this?

Source

2011-02-09 Jana

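This question belongs on an R language site, not here. – whuber

Read the file with read.csv, write it out with write.table, and use melt to convert it to the appropriate format. This question is similar to your recent one, and it really does not belong here. Please ask on stackoverflow. – mpiktas

Oh, OK. I am new to R and thought CV was an R community. Will follow that next time :) – Jana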

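Here is an approach that should work for you. I am assuming your three input vectors are in different objects. We will create a list of those inputs and write a function that processes each object, then use plyr to return the results as a single data.frame.

The things to note here are splitting each character vector into its component parts and then using as.numeric to convert the numbers, which are still in character form after the split. Since R fills matrices by column, we define a 2-column matrix and let R fill in the values for us. We then retrieve the name and put everything together in a data.frame. plyr is nice enough to process the list and combine the pieces into one data.frame automatically.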
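To see the column-wise fill in isolation, here is a minimal sketch. The vector `vals` is a made-up stand-in for the numeric part of one split input string; it is not taken from the answer itself.

```r
# Numeric part of "A,1,2,3,4,5,6" after strsplit() and dropping the name
vals <- as.numeric(c("1", "2", "3", "4", "5", "6"))

# R fills matrices column by column, so with ncol = 2 the first half of the
# values lands in column 1 (Start) and the second half in column 2 (End)
m <- matrix(vals, ncol = 2)
print(m)
```

This relies on each input listing all Start values before all End values, with the same number of each.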

library(plyr) 

a <- paste("A",1, 2,3,4,5,6, sep = ",", collapse = "") 
b <- paste("B",1, 2,4,5, sep = ",", collapse = "") 
c <- paste("C",1, 2,3,4,6,7,8,9, sep = ",", collapse = "") 

input <- list(a,b,c) 

splitter <- function(x) { 
    x <- unlist(strsplit(x, ",")) 
    out <- data.frame(x[1], matrix(as.numeric(x[-1]), ncol = 2)) 
    colnames(out) <- c("Name", "Start", "End") 
    return(out) 
} 


ldply(input, splitter)

And the output:

> ldply(input, splitter) 
Name Start End 
1 A  1 4 
2 A  2 5 
3 A  3 6 
4 B  1 4 
5 B  2 5 
6 C  1 6 
7 C  2 7 
8 C  3 8 
9 C  4 9

Source

2011-02-09 19:26:27 Chase

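You might want to ask this question on SO, as it does not deal with statistics :)

Anyway, I came up with a rather complicated and ugly solution that might work for you: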

# load your data 
x <- structure(list(Name = c("A", "B", "C"), Start = c("1,2,3", "1,2", 
"1,2,3,4"), End = c("4,5,6", "4,5", "6,7,8,9")), .Names = c("Name", 
"Start", "End"), row.names = c(NA, -3L), class = "data.frame")

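This is how it looks in R:

> x 
  Name   Start     End 
1    A   1,2,3   4,5,6 
2    B     1,2     4,5 
3    C 1,2,3,4 6,7,8,9

Transform the data with the help of strsplit calls:

data <- data.frame(cbind(
    # repeat each name once per value in its Start cell 
    rep(x$Name,as.numeric(lapply(strsplit(x$Start,","), length))), 
    # flatten the split Start and End values into long vectors 
    unlist(lapply(strsplit(x$Start,","), cbind)), 
    unlist(lapply(strsplit(x$End,","), cbind)) 
    ))

Name the columns of the new data frame:

names(data) <- c("Name", "Start", "End")

The result looks like: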

> data 
    Name Start End 
1 A  1 4 
2 A  2 5 
3 A  3 6 
4 B  1 4 
5 B  2 5 
6 C  1 6 
7 C  2 7 
8 C  3 8 
9 C  4 9
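For comparison, the same reshaping can probably be written more compactly in base R. This is a sketch and not part of the answer above: it assumes R 3.2.0 or later for lengths(), and the names `starts`, `ends`, and `long` are made up for illustration.

```r
# The question's table, with character (not factor) columns
x <- data.frame(
  Name  = c("A", "B", "C"),
  Start = c("1,2,3", "1,2", "1,2,3,4"),
  End   = c("4,5,6", "4,5", "6,7,8,9"),
  stringsAsFactors = FALSE
)

# Split every cell on commas; each result is a list with one vector per row
starts <- strsplit(x$Start, ",", fixed = TRUE)
ends   <- strsplit(x$End, ",", fixed = TRUE)

# Guard against rows whose Start and End cells hold different numbers of values
stopifnot(identical(lengths(starts), lengths(ends)))

# Repeat each name once per value and flatten the lists into long columns
long <- data.frame(
  Name  = rep(x$Name, lengths(starts)),
  Start = as.integer(unlist(starts)),
  End   = as.integer(unlist(ends))
)
print(long)
```

Unlike the cbind() version, this keeps Start and End as integer columns because the pieces are never combined into a single character matrix.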

Source

2011-02-09 19:29:49 daroczig

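"I came up with a rather complicated and ugly solution that might work for you" Thank you for making me smile. ;-) –

@Joshua Ulrich: I am glad my answer had a good effect :) – daroczig

Here is another one, just for fun. Take d as the original data.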
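The answer does not show how d is built. A plausible setup, assuming d is simply the question's table with character rather than factor columns (strsplit() needs character input), would be:

```r
# Assumed definition of d: the table from the question
d <- data.frame(
  Name  = c("A", "B", "C"),
  Start = c("1,2,3", "1,2", "1,2,3,4"),
  End   = c("4,5,6", "4,5", "6,7,8,9"),
  stringsAsFactors = FALSE  # keep the columns as character vectors
)
```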

f <- function(x, ul = TRUE) 
{ 
    x <- deparse(substitute(x)) 
    if(ul) unlist(strsplit(d[[x]], ',')) 
    else strsplit(d[[x]], ',') 
} 

data.frame(Name = rep(d$Name, sapply(f(End, FALSE), length)), 
      Start = f(Start), End = f(End)) 
# Name Start End 
# 1 A  1 4 
# 2 A  2 5 
# 3 A  3 6 
# 4 B  1 4 
# 5 B  2 5 
# 6 C  1 6 
# 7 C  2 7 
# 8 C  3 8 
# 9 C  4 9

Source

2014-05-23 05:19:41

The separate_rows() function in tidyr is the boss for observations with multiple delimited values...

# create data 
library(tidyverse) 
d <- tibble(
    Name = c("A", "B", "C"), 
    Start = c("1,2,3", "1,2", "1,2,3,4"), 
    End = c("4,5,6", "4,5", "6,7,8,9") 
) 
d 
# # A tibble: 3 x 3 
# Name Start  End 
# <chr> <chr> <chr> 
# 1  A 1,2,3 4,5,6 
# 2  B  1,2  4,5 
# 3  C 1,2,3,4 6,7,8,9 

# tidy data 
separate_rows(d, Start, End) 
# # A tibble: 9 x 3 
# Name Start End 
# <chr> <chr> <chr> 
# 1  A  1  4 
# 2  A  2  5 
# 3  A  3  6 
# 4  B  1  4 
# 5  B  2  5 
# 6  C  1  6 
# 7  C  2  7 
# 8  C  3  8 
# 9  C  4  9 

# set convert = TRUE to get integer columns 
separate_rows(d, Start, End, convert = TRUE) 
# # A tibble: 9 x 3 
# Name Start End 
# <chr> <int> <int> 
# 1  A  1  4 
# 2  A  2  5 
# 3  A  3  6 
# 4  B  1  4 
# 5  B  2  5 
# 6  C  1  6 
# 7  C  2  7 
# 8  C  3  8 
# 9  C  4  9
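In tidyr 1.3.0 and later, separate_rows() is marked as superseded in favour of separate_longer_delim(). The following is a sketch of the equivalent call, not part of the original answer; it assumes tidyr >= 1.3.0 is installed, and the name `long` is made up for illustration.

```r
library(tidyr)

# The question's table, with character columns
d <- data.frame(
  Name  = c("A", "B", "C"),
  Start = c("1,2,3", "1,2", "1,2,3,4"),
  End   = c("4,5,6", "4,5", "6,7,8,9"),
  stringsAsFactors = FALSE
)

# Split Start and End on commas in parallel, producing one row per value
long <- separate_longer_delim(d, c(Start, End), delim = ",")

# The split columns are still character, so convert them explicitly
long$Start <- as.integer(long$Start)
long$End   <- as.integer(long$End)
print(long)
```

The delimiter has to be given explicitly here, and the type conversion is a separate step.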

Source

2017-06-15 07:00:25 gjabel
