Here is a simple approach using a regular expression:
# extract instances (in a list)
strings <- regmatches(testdf$string,
                      gregexpr("(?<=\")[^\"]+?(?=\"[,)])",
                               testdf$string, perl = TRUE))
strings
[[1]]
[1] "Kaskazini 'A'" "Kaskazini 'B'"
[[2]]
[1] "Kabale" "Kabare"
[[3]]
[1] "Kisoko" "Kisoro Tc"
[[4]]
[1] "Luwero East" "Luwero West"
[[5]]
[1] "Marindi" "Malindi"
[[6]]
[1] "Mukongoro" "Mukono Tc" "Muko"
# add columns to `testdf`
testdf$first <- sapply(strings, "[", 1)
testdf$second <- sapply(strings, "[", 2)
testdf$third <- sapply(strings, "[", 3)
                               string         first        second third
1 c("Kaskazini 'A'", "Kaskazini 'B'") Kaskazini 'A' Kaskazini 'B'  <NA>
2               c("Kabale", "Kabare")        Kabale        Kabare  <NA>
3            c("Kisoko", "Kisoro Tc")        Kisoko     Kisoro Tc  <NA>
4     c("Luwero East", "Luwero West")   Luwero East   Luwero West  <NA>
5             c("Marindi", "Malindi")       Marindi       Malindi  <NA>
6 c("Mukongoro", "Mukono Tc", "Muko")     Mukongoro     Mukono Tc  Muko
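For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The two-row `testdf` is an assumption: it is rebuilt from the printed output above as a plain character column, and the question's real column may be a factor. The `pattern`, `first`, and `third` names are only for illustration.

```r
# Hypothetical reconstruction of two rows of `testdf`, copied from the
# printed output above (character column assumed, not a factor)
testdf <- data.frame(
  string = c('c("Kabale", "Kabare")',
             'c("Mukongoro", "Mukono Tc", "Muko")'),
  stringsAsFactors = FALSE
)

# Text after a double quote, up to a closing quote that is followed by
# a comma or a closing parenthesis
pattern <- "(?<=\")[^\"]+?(?=\"[,)])"

# One character vector of matches per row
strings <- regmatches(testdf$string,
                      gregexpr(pattern, testdf$string, perl = TRUE))

# Indexing past the end of a vector yields NA, so short rows are padded
first <- sapply(strings, "[", 1)
third <- sapply(strings, "[", 3)
```

Here `first` holds the first name of each row, and `third` is `NA` for the row that has only two names.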
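If you don't want to create every column by hand, or you don't know the maximum number of instances in advance, you can use the following: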
res <- sapply(seq(max(sapply(strings, length))), function(x)
sapply(strings, "[", x))
cbind(testdf, res)
                               string             1             2    3
1 c("Kaskazini 'A'", "Kaskazini 'B'") Kaskazini 'A' Kaskazini 'B' <NA>
2               c("Kabale", "Kabare")        Kabale        Kabare <NA>
3            c("Kisoko", "Kisoro Tc")        Kisoko     Kisoro Tc <NA>
4     c("Luwero East", "Luwero West")   Luwero East   Luwero West <NA>
5             c("Marindi", "Malindi")       Marindi       Malindi <NA>
6 c("Mukongoro", "Mukono Tc", "Muko")     Mukongoro     Mukono Tc Muko
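The columns above end up named 1, 2, 3. If more descriptive names are wanted, one possible variation (a sketch, not part of the original answer) is to pad every vector to the same length in a single pass and then set the column names. The `strings` list here is a hand-written stand-in for the `regmatches()` result, and the `name` prefix is a made-up choice.

```r
# Stand-in for the list returned by regmatches() (values assumed from
# the output above)
strings <- list(c("Kabale", "Kabare"),
                c("Mukongoro", "Mukono Tc", "Muko"))

# Largest number of names found in any row
n <- max(lengths(strings))

# Indexing each vector with 1:n pads short ones with NA; sapply() returns
# one column per row of the data, so transpose to get one row per row
res <- t(sapply(strings, "[", seq_len(n)))

# Hypothetical column names: name1, name2, ...
colnames(res) <- paste0("name", seq_len(n))
```

`cbind(testdf, res)` would then attach the columns under those names, provided `testdf` has the same number of rows as `strings`.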
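Could you clarify the expected output? – Zbynek
`lapply(lapply(as.character(testdf$string), function(x) eval(parse(text = x))), "[", c(1, 2))` gives the first and second *instances* with your example data. – lukeA
The expected output is a new vector/column containing only the names from the vectors in the column. – spesseh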