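Split text vectors with strsplit(...) in R
Please help me with my small project.
There is a big list of text elements. Each element should be split into a small list of sentences. Each small list should be saved as a single element, like the original text element, in a new column of the initial big list at the same position ("row").
The split criteria are "/$", "und/KON", and "oder/KON". Each matched delimiter should be kept at the head of the new small element that follows it.
I tried a regular expression like "/$|und/KON|oder/KON" and many combinations of escaping "$", "|", and "/". I also tried varying the parameters perl = TRUE and fixed = TRUE / FALSE. Nothing happens each time I try; it seems that | is not interpreted correctly. How would you suggest solving this?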
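A minimal sketch of why the attempts described above probably appear to do nothing. The test string `x` and the variable names are made up for illustration; only base R's `strsplit()` is used. With `fixed = TRUE` the whole pattern is treated as one literal string, so `|` is not an alternation. Without it, an unescaped `$` is an end-of-string anchor rather than a literal dollar sign.

```r
x <- "Element eins und/KON Element zwei ./$."

# fixed = TRUE: the pattern is one literal string, so "|" is not alternation
# and the text comes back unsplit.
literal <- strsplit(x, "/$|und/KON|oder/KON", fixed = TRUE)[[1]]

# Regex with a bare "$": "/$" means "a slash at the end of the string",
# so only "und/KON" matches in this example.
unescaped <- strsplit(x, "/$|und/KON|oder/KON")[[1]]

# Regex with "\\$": all delimiters present in x match,
# but strsplit() drops the matched text.
escaped <- strsplit(x, "/\\$|und/KON|oder/KON")[[1]]

print(literal)
print(unescaped)
print(escaped)
```

Even the escaped version removes the delimiters, so it does not by itself keep them at the head of the following piece.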
library(stringr) # don't know if it's required
# Input list to be split at each
# "/$", "und/KON", "oder/KON"
# but should keep the expression at the start of the next list element
#
# Would be nice but not necessary: The small-list to be named after the ID in the first column
r <- list(ID=c(01, 02, 03),
elements=c("This should become my first small-list :/$. the first element ,/$, the second element ,/$, and the third element ./$.",
"This should become my second small-list :/$. Element eins und/KON Element zwei oder/KON Element drei ./$.",
"This should become my third small-list :/$. Element Alpha und/KON Element Beta oder/KON Element Gamma ./$."))
# Would look something like
r$small_lists <- sapply(r$elements, function(x) as.list(strsplit(x, "/$|und/KON|oder/KON", fixed=TRUE)))
> r$small_lists
$01
[1] "This should become my first small-list "
[2] ":/$. the first element "
[3] ",/$, the second element "
[4] ",/$, and the third element "
[5] "./$."
$02
[1] "This should become my second small-list "
[2] ":/$. Element eins "
[3] "und/KON Element zwei "
[4] "oder/KON Element drei"
[5] "./$."
$03
[1] "This should become my third small-list "
[2] ":/$. Element Alpha "
[3] "und/KON Element Beta "
[4] "oder/KON Element Gamma "
[5] "./$."
> class(r)
[1] "list"
> class(r$small_lists)
[1] "list"
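One possible way to get the structure shown above, as a sketch rather than a definitive answer. It first inserts a marker in front of every delimiter with `gsub()` and then splits on that marker literally, so the delimiter stays at the head of the next piece. The helper name `split_keep` and the marker `"<SPLIT>"` are hypothetical, and the sketch assumes the marker never occurs in the text. The `\\S` before `/\\$` is also an assumption, based on the desired output keeping the punctuation character in front of `/$`.

```r
r <- list(
  ID = c(1, 2, 3),
  elements = c(
    "This should become my first small-list :/$. the first element ,/$, the second element ,/$, and the third element ./$.",
    "This should become my second small-list :/$. Element eins und/KON Element zwei oder/KON Element drei ./$.",
    "This should become my third small-list :/$. Element Alpha und/KON Element Beta oder/KON Element Gamma ./$."
  )
)

# Hypothetical helper: put a marker before each delimiter, then split on it.
split_keep <- function(x, marker = "<SPLIT>") {
  # "\\1" re-inserts the matched delimiter right after the marker.
  marked <- gsub("(\\S/\\$|und/KON|oder/KON)", paste0(marker, "\\1"), x, perl = TRUE)
  # fixed = TRUE: the marker is matched literally, not as a regex.
  strsplit(marked, marker, fixed = TRUE)
}

r$small_lists <- split_keep(r$elements)
# Name each small list after its ID, zero-padded to two digits.
names(r$small_lists) <- sprintf("%02d", as.integer(r$ID))

print(r$small_lists[["02"]])
```

Splitting on a lookahead such as `(?=und/KON)` with `perl = TRUE` is another option, but `strsplit()` treats zero-length matches in a special way, which is why the marker approach is used here.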
I don't see a question in here. – A5C1D2H2I1M1N2O1R2T1
@AnandaMahto: Sorry, thanks, done :) – alex
Thanks! :) For my better understanding, could you explain how "&^\\1" and "^&*" work, respectively? – alex