2017-08-22

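R: Get the rows of a data frame that contain specific characters

I need to detect the rows of a df/tibble that contain a specific sequence of characters.

seq <- "RT @AventusSystems" is my sequence.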

df <- structure(list(text = c("@AventusSystems Wow, what a upgrade from help of investor", 
"RT @AventusSystems: A recent article about our investors as shown in Forbes! t.co/n8oGwiEDpu #Aventus #GlobalAdvisors #4thefans #Ti…", 
"@AventusSystems Very nice to have this project", "RT @AventusSystems: Join the #TicketRevolution with #Aventus today! #Aventus #TicketRevolution #AventCoin #4thefans t.co/OPlyCFmW4a" 
), Tweet_Id = c("898359464444559360", "898359342952439809", "898359326552633345", 
"898359268226736128"), created_at = structure(c(17396, 17396, 
17396, 17396), class = "Date")), .Names = c("text", "Tweet_Id", 
"created_at"), row.names = c(NA, -4L), class = c("tbl_df", "tbl", 
"data.frame")) 

select(df, contains(seq)) 
# A tibble: 4 x 0 

sapply(df$text, grepl, seq) returns only 4 FALSE.

What am I doing wrong? What is the correct solution? Thank you for your help.


Does `grep(seq, df$text)` work for you? – csgroen


Or, if you want the data frame rows that contain those characters, use `filter(df, grepl(seq, text))` –


@csgroen Yes, it does. TY – gabx

Answers


First, grepl is already vectorized over its argument x, so you don't need sapply. You can simply call grepl(seq, df$text).
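As a minimal sketch of the vectorized call, the snippet below uses a shortened stand-in for the question's data (the abbreviated tweets are an assumption made for brevity, not the original text). `fixed = TRUE` is an optional extra that treats the pattern as a literal string; it is not part of the original answer.

```r
# Shortened, hypothetical stand-in for the question's df
df <- data.frame(
  text = c("@AventusSystems Wow, what a upgrade",
           "RT @AventusSystems: A recent article about our investors",
           "@AventusSystems Very nice to have this project",
           "RT @AventusSystems: Join the #TicketRevolution"),
  stringsAsFactors = FALSE
)
seq <- "RT @AventusSystems"

# grepl() is vectorized over x: it returns one logical per element of df$text
hits <- grepl(seq, df$text, fixed = TRUE)
print(hits)

# The logical vector can index the rows directly in base R
matched <- df[hits, , drop = FALSE]
print(matched)
```

Only the second and fourth rows start with "RT @AventusSystems", so those are the rows kept.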

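The reason your code doesn't work is that sapply passes each element of its X argument to FUN as the first argument. With grepl, the first argument is the pattern, so you are searching for the patterns "@AventusSystems Wow, what a upgrade from help of investor", etc. inside your seq object, which is why every result is FALSE.

Finally, dplyr::select selects columns, whereas you want dplyr::filter, which filters rows.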
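The argument-order problem can be reproduced on a small scale. This is an illustrative sketch with two abbreviated, hypothetical tweets; the dplyr part runs only if the package is installed.

```r
texts <- c("@AventusSystems Very nice to have this project",
           "RT @AventusSystems: Join the #TicketRevolution")
seq <- "RT @AventusSystems"

# sapply(texts, grepl, seq) calls grepl(texts[i], seq):
# each tweet becomes the pattern and seq becomes the text being searched
swapped <- unname(sapply(texts, grepl, seq))
print(swapped)

# Naming the pattern argument lets each tweet fall into x instead
named <- unname(sapply(texts, grepl, pattern = seq))
print(named)

# filter() keeps rows; select() would only pick columns by name
if (requireNamespace("dplyr", quietly = TRUE)) {
  df <- data.frame(text = texts, stringsAsFactors = FALSE)
  kept <- dplyr::filter(df, grepl(seq, text))
  print(kept)
}
```

The sapply variant with a named pattern works, but it is still slower and noisier than the plain vectorized grepl(seq, df$text).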