is.element on a list column in a data frame

I have a data frame in which one column contains some list elements. I want to find out which rows of the data frame contain a given keyword in that column.

The data frame, df, looks something like this:

idstr    tag 
1     wl 
2   other.to 
3   other.from 
4 c("wl","other.to") 
5     wl 
6   other.wl 
7 c("ll","other.to")

The goal is to assign every row that has 'wl' in its tag to a new data frame. In this example, I want a new data frame that looks like this:

idstr tag 
1  wl 
4  c("wl","other.to") 
5  wl

I tried something like this:

df_wl <- df[which(is.element('wl', df$tag)), ]

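But this only returns the first row of the data frame (whether or not it contains 'wl'). I think the trouble lies in iterating over the rows and applying the is.element function. Here are two calls to the function, with their results: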

is.element('wl',df$tag[[4]]) > TRUE 
is.element('wl',df$tag[4]) > FALSE
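The difference between the two calls comes down to `[[` versus `[` on a list. The following is a minimal sketch, assuming `tag` is a true list column (the dput further down stores it as a factor, so the data here is rebuilt by hand for illustration). It also shows why the attempt above selects only row 1: `is.element('wl', df$tag)` returns a single logical value, so `which()` yields 1.

```r
# Hypothetical data: `tag` is built as a true list column (an assumption;
# the dput in the question stores it as a factor instead).
df <- data.frame(idstr = 1:7)
df$tag <- list("wl", "other.to", "other.from", c("wl", "other.to"),
               "wl", "other.wl", c("ll", "other.to"))

# `[[` extracts the character vector stored in row 4 ...
inner <- df$tag[[4]]
# ... while `[` keeps the enclosing list of length one.
outer <- df$tag[4]

is.element("wl", inner)   # TRUE
is.element("wl", outer)   # FALSE

# Apply the test to each row's element, then subset with the logical vector.
has_wl <- vapply(df$tag, function(x) is.element("wl", x), logical(1))
df_wl <- df[has_wl, ]
df_wl$idstr               # 1 4 5
```

`vapply` is used here rather than `sapply` so that the result is guaranteed to be a logical vector with one value per row.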

How would you suggest I iterate over the data frame so that df_wl is assigned the correct rows?

PS: here is the dput:

structure(list(idstr = 1:7, tag = structure(c(6L, 5L, 4L, 2L, 6L, 3L, 1L), .Label =  c("c(\"ll\",\"other.to\")", "c(\"wl\",\"other.to\")", "other.wl", "other.from", "other.to", "wl"), class = "factor")), .Names = c("idstr", "tag"), row.names = c(NA, -7L), class = "data.frame")

Source

2014-10-27 zebrainatree

What about `df[sapply(df$tag, function(x) any(x == "wl")), ]`? – 2014-10-27 19:58:15

Have you tried using `grep`? – tcash21 2014-10-27 19:59:19

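Thanks, Richard. It works for this small example, but when I applied it to my main dataset it returned a data frame with "NA" values for every element. I think `any(x == "wl")` works, because the new data frame looks like the right size, so the problem may now be in how the data is returned – zebrainatree 2014-10-27 20:15:05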

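Based on your dput data, this might work. The regular expression `(^wl$)|(\"wl\")` matches either a value that is exactly wl from start to end, or any occurrence of "wl" enclosed in double quotes.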

df[grepl("(^wl$)|(\"wl\")", df$tag),] 
# idstr    tag 
# 1  1     wl 
# 4  4 c("wl","other.to") 
# 5  5     wl
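To make the call above reproducible, here is a small sketch that rebuilds the question's data by hand (same values as the dput, although the factor level order differs) and applies the same pattern. The names `pattern` and `hits` are introduced only for illustration.

```r
# Data rebuilt from the dput in the question; `tag` is a factor here.
df <- data.frame(
  idstr = 1:7,
  tag = factor(c("wl", "other.to", "other.from", "c(\"wl\",\"other.to\")",
                 "wl", "other.wl", "c(\"ll\",\"other.to\")"))
)

# Either the whole value is exactly wl, or "wl" appears in double quotes
# inside a deparsed c(...) string.
pattern <- "(^wl$)|(\"wl\")"
hits <- grepl(pattern, df$tag)

df_wl <- df[hits, ]
df_wl$idstr                  # 1 4 5

# "other.wl" is not matched: it is neither exactly wl nor a quoted "wl".
grepl(pattern, "other.wl")   # FALSE
```

Anchoring the first alternative with `^` and `$` is what keeps values such as other.wl out of the result.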

Source

2014-10-27 20:03:31

Why not `df[grepl("^wl$|'wl'", df$tag), ]`? The first part of the regular expression matches "wl" on its own, and the second part looks for "wl" in single quotes. – 2014-10-27 20:19:49

SO CLOSE! I guess I didn't provide all the edge cases. Some lists may not contain wl at all, like `c("ll","other.from")`. Here is another dput: `structure(list(idstr = 1:7, tag = structure(c(6L, 5L, 4L, 2L, 6L, 3L, 1L), .Label = c("c(\"ll\",\"other.to\")", "c(\"wl\",\"other.to\")", "ll", "other.from", "other.to", "wl"), class = "factor")), .Names = c("idstr", "tag"), row.names = c(NA, -7L), class = "data.frame")` – zebrainatree 2014-10-27 20:20:03

@BrianDiggs - I was actually about to post that exact regex – 2014-10-27 20:21:22
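For the edge case raised in the comments, the answer's pattern can be checked against the second dput. This is only a sketch: the data frame `df2` is rebuilt by hand from that dput's values, and the name is hypothetical.

```r
# Values taken from the second dput posted in the comments; `tag` is a factor.
df2 <- data.frame(
  idstr = 1:7,
  tag = factor(c("wl", "other.to", "other.from", "c(\"wl\",\"other.to\")",
                 "wl", "ll", "c(\"ll\",\"other.to\")"))
)

# Same pattern as in the answer: exactly wl, or a double-quoted "wl".
hits2 <- grepl("(^wl$)|(\"wl\")", df2$tag)

# Rows holding ll or c("ll","other.to") are left out.
df2[hits2, "idstr"]          # 1 4 5
```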
