dplyr, R: Counting a specific value in multiple columns at once

I have a data frame:
md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1), c = c(1,3,4,3,5,5),
device = c(1,1,2,2,3,3))
myvars = c("a", "b", "c")
md[2,3] <- NA
md[4,1] <- NA
md
I want to count the number of 5s in each column, by device. I can do it like this:
library(dplyr)
group_by(md, device) %>%
summarise(counts.a = sum(a==5, na.rm = T),
counts.b = sum(b==5, na.rm = T),
counts.c = sum(c==5, na.rm = T))
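However, in real life I will have tons of variables (the length of myvars can be very large), so I can't specify counts.a, counts.b, etc. manually dozens of times.
Does dplyr allow running the count of 5s on all myvars columns at once?
Thanks!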
See '?summarise_each' and http://stackoverflow.com/questions/21644848/summarizing-multiple-columns-with-dplyr?rq=1
I'm not sure how to get the names in there, but this works: 'md %>% group_by(device) %>% summarise_each(funs(counts = sum(. == 5, na.rm = TRUE)))' – Frank
@Frank Maybe 'md %>% group_by(device) %>% select_(.dots = myvars) %>% summarise_each(funs(counts = sum(. == 5, na.rm = TRUE)))' or 'md %>% group_by(device) %>% summarise_each_(funs(count = sum(. == 5, na.rm = TRUE)), myvars)' – akrun
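
A note for readers on a current dplyr: summarise_each(), summarise_each_(), select_() and funs() have since been deprecated, so the comments above may produce warnings or fail on recent versions. Below is a minimal sketch of the same idea using across(), assuming dplyr 1.0.0 or later. The result name counts and the "counts.{.col}" naming pattern are my own choices to mirror the manual version in the question, not something the commenters proposed.

```r
library(dplyr)

# Same example data as in the question
md <- data.frame(a = c(3,5,4,5,3,5), b = c(5,5,5,4,4,1), c = c(1,3,4,3,5,5),
                 device = c(1,1,2,2,3,3))
myvars <- c("a", "b", "c")
md[2,3] <- NA
md[4,1] <- NA

# Count the 5s in every column named in myvars, per device.
# all_of() selects columns from a character vector;
# .names controls the output column names (assumed pattern: counts.<column>).
counts <- md %>%
  group_by(device) %>%
  summarise(across(all_of(myvars),
                   ~ sum(.x == 5, na.rm = TRUE),
                   .names = "counts.{.col}"))

counts
```

With the data above this should give one row per device and the columns device, counts.a, counts.b and counts.c, matching what the hand-written summarise() call in the question returns. Because the column list comes from myvars, it scales to any number of variables without further changes.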