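Simplifying nested mapply statements

I am trying to replace several separate mapply statements with one simple piece of code. I finally got it working with three nested mapply calls, but that seems like a convoluted approach. I am new to R and come from other languages, so I am looking for help thinking in the R mindset. If the three nested statements are the best way, I can accept that, but I would like input. If you have a better way to structure subsetted output like this, I'm all ears.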
library(dplyr)  # provides %>%, filter(), group_by(), summarise(), arrange()

# Simulated payments: one row per payment, one logical flag column per employee.
payments <- data.frame(
  Amount = sample(5:15, 100, replace = TRUE),
  Tip.Amount = round(runif(100, 0, 2), 2),
  "A" = sample(c(TRUE, FALSE), 100, replace = TRUE),
  "B" = sample(c(TRUE, FALSE), 100, replace = TRUE),
  "C" = sample(c(TRUE, FALSE), 100, replace = TRUE),
  "D" = sample(c(TRUE, FALSE), 100, replace = TRUE),
  "E" = sample(c(TRUE, FALSE), 100, replace = TRUE),
  "F" = sample(c(TRUE, FALSE), 100, replace = TRUE),
  Date = sample(seq(as.Date("2016-01-01"), as.Date("2016-01-31"), by = "day"), 100, replace = TRUE)
)
employees <- c("A", "B", "C", "D", "E", "F")
# Group by every employee flag plus the date.
# group_by_(.dots = ...) is deprecated; across(all_of()) needs dplyr >= 1.0.0.
payments.by_date_employee <- payments %>%
  filter(!is.na(Date), !is.na(Amount)) %>%
  group_by(across(all_of(c(employees, "Date")))) %>%
  summarise(Payment.Count = n(), Amount = sum(Amount),
            Tip.Count = sum(Tip.Amount >= 0.01, na.rm = TRUE), Tip.Amount = sum(Tip.Amount, na.rm = TRUE)) %>%
  ungroup() %>%
  arrange(Date)
#long/manual way--------------------------------------------------------------------------------
t <- list()
t[["payments"]][["amount"]] <- mapply(function(name) list({
  t.test(subset(payments, payments[[name]] == TRUE)$Amount,
         subset(payments, payments[[name]] == FALSE)$Amount)$p.value
}),
employees)
t[["payments"]][["count"]] <- mapply(function(name) list({
  t.test(subset(payments.by_date_employee, payments.by_date_employee[[name]] == TRUE)$Amount,
         subset(payments.by_date_employee, payments.by_date_employee[[name]] == FALSE)$Amount)$p.value
}),
employees)
t[["tips"]][["amount"]] <- mapply(function(name) list({
  t.test(subset(payments, payments[[name]] == TRUE)$Tip.Amount,
         subset(payments, payments[[name]] == FALSE)$Tip.Amount)$p.value
}),
employees)
t[["tips"]][["count"]] <- mapply(function(name) list({
  t.test(subset(payments.by_date_employee, payments.by_date_employee[[name]] == TRUE)$Tip.Amount,
         subset(payments.by_date_employee, payments.by_date_employee[[name]] == FALSE)$Tip.Amount)$p.value
}),
employees)
#long/manual way--------------------------------------------------------------------------------
#attempt at single mapply statement ------------------------------------------------------------
y <- mapply(function(name, type, variable, df, nm) list({
  t.test(subset(eval(df), eval(df)[[name]] == TRUE)[[nm]],
         subset(eval(df), eval(df)[[name]] == FALSE)[[nm]])$p.value}),
  employees,
  c("payments", "payments", "tips", "tips"),
  c("amount", "count"),
  c(quote(payments), quote(payments), quote(payments.by_date_employee), quote(payments.by_date_employee)),
  c("Amount", "Amount", "Tip.Amount", "Tip.Amount"),
  SIMPLIFY = FALSE
)
#attempt at single mapply statement ------------------------------------------------------------
#works but seems convoluted --------------------------------------------------------------------
z <- mapply(function(type) list({
  mapply(function(variable, df, nm) list({
    t[[type]][[variable]] <- mapply(function(name) list({
      t.test(subset(eval(df), eval(df)[[name]] == TRUE)[[nm]],
             subset(eval(df), eval(df)[[name]] == FALSE)[[nm]])$p.value}),
      employees)
  }),
  c("amount", "count"),
  c(quote(payments), quote(payments), quote(payments.by_date_employee), quote(payments.by_date_employee)),
  c("Amount", "Amount", "Tip.Amount", "Tip.Amount"),
  SIMPLIFY = FALSE
  )
}),
c("payments", "tips")
)
#works but seems convoluted --------------------------------------------------------------------
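One possible way to flatten the nesting, offered only as a sketch: describe each test family once in a small nested list, then walk that list with lapply and loop over the employees with sapply. The helper `p_by_flag`, the `specs` list, and the `toy` data frame and `staff` vector are hypothetical names invented for this illustration; the toy data only mimics the shape of `payments` so the block runs on its own.

```r
# Hypothetical helper: p-value of a two-sample t-test comparing `column`
# between rows where the logical flag `name` is TRUE and rows where it is FALSE.
p_by_flag <- function(df, name, column) {
  t.test(df[[column]][df[[name]]], df[[column]][!df[[name]]])$p.value
}

# Toy stand-in for the payments data (assumption: same column layout, fewer employees).
set.seed(1)
toy <- data.frame(
  Amount     = sample(5:15, 100, replace = TRUE),
  Tip.Amount = round(runif(100, 0, 2), 2),
  A          = sample(c(TRUE, FALSE), 100, replace = TRUE),
  B          = sample(c(TRUE, FALSE), 100, replace = TRUE)
)
staff <- c("A", "B")

# One entry per test: which data frame to use and which column to compare.
specs <- list(
  payments = list(amount = list(df = toy, column = "Amount")),
  tips     = list(amount = list(df = toy, column = "Tip.Amount"))
)

# Walk the spec list; the result mirrors its shape, with one named
# vector of p-values (one element per employee) at each leaf.
res <- lapply(specs, function(group) {
  lapply(group, function(s) {
    sapply(staff, function(name) p_by_flag(s$df, name, s$column))
  })
})

print(res$payments$amount)
```

With the real data, the "count" entries would presumably point `df` at `payments.by_date_employee`. Passing the data frame itself in the spec avoids the `quote()`/`eval()` round trip, and because each leaf is built by returning a value rather than assigning into an outer list from inside a function, the result does not depend on side effects.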
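I can definitely see that it works. I'm trying to understand what you did before marking it as correct. You are doing some things that are very foreign to me! – atclaus
How would you suggest extracting additional values from the t-test? I'm looking for the means of x and y so I can summarize the direction of any difference... – atclaus
See the edit. –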
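For the question about extracting more than the p-value, a minimal sketch (not necessarily what the referenced edit contains): the object returned by `t.test` is a list, and for a two-sample test its `estimate` element holds the two group means under the names "mean of x" and "mean of y". The vectors `x` and `y` and the names `summary_row` and `direction` are made up for this example.

```r
# Toy samples standing in for the TRUE and FALSE groups (assumption).
set.seed(2)
x <- rnorm(30, mean = 10)
y <- rnorm(30, mean = 11)

tt <- t.test(x, y)

# Collect the p-value together with both group means in one named vector.
# unname() drops the "mean of x" / "mean of y" labels so the names given here are kept.
summary_row <- c(
  p.value = tt$p.value,
  mean.x  = unname(tt$estimate["mean of x"]),
  mean.y  = unname(tt$estimate["mean of y"])
)

# Direction of the difference between the two groups.
direction <- if (summary_row[["mean.x"]] > summary_row[["mean.y"]]) "x higher" else "y higher"

print(summary_row)
print(direction)
```

Returning a small named vector like this from the per-employee function, instead of only `$p.value`, would let sapply assemble a matrix with one column per employee.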