2016-09-17 23 views

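Check multiple data frame columns at once (in a flexible way)

Looking for a better way: how can I have R check the values of a flexible subset of several columns (say, Var2 and Var3) and write the result of the check to a new logical column?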

Is there a more concise, more elegant way than using a row-wise apply() here?

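# Data lines start at column 0 so that no stray whitespace ends up in the fields
df <- read.csv(
  text = '"Var1","Var2","Var3"
"","",""
"","","a"
"","a",""
"a","a","a"
"a","","a"
"","a",""
"","",""
"","","a"
"","a",""
"","","a"',
  stringsAsFactors = FALSE
)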

criticalColumns <- c("Var2", "Var3") 

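# TRUE for rows where every critical column is an empty string;
# drop = FALSE keeps a data frame even if only one column is selected
df$criticalColumnsAreEmpty <-
  apply(df[, criticalColumns, drop = FALSE], 1, function(curRow) {
    all(curRow == "")
  })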

I could also do this explicitly, but that is not flexible:

df$criticalColumnsAreEmpty <- df$Var2 == "" & df$Var3 == "" 

Desired output:

Var1 Var2 Var3 criticalColumnsAreEmpty
                                  TRUE
             a                   FALSE
        a                        FALSE
   a    a    a                   FALSE
   a         a                   FALSE
        a                        FALSE
                                  TRUE
             a                   FALSE
        a                        FALSE
             a                   FALSE

Answers


We can use rowSums on a logical matrix:

df$criticalColumnsAreEmpty <- !rowSums(df[criticalColumns] != "")
df$criticalColumnsAreEmpty 
#[1] TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE 
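
To make the steps in that one-liner visible, here is a minimal sketch on a small made-up data frame. The names `toy`, `cols`, `nonEmpty`, `nNonEmpty` and `allEmpty` are illustrative and not part of the original answer, and the sketch assumes the columns contain no NA values (with NAs, rowSums would return NA for those rows unless na.rm = TRUE is passed).

```r
# Hypothetical toy data, not the data frame from the question
toy <- data.frame(
  Var1 = c("", "a", ""),
  Var2 = c("", "a", "a"),
  Var3 = c("", "", "a"),
  stringsAsFactors = FALSE
)
cols <- c("Var2", "Var3")

# Step 1: element-wise comparison yields a logical matrix
nonEmpty <- toy[cols] != ""

# Step 2: count the non-empty cells in each row (TRUE counts as 1)
nNonEmpty <- rowSums(nonEmpty)

# Step 3: a count of zero means every critical column is empty
allEmpty <- nNonEmpty == 0
```

Writing `nNonEmpty == 0` is equivalent to the `!rowSums(...)` form above, since `!` turns 0 into TRUE and any positive count into FALSE.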

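Another option (for large datasets, to avoid the conversion to a matrix for memory reasons) is to loop over the columns, check whether each element is empty, and combine the results with Reduce and &: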

Reduce(`&`, lapply(df[criticalColumns], function(x) !nzchar(as.character(x))))
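
Since the question asks for a flexible approach, the Reduce variant could be wrapped in a small function that takes the column names as an argument. This is only a sketch: the helper name `allColumnsEmpty`, its signature, and the `toy` data are assumptions made for illustration, not something from the original answer.

```r
# Hypothetical helper (name and signature are illustrative)
allColumnsEmpty <- function(data, columns) {
  # One logical vector per column: TRUE where the cell is an empty string
  isEmpty <- lapply(data[columns], function(x) !nzchar(as.character(x)))
  # Combine the per-column vectors element-wise with logical AND
  Reduce(`&`, isEmpty)
}

# Made-up example data
toy <- data.frame(
  Var1 = c("", "a", ""),
  Var2 = c("", "a", "a"),
  Var3 = c("", "", "a"),
  stringsAsFactors = FALSE
)

allColumnsEmpty(toy, c("Var2", "Var3"))  # TRUE FALSE FALSE
allColumnsEmpty(toy, "Var3")             # TRUE TRUE FALSE
```

Note that nzchar() returns TRUE for NA by default, so in this sketch a missing value counts as non-empty; whether that is the desired behaviour depends on the data.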