2017-09-01
2

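R - mice - adding a column that sums columns containing imputed values

I have a database with missing data. I need to impute the data (I use mice) and then create new columns from the original columns, using the imputed values. These new columns are the ones I need for my statistical analysis.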

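Specifically, my participants filled in several questionnaires on a 7-point Likert scale. Some of them did not answer every question. I need to impute the missing values, sum the items, and then use that sum to classify participants as "mild, moderate, high" for the statistical analysis.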
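
To make the target concrete, here is a minimal base R sketch of the "sum, then classify" step on data that is already complete. The three-item data frame and the cut-offs (6 and 12) are assumptions made up for illustration; they are not taken from any real questionnaire.

```r
# Hypothetical item scores for three participants, with no missing values
# (i.e. what the data might look like after imputation).
scores <- data.frame(
  EE1 = c(0, 3, 6),
  EE2 = c(1, 3, 5),
  EE3 = c(0, 4, 6)
)

# Sum the items per participant.
total <- rowSums(scores)

# Assumed cut-offs: below 6 is mild, 6 to below 12 is moderate, 12 and up is high.
level <- cut(
  total,
  breaks = c(-Inf, 6, 12, Inf),
  labels = c("mild", "moderate", "high"),
  right = FALSE
)

data.frame(total = total, level = level)
```

The open question is how to get this sum recomputed from the imputed item values rather than from the original columns that still contain NA.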

What I have so far is based on my understanding of this answer: Perform operation on each imputed dataset in R's MICE

Here is my code (using R):

library(mice)

# Create a sample bdd 
bdd=data.frame(
    gender=c("M","F","M", "M", "M", "F"), 
    choice=c(1,2,NA,1,1,1), 
    gardes=c(0,0,0,5,7,NA), 
    EE1=c(3,4,1,NA,3,0), 
    EE2=c(2,5,1,3,3,0), 
    EE3=c(3,NA,1,5,3,0), 
    EE4=c(3,6,1,2,3,0), 
    EE5=c(1,4,1,2,3,5), 
    EE6=c(3,1,1,3,3,4), 
    EE7=c(5,0,1,5,3,5), 
    EE8=c(2,6,1,1,3,3), 
    EE9=c(3,4,1,6,3,4) 
    ) 

# Create the additional variable - this will have missing values 
bdd$EE <- bdd$EE1+bdd$EE2+bdd$EE3+bdd$EE4+bdd$EE5+bdd$EE6+bdd$EE7+bdd$EE8+bdd$EE9 

# create ini to get access to meth and pred 
ini <- mice(bdd, max = 0, print = FALSE) 

# Change the method of imputation for EE, so that it always equals bdd$EE1+...+bdd$EE9 
meth1 <- ini$meth 
meth1["EE"] <- "~I(bdd$EE1+bdd$EE2+bdd$EE3+bdd$EE4+bdd$EE5+bdd$EE6+bdd$EE7+bdd$EE8+bdd$EE9)" 

pred1 <- ini$pred 
# change the predictor matrix so only bdd$EE1-9 predicts EE (necessary?) 
pred1[ "EE", ] <- 0 
pred1[ "EE", c("EE1", "EE2", "EE3", "EE4", "EE5", "EE6", "EE7", "EE8", "EE9")] <- 1 
# change the predictor matrix so that EE isnt used to predict 
pred1[ , "EE" ] <- 0 


# Imputations 
imput <- mice(bdd, seed=1, pred = pred1, meth = meth1, m=1, print = FALSE) 

Note that this does not work. Is there another way to do this elegantly? TIA for any and all input!

Edited to add: these are the warnings I get when I run this code on my real data:

Warning messages: 
1: In `[<-.data.frame`(`*tmp*`, , i, value = list(`1` = c(20L, 14L, : 
    replacement element 1 has 456 rows to replace 2 rows 
2: In `[<-.data.frame`(`*tmp*`, , i, value = list(`1` = c(20L, 14L, : 
    replacement element 1 has 456 rows to replace 2 rows 
3: In `[<-.data.frame`(`*tmp*`, , i, value = list(`1` = c(20L, 14L, : 
    replacement element 1 has 456 rows to replace 2 rows 
4: In `[<-.data.frame`(`*tmp*`, , i, value = list(`1` = c(20L, 14L, : 
    replacement element 1 has 456 rows to replace 2 rows 
5: In `[<-.data.frame`(`*tmp*`, , i, value = list(`1` = c(20L, 14L, : 
    replacement element 1 has 456 rows to replace 2 rows 

Here is the bdd I generated for this question:

  gender choice gardes EE1 EE2 EE3 EE4 EE5 EE6 EE7 EE8 EE9
1      M      1      0   3   2   3   3   1   3   5   2   3
2      F      2      0   4   5  NA   6   4   1   0   6   4
3      M     NA      0   1   1   1   1   1   1   1   1   1
4      M      1      5  NA   3   5   2   2   3   5   1   6
5      M      1      7   3   3   3   3   3   3   3   3   3
6      F      1     NA   0   0   0   0   5   4   5   3   4
+1

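Hi, welcome to SO. Could you add some example data that includes your missing values? Maybe just a few rows of 'bdd'? Without test data it is hard to debug the code – Nate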

+0

Thanks Nate - I'll do that right away :) – Zephyr

+2

From a quick glance: don't reference your data frame inside the imputation method, i.e. change '"~I(bdd$EE1 + bdd$EE2 ...' to '"~I(EE1 + EE2 ...' – user20650
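
One way to see why this matters is the base R sketch below. It only illustrates how names in an expression are looked up; it is not a trace of what mice does internally. The four-row data frame and the variable names `current`, `rows`, `inside`, and `outside` are made up for the example.

```r
# Toy stand-ins: `bdd` is the original data, `current` a partially imputed copy.
bdd <- data.frame(EE1 = c(3, 4, 1, NA), EE2 = c(2, 5, 1, 3))
current <- bdd
current$EE1[4] <- 2  # pretend 2 was imputed for the missing EE1

# Suppose only row 4 needs its sum recomputed.
rows <- current[4, , drop = FALSE]

# Bare column names are looked up in the data passed in:
# one value, computed from the imputed EE1.
inside <- eval(quote(EE1 + EE2), rows)

# `bdd$...` bypasses that data and reads the whole original data frame:
# four values, and the NA from the original EE1 is still there.
outside <- eval(quote(bdd$EE1 + bdd$EE2), rows)

length(inside)
length(outside)
```

A result that is longer than the set of rows being filled in is at least consistent with the "456 rows to replace 2 rows" warnings quoted in the question.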

Answers

1

The following code runs without errors, after applying the correction pointed out by user20650!

library(mice)

# Create a sample bdd 
bdd=data.frame(
    gender=c("M","F","M", "M", "M", "F"), 
    choice=c(1,2,NA,1,1,1), 
    gardes=c(0,0,0,5,7,NA), 
    EE1=c(3,4,1,NA,3,0), 
    EE2=c(2,5,1,3,3,0), 
    EE3=c(3,NA,1,5,3,0), 
    EE4=c(3,6,1,2,3,0), 
    EE5=c(1,4,1,2,3,5), 
    EE6=c(3,1,1,3,3,4), 
    EE7=c(5,0,1,5,3,5), 
    EE8=c(2,6,1,1,3,3), 
    EE9=c(3,4,1,6,3,4) 
    ) 

# Create the additional variable - this will have missing values 
bdd$EE <- bdd$EE1+bdd$EE2+bdd$EE3+bdd$EE4+bdd$EE5+bdd$EE6+bdd$EE7+bdd$EE8+bdd$EE9 

# create ini to get access to meth and pred 
ini <- mice(bdd, max = 0, print = FALSE) 

# Change the method of imputation for EE, so that it always equals bdd$EE1+...+bdd$EE9 
meth1 <- ini$meth 
meth1["EE"] <- "~I(EE1+EE2+EE3+EE4+EE5+EE6+EE7+EE8+EE9)" 

pred1 <- ini$pred 
# change the predictor matrix so only bdd$EE1-9 predicts EE (necessary?) 
pred1[ "EE", ] <- 0 
pred1[ "EE", c("EE1", "EE2", "EE3", "EE4", "EE5", "EE6", "EE7", "EE8", "EE9")] <- 1 
# change the predictor matrix so that EE isnt used to predict 
pred1[ , "EE" ] <- 0 


# Imputations 
imput <- mice(bdd, seed=1, pred = pred1, meth = meth1, m=1, print = FALSE) 
+0

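Please note that I am not 100% sure the code that changes the predictor matrix is correct. If you want to use this code, please double-check it first, thanks! – Zephyr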
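
One possible sanity check, sketched in base R: confirm that the derived column really equals the row-wise sum of its items in a completed data set. The helper `sum_is_consistent` and the small data frame `done` are hypothetical names introduced for this example. With mice, the data frame to check would presumably come from `complete(imput, 1)`, which is not run here.

```r
# Returns TRUE when column `total` equals the row-wise sum of the `items` columns.
sum_is_consistent <- function(df, items, total) {
  item_sum <- unname(rowSums(df[, items, drop = FALSE]))
  isTRUE(all.equal(item_sum, df[[total]]))
}

# Hand-made stand-in for a completed data set (no NAs left).
done <- data.frame(EE1 = c(3, 4, 2), EE2 = c(2, 5, 3))
done$EE <- done$EE1 + done$EE2

sum_is_consistent(done, c("EE1", "EE2"), "EE")
```

If the check fails on the imputed data, the passive imputation formula or the predictor matrix is the first place to look.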