Chi-square test on a data frame with wide-format data

I have data that looks like this:

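ID  gamesAlone  gamesWithOthers  gamesRemotely  tvAlone  tvWithOthers  tvRemotely
1   1           NA               NA             NA       1             NA
2   NA          NA               1              NA       1             NA
3   NA          NA               1              1        NA            NA
4   NA          NA               1              1        NA            NA
5   NA          NA               1              NA       1             NA
6   NA          NA               1              1        NA            NA
7   NA          NA               1              1        NA            NA
8   NA          1                NA             NA       1             NA
9   1           NA               NA             NA       NA            1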

I would like code that does the following two things:

First, transform the data into a tidy contingency table like this:

       Alone  WithOthers  Remotely
games  2      1           6
tv     4      4           1

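Second, use a chi-square test to see whether these activities (games vs. TV) differ in their social context.

Here is the code to generate the data frame: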

data<-data.frame(ID=c(1,2,3,4,5,6,7,8,9), 
      gamesAlone=c(1,NA,NA,NA,NA,NA,NA,NA,1), 
      gamesWithOthers=c(NA,NA,NA,NA,NA,NA,NA,1,NA), 
      gamesRemotely=c(NA,1,1,1,1,1,1,NA,NA), 
      tvAlone=c(NA,NA,1,1,NA,1,1,NA,NA), 
      tvWithOthers=c(1,1,NA,NA,1,NA,NA,1,NA), 
      tvRemotely=c(NA,NA,NA,NA,NA,NA,NA,NA,1))

Source

2017-08-08 mob

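Drop the first column, ID ([-1]), then take the sum of each column (colSums) while removing NA values (na.rm=TRUE), and put the resulting length-6 vector into a matrix with 2 rows filled row by row (byrow=TRUE). If you like, you can also label the matrix dimensions accordingly (the dimnames argument):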

m <- matrix(
    colSums(data[-1], na.rm = TRUE),  # column totals, ignoring NA
    nrow = 2, byrow = TRUE,           # row 1 = games, row 2 = tv
    dimnames = list(c("games", "tv"), c("alone", "withOthers", "remotely"))
)
m
#       alone withOthers remotely
# games     2          1        6
# tv        4          4        1
chisq.test(m)
#
#  Pearson's Chi-squared test
#
# data:  m
# X-squared = 6.0381, df = 2, p-value = 0.04885
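
With counts this small, it may be worth looking at the expected cell counts before trusting the p-value. The following is only a sketch: it rebuilds the matrix by hand from the counts in the question (so it runs on its own), and the name res is just an illustrative choice. chisq.test() emits the warning "Chi-squared approximation may be incorrect" for this table, which suppressWarnings() hides here.

```r
# Counts taken from the contingency table in the question.
m <- matrix(
  c(2, 1, 6,
    4, 4, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("games", "tv"), c("alone", "withOthers", "remotely"))
)

# chisq.test() warns for this table because expected counts are below 5;
# suppressWarnings() only keeps the console quiet for this check.
res <- suppressWarnings(chisq.test(m))

# Expected counts under independence: row total * column total / grand total.
res$expected
#       alone withOthers remotely
# games     3        2.5      3.5
# tv        3        2.5      3.5
```

Every expected count is below 5, which is the usual rule of thumb for when the chi-square approximation becomes unreliable.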

Source

2017-08-08 07:45:30 lukeA

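This will get you the contingency table in the form you gave. A suggestion: name the data frame data1 rather than data, to avoid confusion with the base R function data().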

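library(dplyr)
library(tidyr)
data1_table <- data1 %>%
    # wide to long: one row per ID and original column name
    gather(key, value, -ID) %>%
    # split each column name into activity ("tv" or "games") and context (the rest)
    mutate(activity = ifelse(grepl("^tv", key), substring(key, 1, 2), substring(key, 1, 5)),
           context = ifelse(grepl("^tv", key), substring(key, 3), substring(key, 6))) %>%
    group_by(activity, context) %>%
    summarise(n = sum(value, na.rm = TRUE)) %>%
    ungroup() %>%
    # long to wide again: one column per context
    spread(context, n)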

# A tibble: 2 x 4
  activity Alone Remotely WithOthers
* <chr>    <dbl>    <dbl>      <dbl>
1 games        2        6          1
2 tv           4        1          4
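
For comparison, the same idea of splitting each column name into an activity and a context can be sketched in base R, without any packages. This is not the code from the answer, just one possible equivalent; the names totals, activity, context and tab are made up for illustration, and the regular expression assumes every column name starts with either "games" or "tv".

```r
# Data frame from the question, named data1 as suggested above.
data1 <- data.frame(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
                    gamesAlone = c(1, NA, NA, NA, NA, NA, NA, NA, 1),
                    gamesWithOthers = c(NA, NA, NA, NA, NA, NA, NA, 1, NA),
                    gamesRemotely = c(NA, 1, 1, 1, 1, 1, 1, NA, NA),
                    tvAlone = c(NA, NA, 1, 1, NA, 1, 1, NA, NA),
                    tvWithOthers = c(1, 1, NA, NA, 1, NA, NA, 1, NA),
                    tvRemotely = c(NA, NA, NA, NA, NA, NA, NA, NA, 1))

# One total per column, ignoring NA.
totals <- colSums(data1[-1], na.rm = TRUE)

# Assumption: each column name is "<games|tv><Context>".
activity <- sub("^(games|tv).*$", "\\1", names(totals))  # "games" or "tv"
context  <- sub("^(games|tv)", "", names(totals))        # "Alone", "WithOthers", "Remotely"

# Cross-tabulate the totals by the two labels.
tab <- xtabs(totals ~ activity + context)
tab
```

As in the tibble above, the context columns come out in alphabetical order (Alone, Remotely, WithOthers) rather than in the order of the original columns. Because the labels are derived from the column names, the result does not depend on the columns being arranged in any particular order.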

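As for the chi-square test: it depends on what you want to compare, and I assume your real data has higher counts. You can pipe the whole table into chisq.test like this, but I don't think the result is very informative: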

data1_table %>% 
    select(2:4) %>% 
    chisq.test()
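
If the overall test statistic alone is not very informative, one option is to look at the standardized residuals that chisq.test() returns, which show which cells deviate most from independence. This is only a sketch: the matrix is rebuilt by hand from the counts in the question, and the names m and res are illustrative.

```r
# Counts taken from the contingency table in the question.
m <- matrix(
  c(2, 1, 6,
    4, 4, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("games", "tv"), c("Alone", "WithOthers", "Remotely"))
)

# The small counts trigger a warning about the approximation; it is hidden here.
res <- suppressWarnings(chisq.test(m))

# Standardized residuals: positive means more observations than expected
# under independence, negative means fewer.
round(res$stdres, 2)
```

In this table the largest deviations are in the Remotely column, with games above and tv below what independence would predict.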

Source

2017-08-08 06:02:53 neilfws
