2017-04-12 22 views
0

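In R, I am using stat_poly_eq() to annotate a plot with the equation, R2 and SSE of several linear regression lines on the same graph, and I have run into 2 problems: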

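  1. How can I annotate three separate equations, one for each group and another for the whole data?

  2. How can I add the corresponding sum of squared errors (SSE) to each equation?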

As shown here, the code below produces a single general equation that covers all the data:

library(ggplot2)
library(ggpmisc)

x <- runif(200, 0, 100)
y <- 5*x + rnorm(200, 0, 10)
df <- data.frame(x, y)
# First 100 rows are group 1, the remaining rows are group 2
df$GENDER <- rep(c(1, 2), each = 100)

formula <- y ~ poly(x, 1, raw = TRUE)

my_features <- list(scale_shape_manual(values = c(16, 1)),
        # Overall fit across all data
        geom_smooth(method = "lm", aes(group = 1),
           formula = formula, colour = "Black",
           fill = "grey70"),
        # Per-group fits; se is a layer parameter, so it goes outside aes()
        geom_smooth(method = "lm", aes(group = factor(GENDER)),
           formula = formula, colour = "Black", se = FALSE),
        stat_poly_eq(aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~~")),
           formula = formula, parse = TRUE)
)

# shape is mapped only in geom_point(), so stat_poly_eq() sees a single group
ggplot(df, aes(x = x, y = y)) +
    geom_point(aes(shape = factor(GENDER))) +
    my_features

Answer

3

I had to add the error sum of squares manually, and also position the equation that is based on the full data, using the approach below.

library(ggplot2)
library(ggpmisc)

# Same model as in the question
formula <- y ~ poly(x, 1, raw = TRUE)

# Get the error sum of squares: all data first, then each GENDER group
sum(residuals(lm(formula, data = df))^2)
sum(residuals(lm(formula, data = df[df$GENDER == 1, ]))^2)
sum(residuals(lm(formula, data = df[df$GENDER == 2, ]))^2)


my_features <- list(
    scale_shape_manual(values = c(16, 1)),
    geom_smooth(method = "lm", aes(group = 1),
                formula = formula, colour = "Black", fill = "grey70"),
    # Added colour
    geom_smooth(method = "lm", aes(group = factor(GENDER), colour = factor(GENDER)),
                formula = formula, se = FALSE),
    stat_poly_eq(
        aes(label = paste(paste(..eq.label.., ..rr.label.., sep = "~~~~"),
                          # Manually add in ESS, one value per group for this particular sample
                          paste("ESS", c(9333, 9622), sep = "=="),
                          sep = "~~~~")),
        formula = formula, parse = TRUE)
)

ggplot(df, aes(x = x, y = y, shape = factor(GENDER), colour = factor(GENDER))) +
    geom_point(aes(shape = factor(GENDER))) +
    my_features +
    # Add in overall line and label
    geom_smooth(method = "lm", aes(group = 1), colour = "black") +
    stat_poly_eq(aes(group = 1, label = paste(..eq.label.., ..rr.label.., 'ESS==19405', sep = "~~~~")),
                 formula = formula, parse = TRUE, label.y = 440)
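
The hard-coded ESS values only match one particular random sample, so it may be easier to compute them and build the label strings in code. The following is a minimal sketch in base R, not part of the original answer: the seed, the helper name `sse`, and the variable names `ess_labels` and `ess_label_all` are assumptions made for illustration.

```r
# Simulated data as in the question; the seed is an assumption added for reproducibility
set.seed(1)
x <- runif(200, 0, 100)
y <- 5 * x + rnorm(200, 0, 10)
df <- data.frame(x, y, GENDER = rep(c(1, 2), each = 100))

formula <- y ~ poly(x, 1, raw = TRUE)

# Hypothetical helper: residual sum of squares of the model fitted to one data frame
sse <- function(d) sum(residuals(lm(formula, data = d))^2)

# One value per GENDER level (in level order), plus one for the full data
sse_by_group <- sapply(split(df, df$GENDER), sse)
sse_all <- sse(df)

# Plotmath strings in the same "ESS==value" form as the hand-typed ones
ess_labels <- paste("ESS", round(sse_by_group), sep = "==")
ess_label_all <- paste("ESS", round(sse_all), sep = "==")
```

Under these assumptions, `ess_labels` could stand in for `paste("ESS", c(9333, 9622), sep = "==")` and `ess_label_all` for the literal `'ESS==19405'`, so the annotation follows the data whenever it is regenerated.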


Alternatively, you can duplicate your dataset so that the full dataset becomes a factor level of its own. You still need to add the ESS manually.

x <- runif(200, 0, 100)
y <- 5*x + rnorm(200, 0, 10)
df1 <- data.frame(x, y)
# First 100 rows are group 1, the remaining rows are group 2
df1$GENDER <- rep(c(1, 2), each = 100)

df2 <- df1
df2$GENDER <- 3

# Now data with GENDER == 3 is the full data
df <- rbind(df1, df2)

my_features <- list(
    # Add another plotting character
    scale_shape_manual(values = c(16, 1, 2)),
    # Added colour
    geom_smooth(method = "lm", aes(group = factor(GENDER), colour = factor(GENDER)),
                formula = formula, se = FALSE),
    stat_poly_eq(
        aes(label = paste(paste(..eq.label.., ..rr.label.., sep = "~~~~"),
                          # Manually add in ESS
                          paste("ESS", c(9333, 9622, 19405), sep = "=="),
                          sep = "~~~~")),
        formula = formula, parse = TRUE)
)

ggplot(df, aes(x = x, y = y, shape = factor(GENDER), group = factor(GENDER), colour = factor(GENDER))) +
    geom_point(aes(shape = factor(GENDER))) +
    my_features
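
With the stacked data, the three ESS values can also be obtained in one pass, because the extra level GENDER == 3 is just the full data again. A rough sketch in base R follows; the seed and the names `fits`, `sse` and `ess_labels` are assumptions for illustration, not part of the answer above.

```r
# Rebuild the stacked data; the seed is an assumption added for reproducibility
set.seed(1)
x <- runif(200, 0, 100)
y <- 5 * x + rnorm(200, 0, 10)
df1 <- data.frame(x, y, GENDER = rep(c(1, 2), each = 100))
df2 <- transform(df1, GENDER = 3)   # copy of the full data as its own level
df <- rbind(df1, df2)

formula <- y ~ poly(x, 1, raw = TRUE)

# One linear fit per GENDER level; level 3 is the fit on all observations
fits <- lapply(split(df, df$GENDER), function(d) lm(formula, data = d))
sse <- sapply(fits, function(f) sum(residuals(f)^2))

# Three plotmath strings, ordered by factor level like the hand-typed vector
ess_labels <- paste("ESS", round(sse), sep = "==")
```

The value for level 3 equals the error sum of squares of a single fit on `df1`, which is why this layout lets one grouping variable drive all three labels.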


Edit: if you want to remove the plotting character for the third group, that can be done as follows.

library(scales)  # provides hue_pal()

my_features <- list(
    geom_smooth(method = "lm", aes(group = factor(GENDER), colour = factor(GENDER)),
                formula = formula, se = FALSE),
    stat_poly_eq(
        aes(label = paste(paste(..eq.label.., ..rr.label.., sep = "~~~~"),
                          # Manually add in ESS
                          paste("ESS", c(9333, 9622, 19405), sep = "=="),
                          sep = "~~~~")),
        formula = formula, parse = TRUE)
)

p <- ggplot(df, aes(x = x, y = y, shape = factor(GENDER), group = factor(GENDER), colour = factor(GENDER))) +
    my_features
# Draw points only for groups 1 and 2, and hide the point shapes in the legend
p +
    scale_color_manual(labels = c("Male", "Female", "Both"), values = hue_pal()(3)) +
    geom_point(data = df[df$GENDER == 1, ], aes(colour = factor(GENDER)), shape = 16) +
    geom_point(data = df[df$GENDER == 2, ], aes(colour = factor(GENDER)), shape = 1) +
    guides(colour = guide_legend(title = "Gender", override.aes = list(shape = NA)))
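
If typing the values by hand is the main objection, one possible workaround that does not rely on stat_poly_eq() at all is to build the full label strings yourself and draw them with geom_text(parse = TRUE). This is only a sketch under stated assumptions: the seed, the helper `make_label`, the `labels` data frame and the label positions are all made up for illustration, and the model is written as y ~ x, which gives the same fit as the degree-1 raw polynomial.

```r
library(ggplot2)

set.seed(1)  # assumption: seed added so the example is reproducible
x <- runif(200, 0, 100)
y <- 5 * x + rnorm(200, 0, 10)
df <- data.frame(x, y, GENDER = factor(rep(c(1, 2), each = 100)))

# Hypothetical helper: fit y ~ x on one subset and return a plotmath label
make_label <- function(d) {
  fit <- lm(y ~ x, data = d)
  sprintf("italic(y) == %.2f + %.2f ~ italic(x) ~~~~ italic(R)^2 == %.3f ~~~~ SSE == %.0f",
          coef(fit)[1], coef(fit)[2],
          summary(fit)$r.squared, sum(residuals(fit)^2))
}

# One subset per group plus the full data
subsets <- c(split(df, df$GENDER), list(Both = df))
labels <- data.frame(
  series = names(subsets),
  label  = vapply(subsets, make_label, character(1)),
  x      = 0,                                  # left edge of the panel
  y      = max(df$y) * c(1.00, 0.93, 0.86)     # stacked label positions (arbitrary choice)
)

p <- ggplot(df, aes(x, y)) +
  geom_point(aes(shape = GENDER)) +
  geom_smooth(method = "lm", formula = y ~ x, colour = "black", fill = "grey70") +
  geom_smooth(method = "lm", formula = y ~ x, aes(group = GENDER),
              se = FALSE, colour = "black") +
  scale_shape_manual(values = c(16, 1)) +
  geom_text(data = labels, aes(x = x, y = y, label = label),
            parse = TRUE, hjust = 0, inherit.aes = FALSE)
```

Printing `p` draws the plot. The trade-off is that label placement has to be handled by hand, whereas stat_poly_eq() positions its labels automatically.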


+0

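Although the second option also works for the equations, it overlays the shapes, which makes the plot visually confusing. It would be nice to have a more integrated way in the 'stat_poly_eq()' function to annotate other information (such as SSE) and avoid entering it manually. – AJMA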

+1

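@AJMA With the second approach you can still remove the plotting character for the new group "GENDER == 3", which represents the full dataset in my example. I have added an edit showing how to do it. If you want to add the error sum of squares, I think it has to be done manually unless the 'stat_poly_eq()' function is updated. – Jake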
