2016-11-15 85 views
2

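R - ggplot geom_smooth facet_grid CI not showing

I am struggling to understand why the confidence intervals are not displayed for my data. When I reproduce my code on another dataset, it seems to work properly. For example, on mtcars

the code is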

mtols = mtcars %>% group_by(am) %>% do(lm0 = lm(disp ~ mpg*gear + vs, data=.)) %>% 
       augment(., lm0) %>% 
       mutate(ymin=.fitted-1.96*.se.fit, ymax=.fitted+1.96*.se.fit) 

To generate the plot

mtols %>% ggplot(aes(mpg, .fitted)) + 
    geom_smooth(data = mtols, aes(mpg, .fitted, group = gear, colour = gear, fill= gear), method="lm") + 
    theme_minimal() + facet_grid(~am) 

I get the confidence intervals.

However, this does not work with my data. Can someone help me figure out what is going wrong here? I would appreciate it.

I compute the OLS

dt = new %>% group_by(day) %>% do(lm0 = lm(y ~ year*class, data=.)) %>% augment(., lm0) %>% 
    mutate(ymin=.fitted-1.96*.se.fit, ymax=.fitted+1.96*.se.fit) 

dt$year = as.numeric(as.character(dt$year)) 

The plot (this is an example with only a few cases, but the result is the same with the whole dataset)

dt %>% ggplot(aes(year, .fitted)) + 
    geom_smooth(data = dt, aes(year, .fitted, group = class, colour = class, fill= class), method="lm") + 
    theme_bw() + facet_grid(~day) 

The CIs are not displayed.


Any clue what I am doing wrong?

Strangely, when I do not use facet_grid here, the CI works perfectly


dt %>% ggplot(aes(year, .fitted)) + 
    geom_smooth(data = dt, aes(year, .fitted, group = class, colour = class, fill= class), method="lm") + 
    theme_bw() 

A sample of my data

library(broom) 
library(dplyr) 
library(ggplot2) 

new = structure(list(id = structure(c(844084L, 114510L, 14070410L, 
942483L, 13190105L, 421369L, 301384L, 251789L, 11011210L, 11280408L, 
278575L, 310410L, 16260105L, 11110815L, 18260101L, 14260501L, 
10580L, 15090210L, 19140410L, 13230615L, 246511L, 20040812L, 
14260114L, 287623L, 16090620L, 20131007L, 835743L, 453390L, 395808L, 
363617L), label = "Household identifier", class = c("labelled", 
"integer")), year = structure(c(1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 
2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 
2L, 2L, 1L, 1L, 1L, 1L), .Label = c("2000", "2015"), class = "factor"), 
day = c("Weekend", "Weekend", "Weekend", "Weekdays", "Weekdays", 
"Weekend", "Weekdays", "Weekend", "Weekend", "Weekdays", 
"Weekend", "Weekdays", "Weekdays", "Weekend", "Weekend", 
"Weekdays", "Weekdays", "Weekend", "Weekdays", "Weekdays", 
"Weekdays", "Weekend", "Weekend", "Weekend", "Weekend", "Weekend", 
"Weekend", "Weekdays", "Weekdays", "Weekdays"), class = structure(c(1L, 
1L, 2L, 2L, 1L, 2L, 2L, 4L, 2L, 2L, 3L, 2L, 1L, 4L, 1L, 3L, 
2L, 3L, 2L, 4L, 2L, 1L, 3L, 2L, 1L, 4L, 3L, 2L, 4L, 1L), .Label = c("Higher Managerial", 
"Lower Managerial", "Intermediate", "Manual and Routine"), class = "factor"), 
y = c(270, 730, 180, 0, 0, 290, 90, 650, 510, 0, 10, 200, 
200, 180, 0, 0, 140, 260, 110, 740, 260, 0, 390, 610, 0, 
0, 500, 0, 10, 170)), class = "data.frame", row.names = c(NA, 
-30L), .Names = c("id", "year", "day", "class", "y")) 

It seems the two plots are not the same. Is that a problem? Also, are you sure it should be `group = class` and not `group = day`? –


@MamounBenghezal No, `group` is `class`, because I want to show the `class * year` interaction by `day`. So I want `facet_grid` to separate the types of day. Thanks – giacomo


The example produces an error. Error: 'x' and 'labels' must be the same type –

Answers

1

The confidence intervals are being drawn. We cannot see them because each `day` has only two unique points.

dt2 <- dt %>% filter(class == "Higher Managerial") 
plot(.fitted ~ year, data=subset(dt2, day=="Weekend")) 


We see the interval when we do not facet because the interval is wider when there are four points.


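When we do not facet, there are enough points for the confidence interval to have some range. But the confidence interval for two points has no range.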

confint(lm(.fitted ~ year, data=subset(dt2, day=="Weekdays"))) 
#                   2.5 %      97.5 %
# (Intercept) 9503.333333 9503.333333
# year          -4.666667   -4.666667
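
To make the contrast concrete, here is a minimal sketch in base R. The `toy` data frame is hypothetical (it is not the asker's data): it mimics fitted values that take only two unique x values and therefore lie exactly on a line, so the residual variance is zero and the interval collapses. The second fit uses raw `mtcars` observations, which scatter around the line, so the interval has width even though `am` has only two distinct values.

```r
# Hypothetical toy data: "fitted" values at two unique x values.
# All four points lie exactly on one straight line.
toy <- data.frame(x = c(2000, 2000, 2015, 2015),
                  y = c(170, 170, 100, 100))
fit_exact <- lm(y ~ x, data = toy)

# confint() may warn about an "essentially perfect fit"; that warning is
# the point of the example, so it is suppressed here.
ci_exact <- suppressWarnings(confint(fit_exact))
width_exact <- ci_exact[, 2] - ci_exact[, 1]

# Raw observations: two distinct x values (am = 0 or 1), but y varies
# within each of them, so the residual variance is not zero.
fit_raw <- lm(disp ~ am, data = mtcars)
ci_raw <- confint(fit_raw)
width_raw <- ci_raw[, 2] - ci_raw[, 1]

print(width_exact)  # numerically zero
print(width_raw)    # clearly positive
```

This is presumably why `geom_smooth(method = "lm")` on raw data still shows a band, while refitting a line through `.fitted` values within a facet does not.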

Edit

Here is a version that uses the `ymin` and `ymax` originally computed and draws them with `geom_ribbon`.

dt %>% ggplot(aes(year, .fitted,group = class, colour = class, fill= class)) + 
    geom_line() + 
    geom_ribbon(aes(ymin=ymin, ymax=ymax), alpha=0.2) + 
    theme_bw() + facet_grid(~day) 
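
If the `augment()` pipeline is not available, the same kind of band can be computed with base R alone. The sketch below is an illustration under assumptions, not the code used above: it uses `mtcars` as a stand-in dataset, a simplified model (`disp ~ mpg`), a hypothetical data frame name `bands`, and a t quantile in place of the fixed 1.96 multiplier. The plotting step runs only if ggplot2 is installed.

```r
# Fit one model per level of `am` and attach pointwise 95% bands
# computed from the model's own standard errors.
bands <- do.call(rbind, lapply(split(mtcars, mtcars$am), function(d) {
  fit <- lm(disp ~ mpg, data = d)
  pred <- predict(fit, newdata = d, se.fit = TRUE)
  crit <- qt(0.975, df = fit$df.residual)  # t quantile instead of 1.96
  data.frame(am = d$am,
             mpg = d$mpg,
             .fitted = pred$fit,
             ymin = pred$fit - crit * pred$se.fit,
             ymax = pred$fit + crit * pred$se.fit)
}))

# Draw the precomputed band; nothing is refitted at plot time.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  plt <- ggplot(bands, aes(mpg, .fitted)) +
    geom_ribbon(aes(ymin = ymin, ymax = ymax), alpha = 0.2) +
    geom_line() +
    theme_bw() + facet_grid(~am)
  print(plt)
}
```

Because the band comes from the standard errors of the model fitted to the raw data, it keeps its width inside each facet.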



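Yes, you are right, I only have two observation points per day. But in theory I should still be able to draw a CI around the conditional mean. So it works when Weekend and Weekdays are together? What could be a solution? – giacomo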


You are saying it "doesn't work", but it does. It is drawing the confidence intervals; they are just not very wide intervals. –


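Sorry, I don't understand. What do you mean by "wide" intervals? For example, `ggplot(mtcars, aes(am, disp)) + geom_point() + geom_smooth(method = "lm")` still draws a confidence interval, even though `am` has only two points? – giacomo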