2015-05-01

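R barplot with error bars

Say I have the means of two datasets that I want to plot as bar plots with error bars next to each other in ggplot2, or side by side in base R.

Each dataset consists of a numeric matrix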

10 20 12 
10 20 12 
10 20 12 

which is then turned into a vector of means with, for example, 3 elements

10 20 12 

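What I want to do is take the two mean vectors and plot them as a bar plot in which the first element of one sits next to the first element of the other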

Dataset1Element1Bar-Dataset2Element1Bar Dataset1Element2Bar-Dataset2Element2Bar etc 

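Each bar should get an error bar, e.g. the standard deviation. I know I can compute it with sd, but I don't know how to get it into the graph in the appropriate form.

And finally, colour the bars by element number (i.e. Element1).

I have code that does this for one dataset, but I'm not sure where to go from here.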

result <- barplot(bardata, main = "Mean Coverage", names.arg = namePosTargetGroup,
                  ylab = "mean Magnitude", cex.names = .4, col = c("red", "blue", "green"))
legend(10, legend = c("Group1", "Group2", "Group3"), fill = c("red", "blue", "green"))

A lot of what I have looked up answers one piece or another, but it's hard to work out how to combine them.

Answers


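I would generally not recommend plotting a bar plot with error bars. There are many other ways to plot your data that show the data and its structure much better.

Especially if you only have a few cases, plotting means as bars is not a good choice. A good explanation can be found here: Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm

I find it hard to give you a good solution because I don't know your research question. Knowing what you actually want to show or emphasize would make things easier.

I'll give you two suggestions, one for a small dataset and one for a larger one. All of them are created with ggplot2. I colour the data not by "element number" but by origin ("dataset 1/2"), because I found it easier to get a proper graphic that way.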
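
The examples that follow use simulated data frames. If your data are numeric matrices as in the question, a base R sketch along these lines should bring them into the same long layout (columns origin, id, variable). The matrices m1 and m2 and the helper to_long are hypothetical names made up for illustration, not part of the question's code.

```r
# Hypothetical matrices standing in for the two datasets from the question
m1 <- matrix(c(10, 20, 12,
               11, 19, 13,
                9, 21, 11), ncol = 3, byrow = TRUE,
             dimnames = list(NULL, c("v1", "v2", "v3")))
m2 <- m1 + 1

# Stack one matrix into long format: one row per observed value
to_long <- function(m, origin) {
  data.frame(
    origin   = origin,
    id       = rep(colnames(m), each = nrow(m)),  # as.vector() walks column by column
    variable = as.vector(m)
  )
}

# Combine both datasets and make origin a factor with a fixed level order
pdata <- rbind(to_long(m1, "df1"), to_long(m2, "df2"))
pdata$origin <- factor(pdata$origin, levels = c("df1", "df2"))
head(pdata)
```

A data frame built this way can be passed to the ggplot() calls below in place of the simulated pdata.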

Small dataset

Use geom_jitter to show all cases and avoid overplotting.

# load the tidyverse packages used below 
library(magrittr) 
library(dplyr) 
library(tidyr) 
library(ggplot2) 

# generate small amount of data 
set.seed(1234) 
df1 <- data.frame(v1 = rnorm(5, 4, 1), 
        v2 = rnorm(5, 5, 1), 
        v3 = rnorm(5, 6, 1), 
        origin = rep(factor("df1", levels = c("df1", "df2")), 5)) 

df2 <- data.frame(v1 = rnorm(5, 4.5, 1), 
        v2 = rnorm(5, 5.5, 1), 
        v3 = rnorm(5, 6.5, 1), 
        origin = rep(factor("df2", levels = c("df1", "df2")), 5)) 

# merge dataframes and gather in long format 
pdata <- bind_rows(df1, df2) %>% 
    gather(id, variable, -origin) 

# plot data 
ggplot(pdata, aes(x = id, y = variable, fill = origin, colour = origin)) + 
    stat_summary(fun = mean, geom = "point", position = position_dodge(width = .5), 
       size = 30, shape = "-", show.legend = FALSE, alpha = .7) + # plot mean as "-" 
    geom_jitter(position = position_jitterdodge(jitter.width = .3, jitter.height = .1, 
               dodge.width = .5), 
       size = 4, alpha = .85) + 
    labs(x = "Variable", y = NULL) + # set axis labels 
    theme_light() # nicer theme 

[Figure: jitter plot]

"Large" dataset

If you have more data points, you can use geom_violin to summarize them.

set.seed(12345) 
df1 <- data.frame(v1 = rnorm(50, 4, 1), 
        v2 = rnorm(50, 5, 1), 
        v3 = rnorm(50, 6, 1), 
        origin = rep(factor("df1", levels = c("df1", "df2")), 50)) 

df2 <- data.frame(v1 = rnorm(50, 4.5, 1), 
        v2 = rnorm(50, 5.5, 1), 
        v3 = rnorm(50, 6.5, 1), 
        origin = rep(factor("df2", levels = c("df1", "df2")), 50)) 

# merge dataframes 
pdata <- bind_rows(df1, df2) %>% 
    gather(id, variable, -origin) 

# plot with violin plot 
ggplot(pdata, aes(x = id, y = variable, fill = origin)) + 
    geom_violin(adjust = .6) + 
    stat_summary(fun = mean, geom = "point", position = position_dodge(width = .9), 
       size = 6, shape = 4, show.legend = FALSE) + 
    guides(fill = guide_legend(override.aes = list(colour = NULL))) + 
    labs(x = "Variable", y = NULL) + 
    theme_light() 

[Figure: violin plot]

Version with means and standard deviation (SD)

If you insist, here is how it can be done.

# merge dataframes and compute limits for sd 
pdata <- bind_rows(df1, df2) %>% 
    gather(id, variable, -origin) %>% 
    group_by(origin, id) %>%   # group data for limit calculation 
    mutate(upper = mean(variable) + sd(variable), # upper limit for error bar 
     lower = mean(variable) - sd(variable)) # lower limit for error bar 

# plot 
ggplot(pdata, aes(x = id, y = variable, fill = origin)) + 
    stat_summary(fun = mean, geom = "bar", position = position_dodge(width = .9), 
       size = 3) + 
    geom_errorbar(aes(ymin = lower, ymax = upper), 
       width = .2,     # Width of the error bars 
       position = position_dodge(.9)) 

[Figure: bar plot]
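
The question also asks about base R and about colouring by element number. A minimal sketch of that variant, assuming two numeric matrices with one column per element, could use barplot() with beside = TRUE and arrows() for the SD bars. The matrices m1 and m2 and their column names are made-up example data, not the asker's bardata.

```r
# Hypothetical example data: two numeric matrices, three elements (columns) each
set.seed(42)
elems <- c("Element1", "Element2", "Element3")
m1 <- matrix(rnorm(15, mean = rep(c(10, 20, 12), each = 5)), ncol = 3,
             dimnames = list(NULL, elems))
m2 <- matrix(rnorm(15, mean = rep(c(11, 18, 14), each = 5)), ncol = 3,
             dimnames = list(NULL, elems))

# One row per dataset, one column per element
means <- rbind(Dataset1 = colMeans(m1), Dataset2 = colMeans(m2))
sds   <- rbind(Dataset1 = apply(m1, 2, sd), Dataset2 = apply(m2, 2, sd))

# One colour per element, repeated for the two datasets within each group
elem_cols <- c("red", "blue", "green")
bar_cols  <- rep(elem_cols, each = nrow(means))

# beside = TRUE places Dataset1 and Dataset2 next to each other per element;
# barplot() returns the bar midpoints as a matrix shaped like `means`
mids <- barplot(means, beside = TRUE, col = bar_cols,
                ylim = c(0, max(means + sds) * 1.1),
                main = "Mean Coverage", ylab = "mean Magnitude")

# Draw mean +/- SD as flat-headed arrows on top of each bar
arrows(x0 = mids, y0 = means - sds, x1 = mids, y1 = means + sds,
       angle = 90, code = 3, length = 0.05)

legend("topright", legend = colnames(means), fill = elem_cols)
```

Because the midpoints come back in the same shape as the means, the error bars line up with their bars without any manual positioning. Within each group the left bar is Dataset1 and the right bar is Dataset2.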