2014-07-17

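Bubble Plot - R

I want to make a bubble plot of individuals for two stages. The end goal is to see whether the genotypes get the same score in both stages, so I want Stage_1 on the x axis and Stage_2 on the y axis.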

I really like this tutorial, but I don't know how to do it in the circles

 Geno Stage_1 Stage_2 
Individual_1  9  8.1 
Individual_2  3.1  1 
Individual_3  4.1  2 
Individual_4  9  6.1 
Individual_5  2.9  1 
Individual_6  4.1  1.4 
Individual_7  4.4  1.5 
Individual_8  3  1 
Individual_9  3.1  1.3 
Individual_10  4.1  1.8 
Individual_11  8.3  4 
Individual_12  8.6  5.5 
Individual_13  9  5.3 
Individual_14  9  4.3 
Individual_15  7  2 
Individual_16  9  5.8 
Individual_17  9  6.4 
Individual_18  5.4  1.1 
Individual_19  5.8  2.3 
Individual_20  5.3  1.5 
Individual_21  9  6.8 
Individual_22  8  3.3 
Individual_23  8.1  7.6 
I don't see how a bubble plot would help your analysis; a simple bar chart of Stage1 and Stage2 would be more informative – OdeToMyFiddle

Answer

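@Osssan is spot on. You want to compare the stages across different elements (that is, you are comparing values across multiple categories), and you don't have the three dimensions a proper bubble plot requires, so this would be an inappropriate use of a bubble plot. To illustrate: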

library(ggplot2)

# NOTE: dput(VARIABLE) is a much better way to post data into SO posts:

dat <- structure(list(Geno = structure(c(1L, 12L, 17L, 18L, 19L, 20L, 
        21L, 22L, 23L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 13L, 
        14L, 15L, 16L), .Label = c("Individual_1", "Individual_10", "Individual_11", 
        "Individual_12", "Individual_13", "Individual_14", "Individual_15", 
        "Individual_16", "Individual_17", "Individual_18", "Individual_19", 
        "Individual_2", "Individual_20", "Individual_21", "Individual_22", 
        "Individual_23", "Individual_3", "Individual_4", "Individual_5", 
        "Individual_6", "Individual_7", "Individual_8", "Individual_9" 
       ), class = "factor"), Stage_1 = c(9, 3.1, 4.1, 9, 2.9, 4.1, 4.4, 
        3, 3.1, 4.1, 8.3, 8.6, 9, 9, 7, 9, 9, 5.4, 5.8, 5.3, 9, 8, 8.1 
       ), Stage_2 = c(8.1, 1, 2, 6.1, 1, 1.4, 1.5, 1, 1.3, 1.8, 4, 5.5, 
        5.3, 4.3, 2, 5.8, 6.4, 1.1, 2.3, 1.5, 6.8, 3.3, 7.6)), .Names = c("Geno", 
        "Stage_1", "Stage_2"), class = "data.frame", row.names = c(NA, -23L)) 

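# get difference between stages (Stage_2 minus Stage_1)

dat$diff <- dat$Stage_2 - dat$Stage_1

# simple barplot, ordered by the difference
# (refer to columns by bare name inside aes(); dat$diff there bypasses ggplot's data handling)

gg <- ggplot(dat, aes(x=reorder(Geno, diff), y=diff))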
gg <- gg + geom_bar(stat="identity", width=0.25, fill="steelblue") 
gg <- gg + labs(x="", y="Genotype Stage 1/2 Diff", title="Genotype Stage Comparison") 
gg <- gg + coord_flip() 
gg <- gg + theme_bw() 
gg <- gg + theme(panel.border=element_blank()) 
gg <- gg + theme(panel.grid=element_blank()) 
gg 

# bubble plot 

dat$label <- gsub("Individual_", "", dat$Geno) 

gg <- ggplot(dat, aes(x=Stage_1, y=Stage_2)) 
gg <- gg + geom_point(aes(size=diff, color=Geno)) 
gg <- gg + geom_text(aes(label=label), size=4, hjust=1.5) 
gg <- gg + theme_bw() 
gg <- gg + theme(legend.position="none") 
gg 

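It should be fairly obvious that the bar chart shows which genotypes have the smallest difference between stages far more intuitively than the bubble plot does. One could try to scale the bubbles better, but it would still be harder to discern and compare the values, and it is not a good use of this chart type.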
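The same ranking can also be read off numerically instead of visually. The following is only a minimal sketch: it rebuilds the question's data so that it runs on its own, and `abs_diff` and `ranked` are names introduced here for illustration, not part of the code above.

```r
# Rebuild the question's data so the snippet is self-contained
dat <- data.frame(
  Geno = paste0("Individual_", 1:23),
  Stage_1 = c(9, 3.1, 4.1, 9, 2.9, 4.1, 4.4, 3, 3.1, 4.1, 8.3, 8.6,
              9, 9, 7, 9, 9, 5.4, 5.8, 5.3, 9, 8, 8.1),
  Stage_2 = c(8.1, 1, 2, 6.1, 1, 1.4, 1.5, 1, 1.3, 1.8, 4, 5.5,
              5.3, 4.3, 2, 5.8, 6.4, 1.1, 2.3, 1.5, 6.8, 3.3, 7.6),
  stringsAsFactors = FALSE
)

# Signed difference, as in the bar plot, plus its magnitude (illustrative column)
dat$diff <- dat$Stage_2 - dat$Stage_1
dat$abs_diff <- abs(dat$diff)

# Sort so the individuals whose two scores agree most closely come first
ranked <- dat[order(dat$abs_diff), ]
head(ranked, 5)
```

Sorting by the absolute difference puts the genotypes that scored most alike in the two stages at the top, which is the same information the ordered bar chart conveys.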

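Thanks for your reply. I only showed 23 individuals here, but in the real data I have more than 300. Is there a way to read them all in as a list and assign them one by one? I would like code that can take all the individuals, say "n" of them, and then do the analysis. Also, the genotype names may be something other than Individual_1: they might be plain characters (jelly, fish, etc.) or other character/number combinations (JF-JxF-001, etc.). The scores in both stages likewise range from 1-9, or even 1-8. Thanks – user3459293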
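Nothing in the answer's code depends on the number of rows or on the form of the names, so the same steps should carry over to a larger data set. Below is a minimal sketch, assuming the data are whitespace-separated text with the same three columns. The inline `raw` text is a stand-in for a real file, the names are the examples from the comment, and the scores are just the first three rows of the question's data reused as placeholders.

```r
# Inline text stands in for a real file (placeholder rows, for illustration only).
# With a real file, one would pass its path to read.table() instead of text = raw.
raw <- "Geno Stage_1 Stage_2
jelly 9 8.1
fish 3.1 1
JF-JxF-001 4.1 2"

# Read every row, however many there are; names are kept as plain strings
dat <- read.table(text = raw, header = TRUE, stringsAsFactors = FALSE)

# Number of individuals, whatever they are called
n <- nrow(dat)

# Same derived columns as in the answer; the label is the name itself,
# since there may be no "Individual_" prefix to strip
dat$diff <- dat$Stage_2 - dat$Stage_1
dat$label <- dat$Geno
```

The `ggplot` calls from the answer can then be applied to `dat` unchanged. With 300 or more rows the bar labels will likely get crowded, so plotting a subset (for example the largest and smallest differences) may be easier to read.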