2014-07-08 67 views
2

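ggplot/geom_segment: order y-axis labels (factor) by the value of a variable

The plot below is a "pseudo" Gantt chart that shows the duration of certain wars (x-axis) together with their casualty counts. I would be grateful for help with the two problems I am facing.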

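1) I would like to order the y-axis labels (variable/factor WarName) by the start date (war.start) of each war (geom_segment) within each country (facet WarLocationCountry). The war that started earliest should be at the top of the y-axis; for Sudan, for example, the order should be: First South Sudan, Second South Sudan, The SPLA Division, Darfur.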

I assume it has something to do with scale_y_discrete(rev(levels(CoW.tmp$WarLocationCountry))), but I could not figure out how to make it depend on CoW.tmp$war.start.

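2) geom_text adds the estimated number of casualties (sum.deaths; numeric) next to the geom_segments; these estimates include several NAs / missing values. Whenever I keep them as NA, I get the error message: Error: 'x' and 'units' must have length > 0. I thought that adding na.rm = TRUE to the geom_text part would solve this, but unfortunately it does not.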

Missing data are currently coded as 0. Running CoW.tmp$sum.deaths[CoW.tmp$sum.deaths == 0] <- NA leads to the error above when the ggplot code is run.

Sorry for not framing the question in a more general way. Any hints are much appreciated.

[Image: the pseudo-Gantt chart of the wars, one facet per country]

Code for the plot:

CoW.plot <- ggplot(CoW.tmp) + 
    geom_segment(aes(color=WarType, x=war.start, xend=war.end, y=WarName, yend=WarName), size=1) + 
    geom_point(aes(shape=Outcome2, color=WarType, x=war.end,y=WarName), size=3)+ 
    geom_point(aes(shape=WarType, color=WarType, x=war.start,y=WarName), size=3)+ 
    theme(plot.title=element_text(face="bold"), 
     legend.position="bottom", 
     legend.title=element_text(size=7), 
     legend.text=element_text(size=5), 
     legend.box="horizontal", 
     axis.title.x = element_blank(), 
     axis.text.x = element_text(size=5), 
     axis.title.y = element_blank(), 
     axis.text.y = element_text(size=5, face="bold"))+ 
    scale_color_discrete(name="War Type:", 
         breaks=c("4","5","6","7"), 
         labels=c("central control","local issues","regional internal","intercommunal"))+ 
    scale_shape_manual(values=c(1,3,4,5,6,7), name="Outcome:", 
         breaks=c("1","3","4","5","6","7"), 
         labels=c("victory", "compromise","transformed type of war","ongoing","stalemate","continues below war threshold"))+ 
    geom_text(aes(x=as.Date(conflict.end+1500), y=WarName, label=sum.deaths), size=2, na.rm=TRUE)+ 
    scale_x_date(limits = c(as.Date("1946-01-01"), as.Date("2010-01-01")))+ 
    ggtitle(paste("INTRA-STATE CONFLICTS (CoW)",a,"\n"))+ 
    facet_wrap(~WarLocationCountry, scales="free_y", ncol=1) 

Data:

CoW.tmp<-structure(list(conflict.end = structure(c(788, -2178, -1310, 
3648, 5921, 6569, 12793, 12793, 6496, 8881, 7695, 9609, 8354, 
9876, 9876, 9876, 9876, 9876, 9876, 9876, 11271, 11271, 11271, 
11271, 11271, 11271, 11271, 11271, 11271, 13493, 14041, 14041, 
14041, 14041), class = "Date"), WarType = structure(c(2L, 1L, 
2L, 1L, 2L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L 
), .Label = c("4", "5", "7"), class = "factor"), war.start = structure(c(-2284, 
-2181, -1319, 1092, 3994, 4762, 5068, 8140, 6070, 6562, 6720, 
7751, 7909, 8382, 7988, 8382, 8382, 8382, 8382, 8382, 10263, 
10263, 10263, 10263, 11085, 11088, 11088, 11088, 11088, 12109, 
13520, 13213, 13430, 13440), class = "Date"), war.end = structure(c(788, 
-2178, -1310, 3648, 5921, 6569, 7908, 12793, 6496, 8881, 7695, 
9609, 8354, 9190, 9876, 9190, 9190, 9190, 8849, 9190, 10779, 
10779, 10779, 10779, 11271, 11271, 11271, 11271, 11271, 13493, 
13667, 14031, 14041, 14041), class = "Date"), WarName = c("First South Sudan", 
"Zanzibar Arab-African", "First Uganda", "Rhodesia", "Second Uganda", 
"Matabeleland", "Second South Sudan", "Second South Sudan", "Holy Spirit Movement", 
"Inkatha-ANC", "First Somalia", "First Sierra Leone", "The SPLA Division (Dinka-Nuer) War", 
"Second Somalia", "Second Somalia", "Second Somalia", "Second Somalia", 
"Second Somalia", "Second Somalia", "Second Somalia", "Second Sierra Leone", 
"Second Sierra Leone", "Second Sierra Leone", "Second Sierra Leone", 
"Second Sierra Leone", "Second Sierra Leone", "Second Sierra Leone", 
"Second Sierra Leone", "Second Sierra Leone", "Darfur", "Third Somalia", 
"Third Somalia", "Third Somalia", "Third Somalia"), Outcome2 = structure(c(3L, 
1L, 1L, 1L, 1L, 1L, 3L, 3L, 1L, 6L, 1L, 1L, 7L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 6L, 1L, 1L, 1L, 
1L), .Label = c("1", "2", "3", "4", "5", "6", "7"), class = "factor"), 
    sum.deaths = c("0", "0", "0", "11000", "46000", "0", "0", 
    "0", "7000", "0", "0", "0", "0", "70", "70", "70", "70", 
    "70", "70", "70", "0", "0", "0", "0", "0", "0", "0", "0", 
    "0", "0", "0", "0", "0", "0"), WarLocationCountry = structure(c(4L, 
    6L, 5L, 7L, 5L, 7L, 4L, 4L, 5L, 3L, 2L, 1L, 4L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 2L, 
    2L, 2L, 2L), .Label = c("Sierra Leone", "Somalia", "South Africa", 
    "Sudan", "Uganda", "Zanzibar", "Zimbabwe"), class = "factor")), .Names = c("conflict.end", 
"WarType", "war.start", "war.end", "WarName", "Outcome2", "sum.deaths", 
"WarLocationCountry"), class = "data.frame", row.names = c(34L, 
39L, 44L, 67L, 114L, 120L, 127L, 128L, 134L, 136L, 138L, 152L, 
155L, 157L, 158L, 159L, 160L, 161L, 162L, 163L, 197L, 198L, 199L, 
200L, 201L, 202L, 203L, 204L, 205L, 237L, 246L, 247L, 248L, 249L 
)) 
+1

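For ordering the levels (your (1)), don't try to do it inside 'ggplot'. It just looks at the order of the factor 'levels'. I think 'reorder()' is the easiest way to change them based on another variable. For more information, see '?reorder' or [this question](http://stackoverflow.com/q/2375587/903061). – Gregor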

+1

@zoowalk, you can check the examples [**here**](http://stackoverflow.com/questions/16622979/reorder-not-correctly-reordering-a-factor-variable-in-ggplot) and [**here**](http://stackoverflow.com/questions/18816024/how-to-show-bars-in-ggplot2-in-descending-order-of-a-numeric-vector/18816504#18816504), which use 'reorder' inside the 'aes' call. – Henrik

+0

+1 for including your dataset. Your example does not run, though: the variable 'a' in 'ggtitle(...)' is not defined. – jlhoward

Answers

3

Like this?

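library(ggplot2) 
# Sort rows by country, then by start date descending (a Date cannot be negated directly)
CoW.tmp <- with(CoW.tmp,CoW.tmp[order(WarLocationCountry,-as.integer(war.start)),]) 
# Fix the levels of WarName in the row order produced above
CoW.tmp$WarName <- with(CoW.tmp,factor(WarName,levels=unique(WarName))) 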
ggplot(CoW.tmp) + 
    geom_segment(aes(color=WarType, x=war.start, xend=war.end, y=WarName, yend=WarName), size=1) + 
    geom_point(aes(shape=Outcome2, color=WarType, x=war.end,y=WarName), size=3)+ 
    geom_point(aes(shape=WarType, color=WarType, x=war.start,y=WarName), size=3)+ 
    theme(plot.title=element_text(face="bold"), 
     legend.position="bottom", 
     legend.title=element_text(size=7), 
     legend.text=element_text(size=5), 
     legend.box="vertical", 
     axis.title.x = element_blank(), 
     axis.text.x = element_text(size=10), 
     axis.title.y = element_blank(), 
     axis.text.y = element_text(size=10, face="bold"))+ 
    scale_color_discrete(name="War Type:", 
         breaks=c("4","5","6","7"), 
         labels=c("central control","local issues","regional internal","intercommunal"))+ 
    scale_shape_manual(values=c(1,3,4,5,6,7), name="Outcome:", 
        breaks=c("1","3","4","5","6","7"), 
        labels=c("victory", "compromise","transformed type of war","ongoing","stalemate","continues below war threshold"))+ 
    geom_text(aes(x=as.Date(conflict.end+1500), y=WarName, label=ifelse(sum.deaths!=0,sum.deaths,"")), size=3, na.rm=TRUE)+ 
    scale_x_date(limits = c(as.Date("1946-01-01"), as.Date("2010-01-01")))+ 
    ggtitle(paste("INTRA-STATE CONFLICTS (CoW)","","\n"))+ 
    facet_wrap(~WarLocationCountry, scales="free_y", ncol=1) 

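Your first problem, the order of the y-axis, is a bit more subtle than the comments suggest. You need the wars in reverse order of start date, within each country. I think the simplest way is to reorder the whole data frame CoW.tmp by country and start date, and then reset the levels of the WarName factor to that row order (the two statements right after library(ggplot2)). In the call to order(...) you cannot use -war.start, because unary minus is not defined for dates, so we have to use -as.integer(war.start). as.integer(...) returns the number of days since 1970-01-01 as an integer, which we can negate.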
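To see the two reordering steps in isolation, here is a minimal sketch on made-up data. The data frame `wars` and its columns `country`, `war` and `start` are hypothetical stand-ins for CoW.tmp, WarLocationCountry, WarName and war.start.

```r
# Toy data (hypothetical values, not the CoW data): three wars in one country.
wars <- data.frame(
  country = c("A", "A", "A"),
  war     = c("Late war", "Early war", "Middle war"),
  start   = as.Date(c("2000-01-01", "1960-01-01", "1980-01-01")),
  stringsAsFactors = FALSE
)

# Sort by country, then by start date descending. as.integer() turns each
# Date into days since 1970-01-01, and that integer can be negated.
wars <- wars[order(wars$country, -as.integer(wars$start)), ]

# Freeze the current row order as the order of the factor levels.
wars$war <- factor(wars$war, levels = unique(wars$war))

levels(wars$war)
```

On a discrete y-axis the first level is drawn at the bottom, so with the levels in descending start order the earliest war should end up at the top.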

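Even this is only a partial solution. Your dataset seems to contain multiple records for the same war (Second Somalia appears several times, as do several others). As a result, the SPLA Division war in Sudan starts after the first Second South Sudan record but before the second one. That is why the y-axis order is not correct in that case.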
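One possible way around the repeated records, which is not part of the code above, is to order each war by its earliest start date across all of its records. The sketch below uses made-up data; `wars`, `key` and `first.start` are hypothetical names.

```r
# Toy data (hypothetical values): "Second war" has two records, and
# "Splinter war" starts between them.
wars <- data.frame(
  country = c("A", "A", "A", "A"),
  war     = c("Second war", "Splinter war", "Second war", "First war"),
  start   = as.Date(c("1984-01-01", "1991-01-01", "1992-01-01", "1963-01-01")),
  stringsAsFactors = FALSE
)

# One grouping key per (country, war) pair.
key <- paste(wars$country, wars$war)

# Earliest start (days since 1970-01-01) per war, repeated on each of its rows.
wars$first.start <- ave(as.integer(wars$start), key, FUN = min)

# Order by country, then by that earliest start, descending.
wars <- wars[order(wars$country, -wars$first.start), ]
wars$war <- factor(wars$war, levels = unique(wars$war))

levels(wars$war)
```

With this ordering every record of a war shares one sort key, so a later record can no longer pull the war out of its place.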

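Your second problem, the labels, is dealt with as in my comment above: in the geom_text call the label is ifelse(sum.deaths!=0,sum.deaths,""), so rows coded as 0 get an empty label instead of an NA.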
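The label logic can be tried on its own, outside the plot. This is only a sketch on a made-up vector; `deaths` and `labels` are hypothetical names, and the values are strings because sum.deaths is stored as character in the posted data.

```r
# Hypothetical casualty estimates; "0" (or NA) stands for a missing estimate.
deaths <- c("0", "11000", "70", NA)

# Keep real estimates, turn missing ones into empty strings so that
# geom_text has something to draw for every row.
labels <- ifelse(!is.na(deaths) & deaths != "0", deaths, "")

labels
```

The extra is.na() check is an addition for the case where some values have already been recoded to NA.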

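Note that I changed the font sizes and set legend.box="vertical" only so that the image displays well on SO. If you export to pdf or another format, you will want to change these back.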

+0

Excellent. Thank you very much for all the effort. – zoowalker
