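I wonder how one can add another important and much-needed layer of complexity to a matrix correlation heatmap: for example the p-value, in the manner of significance-level stars, in addition to the correlation coefficient r (-1 to 1)?
This question is NOT about printing significance stars or p-values as text on each square of the matrix. It is about showing the significance level graphically on each square. I think only those who enjoy innovative thinking can win the applause for unravelling this kind of solution, so that we have the best way to represent that added component of complexity in our "half-truth matrix correlation heatmaps". I googled a lot but have never seen a proper, or shall I say "eye-friendly", way to represent the significance level together with the standard colour shades that reflect the correlation coefficient.
The reproducible data set is found here: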
http://learnr.wordpress.com/2010/01/26/ggplot2-quick-heatmap-plotting/
The R code is below:
library(ggplot2)
library(plyr) # might be not needed here anyway it is a must-have package I think in R
library(reshape2) # to "melt" your dataset
library (scales) # it has a "rescale" function which is needed in heatmaps
library(RColorBrewer) # for convenience of heatmap colors, it reflects your mood sometimes
nba <- read.csv("http://datasets.flowingdata.com/ppg2008.csv")
nba <- as.data.frame(cor(nba[2:ncol(nba)])) # convert the correlation matrix to a data frame
nba.m <- data.frame(row=rownames(nba),nba) # create a column called "row"
rownames(nba.m) <- NULL # get rid of row names
nba.m <- melt(nba.m, id.vars="row") # long format with columns: row, variable, value
# put the correlations into categories with "cut", with labels for the legend;
# the "value" column is now discrete rather than continuous
nba.m$value <- cut(nba.m$value,breaks=c(-1,-0.75,-0.5,-0.25,0,0.25,0.5,0.75,1),include.lowest=TRUE,
  labels=c("(-0.75,-1)","(-0.5,-0.75)","(-0.25,-0.5)","(0,-0.25)","(0,0.25)","(0.25,0.5)","(0.5,0.75)","(0.75,1)"))
# reorder the "row" column (used as the x axis in the plot) by converting it to a factor with explicit levels
nba.m$row <- factor(nba.m$row, levels=rev(unique(as.character(nba.m$variable))))
# now plotting
# The RdYlGn palette comes from the RColorBrewer package. Why this palette? Look at a battery
# charge indicator, e.g. on your phone or shaver: red when low, back to green when charged.
ggplot(nba.m, aes(row, variable)) +
  geom_tile(aes(fill=value), colour="black") +
  scale_fill_brewer(palette="RdYlGn", name="Correlation")
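The matrix correlation heatmap should look like this:
Hints and ideas to enhance the solution:
- This code, taken from the following website, may give an idea about the significance-level stars: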
http://ohiodata.blogspot.de/2012/06/correlation-tables-in-r-flagged-with.html
The R code there:
mystars <- ifelse(p < .001, "***", ifelse(p < .01, "** ", ifelse(p < .05, "* ", " "))) # so 4 categories
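- The significance level could be mapped to colour intensity on each square, like an alpha aesthetic, but I don't think that would be easy to read and interpret
- Another idea is to have 4 different square sizes corresponding to the stars: the smallest for non-significant correlations, growing to a full-size square for the highest number of stars
- Another idea is to draw a circle inside the significant squares, with the circle's line thickness corresponding to the significance level (the 3 remaining categories), all in one colour
- Same as above, but with a fixed line thickness and 3 different colours for the 3 remaining significance levels
- Maybe you can come up with better ideas, who knows?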
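As a rough illustration of the square-size idea from the list above, the sketch below shrinks each tile according to its star category. It is only one possible reading of that idea, not a definitive answer. Several things here are assumptions and not part of the question: the built-in mtcars data stands in for the NBA data so that nothing has to be downloaded, the names cells and tile_size are hypothetical, the four tile sizes are arbitrary, and a continuous red-white-green scale replaces the binned palette.

```r
library(ggplot2)

# Assumption: mtcars (ships with R) stands in for the NBA data so the
# sketch runs without a download.
dat  <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]
vars <- names(dat)

# One row per ordered pair of variables.
cells <- expand.grid(row = vars, variable = vars, stringsAsFactors = FALSE)

# Pearson r for every pair; p-value from stats::cor.test() off the diagonal.
cells$r <- mapply(function(a, b) cor(dat[[a]], dat[[b]]),
                  cells$row, cells$variable, USE.NAMES = FALSE)
cells$p <- mapply(function(a, b) {
  if (a == b) NA_real_ else cor.test(dat[[a]], dat[[b]])$p.value
}, cells$row, cells$variable, USE.NAMES = FALSE)

# Same four categories as the star snippet quoted in the question
# ("ns" replaces the blank label so it can be used as a lookup key).
cells$stars <- ifelse(cells$p < .001, "***",
               ifelse(cells$p < .01,  "**",
               ifelse(cells$p < .05,  "*", "ns")))

# Hypothetical size mapping: smallest tile for non-significant pairs,
# full-size tile for the highest significance category.
tile_size <- c("ns" = 0.4, "*" = 0.6, "**" = 0.8, "***" = 1)
cells$size <- unname(tile_size[cells$stars])
cells$size[is.na(cells$size)] <- 1  # diagonal cells are drawn full size

# Fix the axis order so the diagonal runs from top-left to bottom-right.
cells$row      <- factor(cells$row, levels = vars)
cells$variable <- factor(cells$variable, levels = rev(vars))

heat <- ggplot(cells, aes(row, variable)) +
  geom_tile(aes(fill = r, width = size, height = size), colour = "black") +
  scale_fill_gradient2(low = "red", mid = "white", high = "darkgreen",
                       midpoint = 0, limits = c(-1, 1), name = "Correlation") +
  theme_minimal()
print(heat)
```

Mapping significance to tile size keeps the fill colour free for the coefficient itself, so the two quantities do not compete for the same visual channel the way an alpha mapping would.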
Your code inspired me to rewrite the 'arm::corrplot' function with ggplot2: http://rpubs.com/briatte/ggcorr
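It is wonderful! Could you extend this function to make the non-significant correlations (e.g. p ≥ 0.05) vanish while keeping the significant ones? For that, one should feed the function another matrix, this time of p-values. I am sharing this function with you; it may help in obtaining that matrix (you can use cor.prob.all()): cor.prob.all <- function(X, dfr = nrow(X) - 2) { R <- cor(X, use = "pairwise.complete.obs", method = "spearman"); r2 <- R^2; Fstat <- r2 * dfr / (1 - r2); R <- 1 - pf(Fstat, 1, dfr); R[row(R) == col(R)] <- NA; R } – doctorate
I am skeptical about using p-values here (and elsewhere), but I will try to figure out something to mark the insignificant coefficients.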
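The request in the comments to hide non-significant coefficients could be sketched roughly as follows. This is an illustration, not the ggcorr implementation. It reuses the cor.prob.all() helper from the comment (with its result variable renamed to P for readability) and blanks every cell whose p-value is 0.05 or above. Assumptions not found in the thread: mtcars stands in for the NBA data, the 0.05 cutoff is the conventional one, and the names long and r_shown are hypothetical.

```r
library(ggplot2)

# p-value helper as shared in the comment above (Spearman correlation,
# p-values from the F distribution); NA on the diagonal.
cor.prob.all <- function(X, dfr = nrow(X) - 2) {
  R <- cor(X, use = "pairwise.complete.obs", method = "spearman")
  r2 <- R^2
  Fstat <- r2 * dfr / (1 - r2)
  P <- 1 - pf(Fstat, 1, dfr)
  P[row(P) == col(P)] <- NA
  P
}

# Assumption: mtcars (ships with R) stands in for the NBA data.
dat <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]
R <- cor(dat, method = "spearman")
P <- cor.prob.all(dat)

# Long format: one row per cell. as.table() and as.vector() both walk the
# matrix column by column, so r and p stay aligned.
long <- as.data.frame(as.table(R), responseName = "r")
long$p <- as.vector(P)

# Blank out coefficients whose p-value is at or above the cutoff;
# diagonal cells (p is NA) are kept.
long$r_shown <- ifelse(!is.na(long$p) & long$p >= 0.05, NA, long$r)

masked <- ggplot(long, aes(Var1, Var2)) +
  geom_tile(aes(fill = r_shown), colour = "black") +
  scale_fill_gradient2(low = "red", mid = "white", high = "darkgreen",
                       midpoint = 0, limits = c(-1, 1),
                       na.value = "grey90", name = "Correlation") +
  theme_minimal()
print(masked)
```

Masked cells are drawn in light grey instead of being removed, so the grid stays complete and it remains clear which pairs were tested and found non-significant.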