2013-06-20

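Unsupervised SOM visualization

I need to plot the results of an unsupervised rectangular SOM model. Additional requirements: 1) each node should be drawn as a pie chart of the observed classes it contains; 2) the size of each chart should reflect the number of samples in the node. The default plot.kohonen is not suitable for this.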

Answer

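Here is one possible solution. The first function, som.prep.df, is a helper called by the second one, som.draw. som.draw needs only two arguments, the SOM model and the observed classes of the training set, plus an optional scaled flag.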

som.prep.df <- function(som.model, obs.classes, scaled) { 
    require(reshape2) 
    lev <- factor(obs.classes)  # use the argument, not the global wine.classes
    df <- data.frame(cbind(unit=som.model$unit.classif, class=as.integer(lev))) 
    # create table 
    df2 <- data.frame(table(df)) 
    df2 <- dcast(df2, unit ~ class, value.var="Freq") 
    df2$unit <- as.integer(df2$unit) 
    # calc sum 
    df2$sum <- rowSums(df2[,-1]) 
    # calc fraction borders of classes in each node 
    tmp <- data.frame(cbind(X0=rep(0,nrow(df2)), 
          t(apply(df2[,-1], 1, function(x) { 
          cumsum(x[1:(length(x)-1)])/x[length(x)] 
          })))) 
    df2 <- cbind(df2, tmp) 
    # class columns are named by their integer codes; \\d+ also matches codes >= 10
    df2 <- melt(df2, id.vars=which(!grepl("^\\d+$", colnames(df2)))) 
    df2 <- df2[,-ncol(df2)] 
    # define the borders of each class in each node 
    tmp <- t(apply(df2, 1, function(x) { 
    c(x[paste0("X", as.character(as.integer(x["variable"])-1))], 
     x[paste0("X", as.character(x["variable"]))]) 
    })) 
    tmp <- data.frame(tmp, stringsAsFactors=FALSE) 
    tmp <- sapply(tmp, as.numeric) 
    colnames(tmp) <- c("ymin", "ymax") 
    df2 <- cbind(df2, tmp) 
    # scale size of pie charts 
    if (is.logical(scaled)) { 
    if (scaled) { 
     df2$xmax <- log2(df2$sum) 
    } else { 
     df2$xmax <- df2$sum 
    } 
    } 
    df2 <- df2[,c("unit", "variable", "ymin", "ymax", "xmax")] 
    colnames(df2) <- c("unit", "class", "ymin", "ymax", "xmax") 
    # replace classes with original levels names 
    df2$class <- levels(lev)[df2$class] 
    return(df2) 
} 

som.draw <- function(som.model, obs.classes, scaled=FALSE) { 
    # scaled: if TRUE, the size of each node is log2-scaled 
    require(ggplot2) 
    require(grid) 
    g <- som.model$grid 
    df <- som.prep.df(som.model, obs.classes, scaled) 
    df <- cbind(g$pts, df[,-1]) 
    df$class <- factor(df$class) 
    g <- ggplot(df, aes(fill=class, ymax=ymax, ymin=ymin, xmax=xmax, xmin=0)) + 
    geom_rect() + 
    coord_polar(theta="y") + 
    facet_wrap(x~y, ncol=g$xdim, nrow=g$ydim) + 
    theme(axis.ticks = element_blank(), 
      axis.text.y = element_blank(), 
      axis.text.x = element_blank(), 
      panel.spacing = unit(0, "cm"),  # called panel.margin in ggplot2 < 2.2.0 
      strip.background = element_blank(), 
      strip.text = element_blank(), 
      plot.margin = unit(c(0,0,0,0), "cm"), 
      panel.background = element_blank(), 
      panel.grid = element_blank()) 
    return(g) 
} 
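
The core of som.prep.df is the step that turns per-node class counts into slice borders (ymin, ymax) on a 0 to 1 scale, which coord_polar(theta="y") then bends into pie slices. Below is a minimal base-R sketch of that idea on made-up counts. The names counts, frac and slices are hypothetical and are not used in the functions above, so treat it as an illustration rather than a drop-in replacement.

```r
# Toy counts: 2 nodes (rows) x 3 classes (columns); the values are made up.
counts <- matrix(c(3, 1, 0,
                   0, 2, 2),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(unit = c("1", "2"), class = c("a", "b", "c")))

# Share of each class within its node (every row sums to 1).
frac <- counts / rowSums(counts)

# Upper border of each slice: cumulative share along the row.
ymax <- t(apply(frac, 1, cumsum))
# Lower border: upper border minus the slice's own share.
ymin <- ymax - frac

# Long format, one row per (node, class) pair, as geom_rect() expects.
slices <- data.frame(
  unit  = rep(rownames(counts), times = ncol(counts)),
  class = rep(colnames(counts), each = nrow(counts)),
  ymin  = as.vector(ymin),
  ymax  = as.vector(ymax),
  n     = rep(unname(rowSums(counts)), times = ncol(counts))
)
print(slices)
```

A class with zero samples in a node gets ymin equal to ymax, so its slice has zero width and simply does not show up in that pie.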

Usage example:

require(kohonen) 
# loads wines and wine.classes; in kohonen >= 3.0 the labels are called vintages instead
data(wines) 
som.wines <- som(scale(wines), grid = somgrid(5, 5, "rectangular")) 

# Non-scaled map 
som.draw(som.wines, wine.classes) 


# Scaled map 
som.draw(som.wines, wine.classes, TRUE) 


This function can also be used to visualize supervised models. However, it only works for rectangular maps. I hope this helps someone.
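
One more caveat seems to follow from how the counts are built, although it is my reading of the code rather than something tested on a real map: table() only lists units that received at least one sample, so a map with empty nodes would yield fewer rows than the grid has points, and the cbind with the grid coordinates would no longer line up. A minimal sketch of one way around this is to count over an explicit factor that lists every unit. The names n_units, unit_classif and obs_classes are hypothetical stand-ins for nrow(som.model$grid$pts), som.model$unit.classif and the class labels.

```r
# Made-up example: a map with 6 units, of which units 3 and 5 are empty.
n_units      <- 6
unit_classif <- c(1, 1, 2, 4, 4, 4, 6)                 # winning unit per sample
obs_classes  <- c("a", "b", "a", "b", "b", "c", "a")   # class label per sample

# Listing all units as factor levels keeps the empty ones as zero rows.
counts <- table(
  unit  = factor(unit_classif, levels = seq_len(n_units)),
  class = factor(obs_classes)
)

node_sizes <- rowSums(counts)
print(counts)
print(node_sizes)
```

Empty units then have a row sum of zero, so any fraction computed from them is NaN and needs to be skipped or replaced before plotting.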

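There are several possible improvements:

  1. Choose a better scaling function than the logarithm, because nodes with a single sample currently become invisible after scaling (log2(1) = 0).
  2. Add a legend for the whole plot that reflects node size.
  3. Or add the number of samples in the node to each chart.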
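
For the first point, here is a minimal sketch of two alternative scalings that keep single-sample nodes visible. node_radius is a hypothetical helper, not part of the answer's code or of kohonen. The square root makes the pie area, rather than the radius, proportional to the sample count; log2(n + 1) keeps the logarithmic compression but maps one sample to a radius of 1 instead of 0.

```r
# Hypothetical helper: map node sample counts to pie radii.
node_radius <- function(n, method = c("sqrt", "log2p1")) {
  method <- match.arg(method)
  switch(method,
         sqrt   = sqrt(n),       # area proportional to n
         log2p1 = log2(n + 1))   # log scaling that stays positive for n = 1
}

sizes <- c(1, 4, 16)
log2(sizes)                    # current behaviour: 0 2 4, a single sample vanishes
node_radius(sizes)             # 1 2 4
node_radius(sizes, "log2p1")   # strictly positive for every non-empty node
```

Under these assumptions, one would replace log2(df2$sum) in som.prep.df with node_radius(df2$sum).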

P.S. The code is not very elegant, so any suggestions and improvements are welcome.