2017-04-24

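MatPlotlib Seaborn multiple plots formatting

I am translating a set of R visualizations to Python. This is the R multiple-plot histogram figure I am trying to reproduce: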

R Multiple Plot Histograms

Using Matplotlib and Seaborn together, and with the help of a kind StackOverflow member (see the link: Python Seaborn Distplot Y value corresponding to a given X value), I was able to create the following Python plot:

[image: Python plot]

I am happy with how it looks, except that I do not know how to add the header information to the plot. Here is the Python code that creates the charts:

""" Program to draw the sampling histogram distributions """ 
import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt 
from matplotlib.backends.backend_pdf import PdfPages 
import seaborn as sns 

def main(): 
    """ Main routine for the sampling histogram program """ 
    sns.set_style('whitegrid') 
    markers_list = ["s", "o", "*", "^", "+"] 
    # create the data dataframe as df_orig 
    df_orig = pd.read_csv('lab_samples.csv') 
    df_orig = df_orig.loc[df_orig.hra != -9999] 
    hra_list_unique = df_orig.hra.unique().tolist() 
    # create and subset df_hra_colors to match the actual hra colors in df_orig 
    df_hra_colors = pd.read_csv('hra_lookup.csv') 
    df_hra_colors['hex'] = np.vectorize(rgb_to_hex)(df_hra_colors['red'], df_hra_colors['green'], df_hra_colors['blue']) 
    df_hra_colors.drop(labels=['red', 'green', 'blue'], axis=1, inplace=True) 
    df_hra_colors = df_hra_colors.loc[df_hra_colors['hra'].isin(hra_list_unique)] 

    # hard coding the current_component to pc1 here, we will extend it by looping 
    # through the list of components 
    current_component = 'pc1' 
    num_tests = 5 
    df_columns = df_orig.columns.tolist() 
    start_index = 5 
    for test in range(num_tests):
        current_tests_list = df_columns[start_index:(start_index + num_tests)]
        # now create the sns distplots for each HRA color and overlay the tests
        i = 1
        for _, row in df_hra_colors.iterrows():
            plt.subplot(3, 3, i)
            select_columns = ['hra', current_component] + current_tests_list
            df_current_color = df_orig.loc[df_orig['hra'] == row['hra'], select_columns]
            y_data = df_current_color.loc[df_current_color[current_component] != -9999, current_component]
            axs = sns.distplot(y_data, color=row['hex'],
                               hist_kws={"ec": "k"},
                               kde_kws={"color": "k", "lw": 0.5})
            data_x, data_y = axs.lines[0].get_data()
            axs.text(0.0, 1.0, row['hra'], horizontalalignment="left", fontsize='x-small',
                     verticalalignment="top", transform=axs.transAxes)
            for current_test_index, current_test in enumerate(current_tests_list):
                # this_x defines the series of current_component(pc1,pc2,rhob) for this test
                # indicated by 1, corresponding R program calls this test_vector
                x_series = df_current_color.loc[df_current_color[current_test] == 1, current_component].tolist()
                for this_x in x_series:
                    this_y = np.interp(this_x, data_x, data_y)
                    axs.plot([this_x], [this_y - current_test_index * 0.05],
                             markers_list[current_test_index], markersize=3, color='black')
            axs.xaxis.label.set_visible(False)
            axs.xaxis.set_tick_params(labelsize=4)
            axs.yaxis.set_tick_params(labelsize=4)
            i = i + 1
        start_index = start_index + num_tests
    # plt.show() 
    pp = PdfPages('plots.pdf') 
    pp.savefig() 
    pp.close() 

def rgb_to_hex(red, green, blue): 
    """Return color as #rrggbb for the given color values.""" 
    return '#%02x%02x%02x' % (red, green, blue) 

if __name__ == "__main__": 
    main() 

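The Pandas code works fine and does what it is supposed to do. The bottleneck is my lack of knowledge and experience with 'PdfPages' in Matplotlib. How can I show the header information in Python/Matplotlib/Seaborn so that it appears in the corresponding visualization? By header information I mean what the R visualization shows at the top, above the histograms, i.e. 'pc1', MRP, XRD, ...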

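I can easily get these values from my program (for example, current_component is 'pc1'), but I do not know how to format the plot with the header. Can someone provide some guidance?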

Answer


You might be looking for a figure title, also called a super title, fig.suptitle:

fig.suptitle('this is the figure title', fontsize=12) 

In your case you can easily get the current figure with plt.gcf(), so try

plt.gcf().suptitle("pc1") 
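
As a minimal sketch of this idea, the snippet below draws a 3x3 grid of histograms and sets the component name as the figure-level title. The synthetic random data, the plain Matplotlib histograms (instead of Seaborn), and the Agg backend are assumptions made so the example runs on its own; they are not part of the question's program.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)   # synthetic stand-in for the lab samples
current_component = "pc1"        # same name the question's program uses

fig = plt.figure(figsize=(8, 8))
for i in range(1, 10):
    ax = plt.subplot(3, 3, i)
    ax.hist(rng.normal(size=200), bins=20, edgecolor="k")

# plt.gcf() returns the figure that the subplots above were drawn on
title = plt.gcf().suptitle(current_component, fontsize=12)

# Leave some room at the top so the title does not overlap the first row
fig.subplots_adjust(top=0.92)
```

The call to subplots_adjust is optional; it only reserves space above the top row of subplots so the title has somewhere to sit.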

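The rest of the information in the header would be called a legend. In the following, let's assume that all subplots use the same markers. It is then enough to create a legend for one of the subplots. To create legend labels, you can pass the label argument to plot, i.e.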

axs.plot(... , label="MRP") 

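When axs.legend() is called later, a legend is generated automatically with the respective labels. Ways to position the legend are detailed in this answer.
Here, you may want to place the legend in figure coordinates, i.e.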

axs.legend(loc="lower center", bbox_to_anchor=(0.5, 0.8), bbox_transform=plt.gcf().transFigure)
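
Putting the title and the legend together, a self-contained sketch could look like the following. The data is synthetic, and the test names after "MRP" and "XRD" are hypothetical placeholders, since the question does not list the remaining ones. The anchor height of 0.88 and the top margin of 0.85 are also just values chosen for this layout.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

markers_list = ["s", "o", "*", "^", "+"]
# Only "MRP" and "XRD" come from the question; the other names are hypothetical
test_names = ["MRP", "XRD", "test3", "test4", "test5"]

rng = np.random.default_rng(0)  # synthetic stand-in for the lab samples
fig, axes = plt.subplots(3, 3, figsize=(8, 8))
for ax in axes.flat:
    ax.hist(rng.normal(size=200), bins=20, density=True, edgecolor="k")
    for idx, (marker, name) in enumerate(zip(markers_list, test_names)):
        xs = rng.normal(size=3)
        ys = np.full_like(xs, -0.05 * (idx + 1))
        # One plot call per test, so each label appears once in the legend
        ax.plot(xs, ys, marker, markersize=3, color="black", label=name)

fig.suptitle("pc1", fontsize=12)
fig.subplots_adjust(top=0.85)  # make room for the title and the legend

# All subplots share the same markers, so a legend from one subplot is enough.
# bbox_transform switches the anchor from axes to figure coordinates.
legend = axes.flat[0].legend(loc="lower center", ncol=len(test_names),
                             bbox_to_anchor=(0.5, 0.88),
                             bbox_transform=fig.transFigure,
                             fontsize="x-small")
```

Note that the question's code plots each point with its own axs.plot call. Passing label= to every one of those calls would produce one legend entry per point, so it is probably better to plot all points of a test in a single call, as above, or to label only the first point of each test. To write one page per component, the figure can be passed explicitly to the PdfPages object, for example pp.savefig(fig), before creating the next figure.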