Sorting a DataFrame by a column with multiple values

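In my main df, I have a column built by combining two other columns, which produces values like this: A1_43567_1. The first number represents the type of assessment, the second number is the question ID, and the last number is the question's position on the assessment. I plan to create a pivot table with each unique value as a column, so I can look at multiple students' selections for each item. But I want the pivot's columns ordered by question position, i.e. the third value in the concatenation. Essentially, this output: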

Student ID A1_45678_1 A1_34551_2 A1_11134_3 etc.... 
    12345   1   0   0  
    12346   0   0   1 
    12343   1   1   0

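I tried sorting my DataFrame by the original column I want it ordered by (question position) and then creating the pivot table, but that doesn't give the result shown above. Is there a way to sort the original concatenated values by the third value in the column? Or is it possible to sort a pivot table by the third value in each column name?

Current code: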

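# Sort the rows by question position before building the labels
demo_pivot = demo_pivot.sort_values('Question Position', ascending=True)

# Build labels such as A1_43567_1: assessment, item ID, question position
demo_pivot['newcol'] = ('A' + str(interim_selection) + '_'
                        + demo_pivot['Item ID'].map(str) + '_'
                        + demo_pivot['Question Position'].map(str))

# One column per label, one row per student
demo_pivot = pd.pivot_table(demo_pivot, index='Student ANET ID',
                            values='Points Received',
                            columns='newcol').reset_index()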

But the resulting output is:

Student ID A1_45678_1 A1_34871_7 A1_11134_15 etc.... 
    12345   1   0   0  
    12346   0   0   1 
    12343   1   1   0

Source: 2015-09-17, krisko08

The call to pd.pivot_table() returns a DataFrame, correct? If so, can you just reorder the columns of the resulting DataFrame? For example:

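def sort_columns(column_list):
    # Set aside columns that do not follow the A<n>_<item>_<position> pattern,
    # such as the student ID column added by reset_index()
    other = [col for col in column_list if col.count('_') != 2]
    items = [col for col in column_list if col.count('_') == 2]

    # Create a list of tuples: (question position, column name)
    sort_list = [(int(col.split('_')[2]), col) for col in items]

    # Sorts by the first item in each tuple, which is the question position
    sort_list.sort()

    # Return the other columns first, then the item columns in sorted order:
    return other + [x[1] for x in sort_list]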

# Now, you should be able to reorder the DataFrame like so: 
demo_pivot = demo_pivot.loc[:, sort_columns(demo_pivot.columns)]
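A likely reason the row sort has no effect is that the pivoted column labels are ordered as plain strings, so the label ending in `_15` can come before the one ending in `_7`. The sketch below uses only the standard library to contrast string ordering with ordering by the parsed question position. The `question_position` helper and the sample labels are hypothetical, and placing non-item columns first is an assumption.

```python
def question_position(label):
    """Return the trailing question position, or -1 for other labels."""
    parts = label.split('_')
    if len(parts) == 3 and parts[2].isdigit():
        return int(parts[2])
    return -1  # assumption: non-item columns (e.g. the ID column) go first

# Hypothetical labels, as they might look after reset_index()
columns = ['Student ANET ID', 'A1_11134_15', 'A1_34871_7', 'A1_45678_1']

# Plain string sorting compares the labels character by character
by_text = sorted(columns)

# Sorting on the parsed integer gives the positions in the order 1, 7, 15
by_position = sorted(columns, key=question_position)

print(by_text)
print(by_position)
```

The `by_position` list could then be used to select the DataFrame's columns in that order.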

Source: 2015-09-17 17:25:29

Finally, you can use `demo_pivot = demo_pivot[sort_columns(demo_pivot.columns)]` instead of `.loc`. – Alexander
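As a quick check of the comment above, this small sketch compares the two selection styles on a made-up frame. The data and the `ordered` list are hypothetical stand-ins for the real pivot and for the output of `sort_columns`.

```python
import pandas as pd

# Hypothetical pivot result with the columns in the unwanted order
demo_pivot = pd.DataFrame({
    'Student ANET ID': [12345, 12346, 12343],
    'A1_11134_15': [0, 1, 0],
    'A1_34871_7': [0, 0, 1],
    'A1_45678_1': [1, 0, 1],
})

# Stand-in for sort_columns(demo_pivot.columns)
ordered = ['Student ANET ID', 'A1_45678_1', 'A1_34871_7', 'A1_11134_15']

with_loc = demo_pivot.loc[:, ordered]   # label-based selection
with_brackets = demo_pivot[ordered]     # list-of-columns selection

print(list(with_brackets.columns))
print(with_loc.equals(with_brackets))
```

Both forms return a new DataFrame with the columns in the requested order, so the choice between them is mostly a matter of style.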
