Python pandas: combining multiple columns into a single column

I have a pandas DataFrame like the one below:

movie       unknown  action  adventure  animation  fantasy  horror  romance  sci-fi
Toy Story         0       1          1          0        1       0        0       1
Golden Eye        0       1          0          0        0       0        1       0
Four Rooms        1       0          0          0        0       0        0       0
Get Shorty        0       0          0          1        1       0        1       0
Copy Cat          0       0          1          0        0       1        0       0

I want to combine the movie genres into a single column. The output would look like this:

movie       genre
Toy Story   action, adventure, fantasy, sci-fi
Golden Eye  action, romance
Four Rooms  unknown
Get Shorty  animation, fantasy, romance
Copy Cat    adventure, horror

Source

2017-05-26 raja

You can do it like this:

In [171]: df['genre'] = df.iloc[:, 1:].apply(lambda x: df.iloc[:, 1:].columns[x.astype(bool)].tolist(), axis=1) 

In [172]: df 
Out[172]: 
        movie  unknown  action  adventure  animation  fantasy  horror  romance  sci-fi                                 genre
0   Toy Story        0       1          1          0        1       0        0       1  [action, adventure, fantasy, sci-fi]
1  Golden Eye        0       1          0          0        0       0        1       0                     [action, romance]
2  Four Rooms        1       0          0          0        0       0        0       0                             [unknown]
3  Get Shorty        0       0          0          1        1       0        1       0         [animation, fantasy, romance]
4    Copy Cat        0       0          1          0        0       1        0       0                   [adventure, horror]
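
The line above stores a list in each row, while the question asks for a comma-separated string. If a string is what you need, a minimal sketch could look like the following. It rebuilds the sample frame from the question so that it runs on its own; the names `genre_cols`, `flags`, and `result` are just illustrative choices, not anything from the original post.

```python
import pandas as pd

# Sample data copied from the question.
genre_cols = ["unknown", "action", "adventure", "animation",
              "fantasy", "horror", "romance", "sci-fi"]
rows = [
    ["Toy Story", 0, 1, 1, 0, 1, 0, 0, 1],
    ["Golden Eye", 0, 1, 0, 0, 0, 0, 1, 0],
    ["Four Rooms", 1, 0, 0, 0, 0, 0, 0, 0],
    ["Get Shorty", 0, 0, 0, 1, 1, 0, 1, 0],
    ["Copy Cat", 0, 0, 1, 0, 0, 1, 0, 0],
]
df = pd.DataFrame(rows, columns=["movie"] + genre_cols)

# Boolean mask over the 0/1 indicator columns only.
flags = df[genre_cols].astype(bool)

# For each row, join the names of the columns whose flag is set.
df["genre"] = flags.apply(
    lambda row: ", ".join(flags.columns[row.to_numpy()]), axis=1
)

result = df[["movie", "genre"]]
print(result)
```

Selecting the indicator columns by name rather than by position (`iloc[:, 1:]`) is a design choice here: it keeps working if the `genre` column already exists or if the column order changes.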

PS: I don't see how this would help you, though. I don't see any benefit compared to the one-hot encoded matrix.

Source

2017-05-26 15:33:29 MaxU

`df['genre'] = df.apply(lambda x: df.columns[x.astype(bool)].tolist()[1:], axis=1)` +1, and agreed that it doesn't provide any additional benefit – bernie

@bernie, thanks :) – MaxU
