2017-04-21 121 views

I am converting SPSS code to pandas and am trying to find a Pythonic way to express the following, which counts values across columns:

COUNT WBbf = M1 M26 M38 M50 M62 M74 M85 M97 M109 
     M121 M133 M144 (1). 

COUNT SPbf = M2 M15 M39 M51 M75 M87 M110 (1) 
      M63 M98 M122 M134 M145 (0). 

COUNT ACbf = M3 M16 M27 M52 M76 M88 M111 M123 M135 M146 (1) 
      M64 M99 (0). 

COUNT SCbf = M5 M17 M40 M77 M112 (1) 
      M28 M65 M89 M100 M124 M136 M148 (0). 

My DataFrame has the following form:

In [90]: data[b] 
Out[90]: 
           M1 M2 M3 M4 M5 M6 M7 M8 M9 \ 
case_id                  
ERAB_S1_LR_Q1_261016   1.0 1.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_AS_011116    1.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0 0.0 
ERAB_S2_LR_Q1_021116AFTERNOO 1.0 1.0 1.0 1.0 0.0 1.0 0.0 0.0 1.0 
ERAB_S2_AS031116MORNING  1.0 1.0 0.0 1.0 0.0 1.0 0.0 0.0 1.0 
ERAB_S3_AS031116AFTERNOON  1.0 0.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_S1_AS041116    1.0 0.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_LOH__S3_021116   1.0 1.0 1.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_LR_081116    1.0 1.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_S1_AS_111116    1.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 
ERAB_S1_141116AFTERNOON  1.0 1.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_S1_LOH_151116   1.0 0.0 1.0 1.0 1.0 0.0 1.0 0.0 1.0 
ERAB_S1_161116    1.0 1.0 1.0 1.0 1.0 1.0 0.0 0.0 1.0 

And so on... I want to count the matching values and create a new column holding the result for each case ID.

Answer


I believe you can first select the columns, compare them with eq, and then sum the True values per row:

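# replace these strings with the column names from your data
SPbf1 = 'M2 M5 M8'.split()
SPbf0 = 'M6 M9'.split()
print(SPbf1)
['M2', 'M5', 'M8']

print(SPbf0)
['M6', 'M9']

# count the 1s in the first group plus the 0s in the second group, row by row
df['SPbf'] = df[SPbf1].eq(1).sum(axis=1) + df[SPbf0].eq(0).sum(axis=1)
print(df)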
           M1 M2 M3 M4 M5 M6 M7 M8 M9 \ 
case_id                  
ERAB_S1_LR_Q1_261016   1.0 1.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_AS_011116    1.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0 0.0 
ERAB_S2_LR_Q1_021116AFTERNOO 1.0 1.0 1.0 1.0 0.0 1.0 0.0 0.0 1.0 
ERAB_S2_AS031116MORNING  1.0 1.0 0.0 1.0 0.0 1.0 0.0 0.0 1.0 
ERAB_S3_AS031116AFTERNOON  1.0 0.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_S1_AS041116    1.0 0.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_LOH__S3_021116   1.0 1.0 1.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_LR_081116    1.0 1.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_S1_AS_111116    1.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 
ERAB_S1_141116AFTERNOON  1.0 1.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_S1_LOH_151116   1.0 0.0 1.0 1.0 1.0 0.0 1.0 0.0 1.0 
ERAB_S1_161116    1.0 1.0 1.0 1.0 1.0 1.0 0.0 0.0 1.0 

           SPbf 
case_id        
ERAB_S1_LR_Q1_261016    2 
ERAB_AS_011116     4 
ERAB_S2_LR_Q1_021116AFTERNOO  1 
ERAB_S2_AS031116MORNING   1 
ERAB_S3_AS031116AFTERNOON  1 
ERAB_S1_AS041116     1 
ERAB_LOH__S3_021116    2 
ERAB_LR_081116     2 
ERAB_S1_AS_111116    2 
ERAB_S1_141116AFTERNOON   2 
ERAB_S1_LOH_151116    2 
ERAB_S1_161116     2 
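
The same idea can be wrapped in a small function so that each COUNT command becomes one call. This is only a sketch: the helper name `spss_count` and the `spec` dictionary layout are assumptions made for illustration and do not come from pandas or SPSS. The toy frame reuses three rows of the data above, restricted to the columns the example needs.

```python
import pandas as pd

# Toy data: three rows from the question, limited to the columns used below.
df = pd.DataFrame(
    {
        "M2": [1.0, 1.0, 0.0],
        "M5": [1.0, 1.0, 1.0],
        "M6": [1.0, 1.0, 1.0],
        "M8": [0.0, 1.0, 0.0],
        "M9": [1.0, 0.0, 1.0],
    },
    index=pd.Index(
        ["ERAB_S1_LR_Q1_261016", "ERAB_AS_011116", "ERAB_S3_AS031116AFTERNOON"],
        name="case_id",
    ),
)


def spss_count(frame, spec):
    """Hypothetical helper: `spec` maps a target value to the columns to test."""
    total = pd.Series(0, index=frame.index)
    for value, cols in spec.items():
        # eq() gives a boolean frame; summing along axis=1 counts the True cells per row.
        total = total + frame[cols].eq(value).sum(axis=1)
    return total


df["SPbf"] = spss_count(df, {1: ["M2", "M5", "M8"], 0: ["M6", "M9"]})
print(df["SPbf"].tolist())  # [2, 4, 1]
```

The three results agree with the SPbf values printed above for the same case IDs.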

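If some of the column names may be missing, use reindex instead of plain column selection (the older reindex_axis was deprecated and later removed from pandas). Missing columns are filled with NaN, which equals neither 0 nor 1, so they add nothing to the count:

SPbf1 = 'M2 M15 M39 M51 M75 M87 M110'.split()
SPbf0 = 'M63 M98 M122 M134 M145'.split()
print(SPbf1)
['M2', 'M15', 'M39', 'M51', 'M75', 'M87', 'M110']

print(SPbf0)
['M63', 'M98', 'M122', 'M134', 'M145']

df['SPbf'] = (df.reindex(columns=SPbf1).eq(1).sum(axis=1) +
              df.reindex(columns=SPbf0).eq(0).sum(axis=1))

print(df)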
           M1 M2 M3 M4 M5 M6 M7 M8 M9 \ 
case_id                  
ERAB_S1_LR_Q1_261016   1.0 1.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_AS_011116    1.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0 0.0 
ERAB_S2_LR_Q1_021116AFTERNOO 1.0 1.0 1.0 1.0 0.0 1.0 0.0 0.0 1.0 
ERAB_S2_AS031116MORNING  1.0 1.0 0.0 1.0 0.0 1.0 0.0 0.0 1.0 
ERAB_S3_AS031116AFTERNOON  1.0 0.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_S1_AS041116    1.0 0.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_LOH__S3_021116   1.0 1.0 1.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_LR_081116    1.0 1.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_S1_AS_111116    1.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 
ERAB_S1_141116AFTERNOON  1.0 1.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 
ERAB_S1_LOH_151116   1.0 0.0 1.0 1.0 1.0 0.0 1.0 0.0 1.0 
ERAB_S1_161116    1.0 1.0 1.0 1.0 1.0 1.0 0.0 0.0 1.0 

           SPbf 
case_id        
ERAB_S1_LR_Q1_261016    1 
ERAB_AS_011116     1 
ERAB_S2_LR_Q1_021116AFTERNOO  1 
ERAB_S2_AS031116MORNING   1 
ERAB_S3_AS031116AFTERNOON  0 
ERAB_S1_AS041116     0 
ERAB_LOH__S3_021116    1 
ERAB_LR_081116     1 
ERAB_S1_AS_111116    1 
ERAB_S1_141116AFTERNOON   1 
ERAB_S1_LOH_151116    0 
ERAB_S1_161116     1 
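
To cover all four COUNT commands from the question in one pass, the column lists can be kept in a dictionary and looped over. The sketch below assumes a current pandas version, where `reindex(columns=...)` is the way to tolerate missing columns; the names `scales` and `count_scale`, the case IDs, and the toy values are made up for illustration. Only the column lists are copied from the SPSS code.

```python
import pandas as pd

# Column lists copied from the SPSS COUNT commands:
# (columns counted when equal to 1, columns counted when equal to 0).
scales = {
    "WBbf": ("M1 M26 M38 M50 M62 M74 M85 M97 M109 M121 M133 M144", ""),
    "SPbf": ("M2 M15 M39 M51 M75 M87 M110", "M63 M98 M122 M134 M145"),
    "ACbf": ("M3 M16 M27 M52 M76 M88 M111 M123 M135 M146", "M64 M99"),
    "SCbf": ("M5 M17 M40 M77 M112", "M28 M65 M89 M100 M124 M136 M148"),
}

# Toy frame with only a handful of M columns; all others are deliberately absent.
df = pd.DataFrame(
    {
        "M1": [1.0, 1.0],
        "M2": [1.0, 0.0],
        "M3": [0.0, 1.0],
        "M5": [1.0, 1.0],
        "M63": [0.0, 1.0],
    },
    index=pd.Index(["case_a", "case_b"], name="case_id"),
)


def count_scale(frame, ones, zeros):
    """Hypothetical helper: count cells equal to 1 in `ones` plus cells equal to 0 in `zeros`."""
    # reindex adds absent columns as NaN; NaN matches neither 0 nor 1.
    total = frame.reindex(columns=ones).eq(1).sum(axis=1)
    if zeros:  # WBbf has no columns that are counted on 0
        total = total + frame.reindex(columns=zeros).eq(0).sum(axis=1)
    return total.astype(int)


for name, (ones, zeros) in scales.items():
    df[name] = count_scale(df, ones.split(), zeros.split())

print(df[list(scales)])
```

Keeping the lists in one dictionary means a change to a scale definition touches a single line, and the loop body stays the same for every scale.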

Thanks! It works.