2016-10-14 40 views

Python3 - Iterate over rows, then over some columns, to print text

I have a pandas DataFrame that looks like this:

  0     1     2    3 \ 
0 UICEX_0001 path/to/bam_T.bam path/to/bam_N.bam  chr1:10000 
1 UICEX_0002 path/to/bam_T2.bam path/to/bam_N2.bam chr54:4958392 

       4 
0 chr4:4958392 
1   NaN 

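I'm trying to loop over each row and print text that will be fed to another program. I need to print the first three columns (along with some other text), then go through the remaining columns and print different things depending on whether they are NaN.

This mostly works:

Current code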

def CreateIGVBatchScript(x): 
    for row in x.iterrows(): 
     print("\nnew") 
     sample = x[0] 
     bamt = x[1] 
     bamn = x[2] 
     print("\nload", bamt.to_string(index=False), "\nload", bamn.to_string(index=False)) 
     for col in range(3, len(x.columns)): 
      position = x[col] 
      if position.isnull().values.any(): 
       print("\n") 
      else: 
       position = position.to_string(index=False) 
       print("\ngoto ", position, "\ncollapse\nsnapshot ", sample.to_string(index=False), "_", position,".png\n") 

CreateIGVBatchScript(data) 

But the output looks like this:

Actual output

new 
load path/to/bam_T.bam 
path/to/bam_T2.bam 
load path/to/bam_N.bam 
path/to/bam_N2.bam 

goto chr1:10000 
chr54:4958392 
collapse 
snapshot UICEX_0001 **<-- ISSUE: it's printing both rows at the same time** 
UICEX_0002 _ chr1:10000 
chr54:4958392 .png 

new 

load path/to/bam_T.bam 
path/to/bam_T2.bam 
load path/to/bam_N.bam 
path/to/bam_N2.bam 

goto chr1:10000 
chr54:4958392 
collapse 
snapshot UICEX_0001 **<-- ISSUE: it's printing both rows at the same time** 
UICEX_0002 _ chr1:10000 
chr54:4958392 .png 

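The first part looks fine, but once I start iterating over the columns, all rows get printed at once. I can't figure out how to fix this. Here is what I'd like one of these sections to look like:

Partial desired output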

goto chr1:10000 
collapse 
snapshot UICEX_0001_chr1:10000.png 
goto chr54:4958392 
collapse 
snapshot UICEX_0001_chr54:495832.png 

Extra information

By the way, I'm actually trying to adapt this from an R script in order to learn Python better. Here is the R code, just in case:

CreateIGVBatchScript <- function(x){ 
    for(i in 1:nrow(x)){ 
      cat("\nnew") 
      sample = as.character(x[i, 1]) 
      bamt = as.character(x[i, 2]) 
      bamn = as.character(x[i, 3]) 
      cat("\nload",bamt,"\nload",bamn) 
      for(j in 4:ncol(x)){ 
       if(x[i, j] == "" | is.na(x[i, j])){ cat("\n") } 
       else{ 
        cat("\ngoto ", as.character(x[i, j]),"\ncollapse\nsnapshot ", sample, "_", x[i,j],".png\n", sep = "") 
       } 
      } 
    } 
    cat("\nexit") 
} 
CreateIGVBatchScript(data) 

Answer


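I figured out the answer. There were a few problems:

  1. I was using iterrows() incorrectly

iterrows() actually yields each row's data as a Series (paired with the row's index), and you can then index into that Series to pull out individual values.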

for index, row in x.iterrows(): 
    sample = row[0] 

saves the value held in column 0 of that row.
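To make the difference concrete, here is a minimal sketch. It assumes a small stand-in frame with the same integer column labels as in the question (the frame is hypothetical, built only for illustration). Indexing the frame returns a whole column, while indexing the row Series returns a single cell.

```python
import pandas as pd

# Hypothetical two-row frame mirroring the first three columns in the question
data = pd.DataFrame([
    ["UICEX_0001", "path/to/bam_T.bam", "path/to/bam_N.bam"],
    ["UICEX_0002", "path/to/bam_T2.bam", "path/to/bam_N2.bam"],
])

# Indexing the frame selects an entire column: one value per row
whole_column = data[0]

# Indexing the row Series yielded by iterrows() selects a single cell
samples = []
for index, row in data.iterrows():
    samples.append(row[0])

print(list(whole_column))
print(samples)
```

This is why the original loop printed both samples on every pass: `x[0]` never depended on the current row.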

  2. Iterating over the columns

At this point a simple for loop works, just as I had been doing to iterate over the columns.

    for col in range(3, len(data.columns)): 
        position = row[col] 
    

    lets you save that column's value.

    The final Python code is:
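Once each value is a single cell rather than a whole column, a missing value can be tested directly with `pd.isna`. The sketch below assumes a hypothetical row shaped like the second row in the question, and it skips missing positions instead of printing anything for them.

```python
import pandas as pd

# Hypothetical row shaped like row 1 in the question (last position is missing)
row = pd.Series([
    "UICEX_0002",
    "path/to/bam_T2.bam",
    "path/to/bam_N2.bam",
    "chr54:4958392",
    float("nan"),
])

positions = []
for col in range(3, len(row)):
    position = row[col]
    # pd.isna works on a scalar, so no .values.any() is needed here
    if pd.isna(position):
        continue
    positions.append(position)

print(positions)
```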

    def CreateIGVBatchScript(x):
        # Replace NaN with a sentinel so missing positions can be detected with ==
        x = x.fillna(value=999)
        for index, row in x.iterrows():
            print("\nnew", sep="")
            sample = row[0]
            bamt = row[1]
            bamn = row[2]
            print("\nload ", bamt, "\nload ", bamn, sep="")
            # Columns 3 onward hold the positions; use x, not the global data
            for col in range(3, len(x.columns)):
                position = row[col]
                if position == 999:
                    print("\n")
                else:
                    print("\ngoto ", position, "\ncollapse\nsnapshot ", sample, "_", position, ".png\n", sep="")

    CreateIGVBatchScript(data)
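As a possible variation, not the code I originally ran, the same loop can avoid the 999 sentinel by testing each cell with `pd.isna`, and collect the lines in a list instead of printing them. The function name `build_igv_batch` and the sample frame are hypothetical. The trailing `exit` line follows the R version above. Unlike the code above, this sketch drops the blank lines for missing positions.

```python
import pandas as pd


def build_igv_batch(frame):
    """Return the batch commands as a list of lines (hypothetical helper)."""
    lines = []
    for _, row in frame.iterrows():
        # The first three columns are sample name, tumor BAM, normal BAM
        sample, bamt, bamn = row.iloc[0], row.iloc[1], row.iloc[2]
        lines.append("new")
        lines.append(f"load {bamt}")
        lines.append(f"load {bamn}")
        # Remaining columns are positions; skip missing or empty ones
        for position in row.iloc[3:]:
            if pd.isna(position) or position == "":
                continue
            lines.append(f"goto {position}")
            lines.append("collapse")
            lines.append(f"snapshot {sample}_{position}.png")
    lines.append("exit")
    return lines


# Stand-in for the frame shown in the question
data = pd.DataFrame([
    ["UICEX_0001", "path/to/bam_T.bam", "path/to/bam_N.bam", "chr1:10000", "chr4:4958392"],
    ["UICEX_0002", "path/to/bam_T2.bam", "path/to/bam_N2.bam", "chr54:4958392", None],
])

print("\n".join(build_igv_batch(data)))
```

Returning a list makes it easy to write the script to a file or check it before handing it to the other program.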
    

    This answer was guided by the following posts:
