2014-03-25 41 views

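SeqIO: "No records found in handle"

I have just started using Python and BioPython and don't have much programming experience, so I would appreciate any help you can give me.

I am trying to pull CDS and/or rRNA sequences from GenBank. It is important that I get only the open reading frames, which is why I am not simply pulling the entire sequence. When I run the code below, it fails with the error "No records found in handle" on this line:

record = SeqIO.read(handle, "genbank")

I am not sure how to correct this. I have included the code I am using below. Also, if there is an easier way to do this, or published code that already does it, I would appreciate it if you let me know.

Thanks!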

# search sequences by a combination of keywords 
# need to find (number of) results to set 'retmax' value 
handle = Entrez.esearch(db = searchdb, term = searchterm) 
records = Entrez.read(handle) 
handle.close() 
# repeat search with appropriate 'retmax' value 
all_handle = Entrez.esearch(db = searchdb, term = searchterm, retmax = records['Count']) 
records = Entrez.read(all_handle) 

print " " 
print "Number of sequences found:", records['Count'] #printing to make sure that code is working thus far. 
print " " 

locations = [] # store locations of target sequences 
sequences = [] # store target sequences 

for i in range(0,int(records['Count'])) : 
    handle = Entrez.efetch(db = searchdb, id = records['IdList'][i], rettype = "gb", retmode = "xml") 
    record = SeqIO.read(handle, "genbank") 
    for feature in record.features: 
        if feature.type == searchfeaturetype: # searches features for proper feature type
            if searchgeneproduct in feature.qualifiers['product'][0]: # searches features for proper gene product
                if str(feature.qualifiers) not in locations: # no repeat location entries
                    locations.append(str(feature.location)) # appends location entry
                    sequences.append(feature.extract(record.seq)) # append sequence

Thanks GWW. That solved my problem! – jrp355

Answer


You are requesting XML from GenBank, whereas SeqIO.read expects the GenBank flat-file format. Try changing your efetch line to:

handle = Entrez.efetch(db = searchdb, id = records['IdList'][i], rettype = "gb", retmode = "txt")
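Putting that change into the loop from the question could look roughly like the sketch below. It is only an illustration under a few assumptions: the function names fetch_record and select_features and the email parameter are made up for this example, and retmode is spelled "text", which is how the NCBI E-utilities documentation writes the plain-text mode. The feature filter also tolerates features that have no product qualifier, and it compares locations with locations when skipping repeats (the question's loop compares str(feature.qualifiers) against the stored locations, so that check never matches). The data at the bottom is a stand-in so the filter can be tried without network access; it is not real GenBank content.

```python
from types import SimpleNamespace


def fetch_record(db, uid, email):
    """Fetch one record as GenBank flat text and parse it.

    Needs Biopython and network access; db, uid and email are supplied by the caller.
    """
    # Imported here so select_features() below can be tried without Biopython installed.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks callers to identify themselves
    handle = Entrez.efetch(db=db, id=uid, rettype="gb", retmode="text")
    try:
        return SeqIO.read(handle, "genbank")
    finally:
        handle.close()


def select_features(record, feature_type, product_substring):
    """Return (location, sequence) pairs for matching features, skipping repeated locations."""
    seen = set()
    hits = []
    for feature in record.features:
        if feature.type != feature_type:
            continue
        # Not every feature carries a product qualifier, so avoid a KeyError.
        product = feature.qualifiers.get("product", [""])[0]
        if product_substring not in product:
            continue
        location = str(feature.location)
        if location in seen:
            continue
        seen.add(location)
        hits.append((location, feature.extract(record.seq)))
    return hits


# Offline demo with stand-in objects (hypothetical data, not a real GenBank record).
def _stub_feature(ftype, product, start, end):
    return SimpleNamespace(
        type=ftype,
        qualifiers={"product": [product]} if product else {},
        location=f"[{start}:{end}]",
        extract=lambda seq: seq[start:end],
    )


demo_record = SimpleNamespace(
    seq="ATGGCCTAAACCCGGGTTT",
    features=[
        _stub_feature("CDS", "cytochrome b", 0, 9),
        _stub_feature("CDS", "cytochrome b", 0, 9),  # same location again
        _stub_feature("CDS", None, 15, 19),          # no product qualifier
        _stub_feature("CDS", "ATP synthase", 9, 15),
    ],
)
demo_hits = select_features(demo_record, "CDS", "cytochrome")
```

With a real search result, the loop body would become something like record = fetch_record(searchdb, uid, "you@example.org") followed by select_features(record, searchfeaturetype, searchgeneproduct) for each uid in records['IdList'].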