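Download a file using Python, BeautifulSoup and Selenium
I want to download the first PDB file from the search results (the download link is given below). I am using Python, Selenium and BeautifulSoup. This is the code I have developed so far.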
    import urllib2
    from BeautifulSoup import BeautifulSoup
    from selenium import webdriver

    uni_id = "P22216"

    # set parameters
    download_dir = "/home/home/Desktop/"
    url = "http://www.rcsb.org/pdb/search/smart.do?smartComparator=and&smartSearchSubtype_0=UpAccessionIdQuery&target=Current&accessionIdList_0=%s" % uni_id
    print "url - ", url

    # open the URL and parse the results page
    text = urllib2.urlopen(url).read()
    #print "text : ", text
    soup = BeautifulSoup(text)
    #print soup
    print

    table = soup.find("table", {"class": "queryBlue"})
    #print "table : ", table
    rows = table.findAll('tr')

    # the original never created a browser; Firefox here is an assumption
    browser = webdriver.Firefox()

    status = 0  # counts the rows that contained a link
    for tr in rows:
        try:
            cols = tr.findAll('td')
            if cols:
                link = cols[1].find('a').get('href')
                print "link : ", link
                if link:
                    if status == 1:
                        main_url = "http://www.rcsb.org" + link
                        print "main_url-----", main_url
                        # open the structure page (this does not download anything yet)
                        browser.get(main_url)
                        break
                    status += 1
        except (AttributeError, IndexError):
            # row without a second cell or without a link in it
            pass
I am getting None.
How do I download the first file in the search list (i.e. 2YGV in this case)?
The download link is: /pdb/protein/P32447
Works for me. I get '/pdb/explore/explore.do?structureId=2YGV'. What is the problem? Can't you download it? – ton1c
I get that too, but how do I download the file? That is my question. – sam
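One possible way to go from the link to a file on disk, as a minimal sketch rather than a tested solution: read the `structureId` value out of the link and fetch the entry directly, without driving a browser. It is written for Python 3 (the code in the question is Python 2). The address pattern `https://files.rcsb.org/download/<ID>.pdb` is an assumption that does not come from this thread, and the helper names `structure_id_from_link`, `pdb_download_url` and `download_pdb` are made up for illustration.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse, parse_qs

# Assumption: RCSB serves entry files at this address pattern.
# It is not taken from the thread, so check it before relying on it.
DOWNLOAD_TEMPLATE = "https://files.rcsb.org/download/%s.pdb"


def structure_id_from_link(link):
    """Return the structureId query value of a link, or None if it has none."""
    query = parse_qs(urlparse(link).query)
    values = query.get("structureId")
    return values[0].strip().upper() if values else None


def pdb_download_url(structure_id):
    """Build the (assumed) direct download address for one PDB entry."""
    return DOWNLOAD_TEMPLATE % structure_id


def download_pdb(link, download_dir):
    """Download the entry named in the link into download_dir; return the file path."""
    structure_id = structure_id_from_link(link)
    if structure_id is None:
        raise ValueError("no structureId in link: %r" % link)
    target = os.path.join(download_dir, structure_id + ".pdb")
    # Stream the response to disk instead of holding it all in memory
    with urllib.request.urlopen(pdb_download_url(structure_id)) as response:
        with open(target, "wb") as out:
            shutil.copyfileobj(response, out)
    return target
```

With the link from the comment above, `download_pdb("/pdb/explore/explore.do?structureId=2YGV", "/home/home/Desktop/")` would try to save `2YGV.pdb` in that directory. Selenium is only needed if the download has to go through a rendered page; for a plain file address an HTTP request should be enough.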