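Scraping Wikipedia using Python

I want to retrieve 3 columns (NFL team, player name, college team) from the Wikipedia page for the 2008 NFL draft (the URL is in the code below). I am new to Python and have been trying to use BeautifulSoup for this. I only need the entries for QBs, but I can't even get all the columns regardless of position. This is what I have so far; it outputs nothing, and I'm not entirely sure why. I believe it is due to one of the tags, but I don't know what to change. Any help would be greatly appreciated.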
wiki = "http://en.wikipedia.org/wiki/2008_NFL_draft"
header = {'User-Agent': 'Mozilla/5.0'} #Needed to prevent 403 error on Wikipedia
req = urllib2.Request(wiki,headers=header)
page = urllib2.urlopen(req)
soup = BeautifulSoup(page)
rnd = ""
pick = ""
NFL = ""
player = ""
pos = ""
college = ""
conf = ""
notes = ""
table = soup.find("table", { "class" : "wikitable sortable" })
#print table
#output = open('output.csv','w')
for row in table.findAll("tr"):
    cells = row.findAll("href")
    print "---"
    print cells.text
    print "---"
    #For each "tr", assign each "td" to a variable.
    #if len(cells) > 1:
        #NFL = cells[1].find(text=True)
        #player = cells[2].find(text = True)
        #pos = cells[3].find(text=True)
        #college = cells[4].find(text=True)
        #write_to_file = player + " " + NFL + " " + college + " " + pos
        #print write_to_file
        #output.write(write_to_file)
#output.close()
I know a lot of it is commented out; that is because I was trying to find where the fault is.
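For reference, here is a minimal sketch of what the loop seems to be aiming for. Two things in the code above look suspect: `href` is an attribute of `<a>` tags rather than a tag name, so `row.findAll("href")` matches nothing, and the collection returned by `findAll` has no `.text` of its own (each cell has to be read individually). The sketch is written for Python 3 with the `bs4` package, which is an assumption since the question's code is Python 2. It also parses a small inline table (`SAMPLE_HTML`, a simplified stand-in whose column layout is assumed, not copied from the real page) instead of fetching the URL, and the helper name `extract_quarterbacks` is hypothetical.

```python
from bs4 import BeautifulSoup

# Simplified stand-in for the draft table. The real page's markup and
# column order may differ, so columns are looked up by header text below.
SAMPLE_HTML = """
<table class="wikitable sortable">
<tr><th>Rnd.</th><th>Pick No.</th><th>NFL team</th><th>Player</th><th>Pos.</th><th>College</th></tr>
<tr><td>1</td><td>1</td><td><a href="/wiki/Miami_Dolphins">Miami Dolphins</a></td><td><a href="/wiki/Jake_Long">Jake Long</a></td><td>OT</td><td>Michigan</td></tr>
<tr><td>1</td><td>3</td><td>Atlanta Falcons</td><td>Matt Ryan</td><td>QB</td><td>Boston College</td></tr>
<tr><td>1</td><td>18</td><td>Baltimore Ravens</td><td>Joe Flacco</td><td>QB</td><td>Delaware</td></tr>
</table>
"""


def extract_quarterbacks(html):
    """Return (NFL team, player, college) tuples for rows whose position is QB."""
    soup = BeautifulSoup(html, "html.parser")
    # class_ matches a single CSS class, so this also finds "wikitable sortable".
    table = soup.find("table", class_="wikitable")
    rows = table.find_all("tr")

    # Map each header label to its column index.
    headers = [cell.get_text(strip=True) for cell in rows[0].find_all(["th", "td"])]
    col = {name: index for index, name in enumerate(headers)}

    result = []
    for row in rows[1:]:
        # Search for cell tags (td/th), not for "href".
        cells = [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
        if len(cells) < len(headers):
            continue  # skip separator or malformed rows
        if cells[col["Pos."]] == "QB":
            result.append((cells[col["NFL team"]], cells[col["Player"]], cells[col["College"]]))
    return result


quarterbacks = extract_quarterbacks(SAMPLE_HTML)
for team, player, college in quarterbacks:
    print(team, player, college)
```

Looking columns up by header text rather than a fixed index is a design choice for the sketch: it keeps working if the real table has extra leading columns. To run it against the live page, the downloaded HTML would be passed to the same function in place of `SAMPLE_HTML`, with the header labels adjusted to whatever the page actually uses.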