2013-04-12

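Using nested loops to get all tables with BeautifulSoup

I want to get all the tables from this website using nested loops. I'm almost there, but I'm still unsure how to loop over several tables that share the same class identifier. I get an error on line 26, for s in soup.findALL ("table", { "class" : "boxScore"}):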

SyntaxError: invalid syntax.

My script:

import datetime 
import urllib 
from bs4 import BeautifulSoup 
import urllib2 


day = int(datetime.datetime.now().strftime("%d"))-1 

month = datetime.datetime.now().strftime("%B") 
year = datetime.datetime.now().strftime("%Y") 
file_name = "https://stackoverflow.com/users/ripple/NHL.csv" 
file = open(file_name,"w") 
url = "http://www.tsn.ca/nhl/scores/?date=" + month + "/" + str(day) + "/" + year 
print 'Grabbing from: ' + url + '...\n' 
try: 
     r = urllib2.urlopen(url) 
except urllib2.URLError as e: 
      r = e 
if r.code in (200, 401):  
    #get the table data from the page 
    data = urllib.urlopen(url).read() 
    #send to beautiful soup 
    soup = BeautifulSoup(data) 
    print soup 
    soup = soup.findALL ("table", { "class" : "boxScore"}) 
    for s in soup.findALL ("table", { "class" : "boxScore"}) 
     table = soup.find("table",{ "class" : "boxScore"}) 
     for tr in table.findAll('tr')[2:]: 
      col = tr.findAll('td') 
      team = col[0].get_text().encode('ascii','ignore').replace(" ","") 
      firstp = col[1].get_text().encode('ascii','ignore').replace(" ","") 
      secondp = col[2].get_text().encode('ascii','ignore').replace(" ","") 
      thirdp = col[3].get_text().encode('ascii','ignore').replace(" ","") 
      final = col[4].get_text().encode('ascii','ignore').replace(" ","") 
      record = team + ',' + final + '\n' 
      print record 
      file.write(record) 
else: 
    print str(i) + " NO GAMES" 
file.close() 

Answer


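In Python, the header line of a loop must end with a colon ':', and the for statement in your script is missing it. Also, the API method is findAll(), not findALL().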
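A minimal sketch of the corrected nested loop follows. It rests on several assumptions: it uses Python 3 syntax rather than the Python 2 of the question, it parses a small made-up HTML string (SAMPLE_HTML) instead of fetching the real page, and extract_records is a hypothetical helper name. It calls find_all(), which is the bs4 spelling of findAll(). Two further changes go beyond the colon fix: the parsed document is not overwritten with the result list before the loop, and the inner loop reads rows from the current table rather than calling find() again, which would return the first table every time.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real scores page:
# two tables that share the class "boxScore", each with two header rows.
SAMPLE_HTML = """
<table class="boxScore">
  <tr><th colspan="5">Game 1</th></tr>
  <tr><th>Team</th><th>1</th><th>2</th><th>3</th><th>F</th></tr>
  <tr><td>Boston</td><td>1</td><td>0</td><td>2</td><td>3</td></tr>
  <tr><td>Ottawa</td><td>0</td><td>1</td><td>0</td><td>1</td></tr>
</table>
<table class="boxScore">
  <tr><th colspan="5">Game 2</th></tr>
  <tr><th>Team</th><th>1</th><th>2</th><th>3</th><th>F</th></tr>
  <tr><td>Toronto</td><td>0</td><td>1</td><td>1</td><td>2</td></tr>
  <tr><td>Montreal</td><td>2</td><td>1</td><td>1</td><td>4</td></tr>
</table>
"""


def extract_records(html):
    """Return one "team,final" string per data row across all boxScore tables."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    # Outer loop: every table with the shared class (note the trailing colon).
    for table in soup.find_all("table", {"class": "boxScore"}):
        # Inner loop: rows of the *current* table, skipping the two header rows.
        for tr in table.find_all("tr")[2:]:
            cols = tr.find_all("td")
            if len(cols) < 5:
                continue  # skip rows that do not have the expected five cells
            team = cols[0].get_text().strip()
            final = cols[4].get_text().strip()
            records.append(team + "," + final)
    return records


records = extract_records(SAMPLE_HTML)
for record in records:
    print(record)
```

In the original script the same shape would apply: keep soup as the parsed document, iterate over the tables it returns, and write each record to the file inside the inner loop.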


Wow, I really appreciate it! Thanks!