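What is this function involving urllib2 and BeautifulSoup in Python?
I asked a question earlier about retrieving high scores from an HTML page, and another user gave me the code below to help. I'm new to Python and BeautifulSoup, so I'm trying to work through other people's code piece by piece. I understand most of it, but I don't get what this piece of code is or what it does: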
def parse_string(el):
    text = ''.join(el.findAll(text=True))
    return text.strip()
Here is the entire code:
from urllib2 import urlopen
from BeautifulSoup import BeautifulSoup
import sys
URL = "http://hiscore.runescape.com/hiscorepersonal.ws?user1=" + sys.argv[1]
# Grab page html, create BeautifulSoup object
html = urlopen(URL).read()
soup = BeautifulSoup(html)
# Grab the <table id="mini_player"> element
scores = soup.find('table', {'id':'mini_player'})
# Get a list of all the <tr>s in the table, skip the header row
rows = scores.findAll('tr')[1:]
# Helper function to return concatenation of all character data in an element
def parse_string(el):
    text = ''.join(el.findAll(text=True))
    return text.strip()
for row in rows:
    # Get all the text from the <td>s
    data = map(parse_string, row.findAll('td'))
    # Skip the first td, which is an image
    data = data[1:]
    # Do something with the data...
    print data
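To make the question concrete, here is a rough sketch of what the helper appears to do: gather every piece of text inside an element, join the pieces into one string, and strip the surrounding whitespace. This is not the BeautifulSoup code above. It swaps in Python 3's standard-library `html.parser` so it runs without any install, and both the `TextCollector` class and the sample `<td>` markup are made up for illustration. It also takes an HTML string rather than a BeautifulSoup element, so treat it as an approximation of `el.findAll(text=True)`, not an exact equivalent.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: collects each run of text found between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of character data between tags
        self.chunks.append(data)


def parse_string(markup):
    # Stand-in for the original helper; assumes an HTML string as input
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    # Join all text pieces into one string, then trim leading/trailing whitespace
    return ''.join(collector.chunks).strip()


# Made-up table cell, loosely modelled on a hiscore row
cell = '<td> <a href="#">Attack</a> <b>99</b> </td>'
print(parse_string(cell))
```

In the full script, `map(parse_string, row.findAll('td'))` would then apply this kind of text extraction to every cell in a row, which is why a cell holding only an image comes back as an empty string.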
Use backticks for HTML. :) – 2009-06-14 02:18:33