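lxml: find all div and span tags
How can I find all div and span tags while preserving their document order? With BeautifulSoup this is very simple: soup.findAll(name=['span', 'div']), but I recently switched to lxml because it is much faster than BeautifulSoup.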
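import lxml.html
from lxml.cssselect import CSSSelector  # needs the separate cssselect package

# result is assumed to be an open response object, e.g. from urlopen()
content = result.read()
page_html = lxml.html.fromstring(content)

# XPath: match every element that is a div or a span
elements = page_html.xpath('//*[self::div or self::span]')
or
# CSS selector, which lxml compiles to XPath
sd_selector = CSSSelector('span,div')
elements = sd_selector(page_html)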
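To check that the XPath approach really returns the elements in document order, here is a minimal self-contained sketch. The sample markup, the id attributes, and the helper name find_divs_and_spans are made up for illustration and are not part of the original question.

```python
import lxml.html

# Hypothetical sample document; the ids exist only to make the order visible.
SAMPLE = """
<html><body>
  <div id="a">first <span id="b">inner</span></div>
  <span id="c">second</span>
  <p>ignored</p>
  <div id="d">third</div>
</body></html>
"""


def find_divs_and_spans(root):
    """Return all div and span elements under root, in document order."""
    return root.xpath('//*[self::div or self::span]')


root = lxml.html.fromstring(SAMPLE)
order = [(el.tag, el.get('id')) for el in find_divs_and_spans(root)]
print(order)
# [('div', 'a'), ('span', 'b'), ('span', 'c'), ('div', 'd')]
```

The nested span (id "b") appears right after its parent div, and the p element is skipped, which matches what soup.findAll(name=['span', 'div']) would give.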
import lxml.html as LH

content = '''\
<tr>
<div>idend</div>
<span>Green</span>
</tr>
'''
root = LH.fromstring(content)
# select every element that is a div or a span
for tag in root.xpath('//*[self::div or self::span]'):
    print(tag)
produces
<Element div at 0xb751f23c>
<Element span at 0xb751f11c>
Thank you, this did the trick. Which method is faster? I assume the first one. – vericule 2013-03-15 16:15:35
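On the speed question, the simplest thing is probably to measure. As far as I know, CSSSelector compiles the CSS expression to XPath once, so I would expect the difference to be small, but that is a guess until timed. A rough sketch with timeit follows; the test document (1000 repeated div/span pairs) and the run count of 20 are arbitrary assumptions, and the cssselect candidate is skipped if that package is not installed.

```python
import timeit

import lxml.html
from lxml import etree

try:
    # lxml.cssselect needs the separate "cssselect" package to be installed.
    from lxml.cssselect import CSSSelector
except ImportError:
    CSSSelector = None

# Hypothetical test document: 1000 repeated div/span pairs.
html = "<html><body>" + "<div>a</div><span>b</span>" * 1000 + "</body></html>"
root = lxml.html.fromstring(html)

# Precompile the XPath expression so both candidates start on equal footing.
xpath_finder = etree.XPath("//*[self::div or self::span]")

candidates = {"xpath": xpath_finder}
if CSSSelector is not None:
    candidates["cssselect"] = CSSSelector("span,div")

timings = {}
for name, finder in candidates.items():
    # Total time in seconds for 20 full searches of the document.
    timings[name] = timeit.timeit(lambda: finder(root), number=20)
    print(f"{name}: {len(finder(root))} elements, "
          f"{timings[name]:.4f} s for 20 runs")
```

The numbers depend on the machine and on the document, so treat them as a relative comparison only.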