2012-10-15 133 views

lxml unable to parse gzipped XML?

I have this gzip-compressed XML file: http://cdon.com/xml_files/cdon_games_SE.xml.gz

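According to the lxml documentation (http://lxml.de/parsing.html), lxml can parse gzip-compressed XML files: "lxml can parse from a local file, an HTTP URL or an FTP URL. It also auto-detects and reads gzip-compressed XML files (.gz)."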

This code:

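import urllib  # Python 2 module that provides urlopen
from lxml import etree

# Open the gzip-compressed feed over HTTP and hand the response to lxml
tree = urllib.urlopen('http://cdon.com/xml_files/cdon_games_SE.xml.gz')
parser = etree.XMLParser(recover=True)
tree = etree.parse(tree, parser)
tree = tree.xpath('//product')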

gives this error:

tree = tree.xpath('//product')
    File "lxml.etree.pyx", line 2038, in lxml.etree._ElementTree.xpath (src/lxml\lxml.etree.c:47529) 
    File "lxml.etree.pyx", line 1709, in lxml.etree._ElementTree._assertHasRoot (src/lxml\lxml.etree.c:44508) 
AssertionError: ElementTree not initialized, missing root 

What is wrong? Can lxml not parse gzipped XML files? If I save the file as plain XML (without gzip) on a local server, it works.


Exact same problem here. It works fine when I decompress the file manually. – clg4

Answers


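The URL above returns the correct MIME type. Have you tried downloading the file and saving it as .xml.gz, to see whether lxml behaves differently on a file than on a request handle?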
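
One way to run that comparison, and to sidestep auto-detection entirely, is to decompress the stream yourself before the parser sees it. The sketch below is only an illustration under stated assumptions: it is written for Python 3 (the question uses the Python 2 `urllib.urlopen`), `SAMPLE_XML`, `parse_gzipped` and `count_products` are hypothetical names, and the sample document is a stand-in because the real feed's structure is not shown in the question. An in-memory buffer plays the role of the HTTP response so the example runs without network access.

```python
import gzip
import io
import os
import tempfile

try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption: fall back to the stdlib API so the sketch still runs without lxml
    import xml.etree.ElementTree as etree

# Hypothetical stand-in for the feed; the real file's structure is unknown
SAMPLE_XML = b"<products><product>A</product><product>B</product></products>"


def parse_gzipped(handle):
    """Explicitly gunzip a binary file-like object, then parse the XML."""
    with gzip.GzipFile(fileobj=handle) as unzipped:
        return etree.parse(unzipped)


def count_products(tree):
    # './/product' is understood by both lxml and ElementTree
    return len(tree.findall(".//product"))


compressed = gzip.compress(SAMPLE_XML)

# Case 1: an in-memory handle, standing in for the HTTP response object
from_handle = parse_gzipped(io.BytesIO(compressed))

# Case 2: the same bytes saved to disk as .xml.gz and reopened
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.xml.gz")
    with open(path, "wb") as f:
        f.write(compressed)
    with open(path, "rb") as f:
        from_file = parse_gzipped(f)

handle_count = count_products(from_handle)
file_count = count_products(from_file)
print(handle_count, file_count)  # prints: 2 2
```

If both cases succeed on the real data while passing the raw response to `etree.parse` still fails, that would suggest the gzip auto-detection is not being applied to the request handle, which matches the comment that manual decompression works.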