Python XMLParser: when is the data() method called?

I am learning Python and am having trouble understanding the behaviour of the XML parser (ElementTree's XMLParser).

I modified the example from the documentation:

class MaxDepth:                     # The target object of the parser
    path = ""
    def start(self, tag, attrib):   # Called for each opening tag.
        self.path += "/" + tag
        print '>>> Entering - ' + self.path
    def end(self, tag):             # Called for each closing tag.
        print '<<< Leaving - ' + self.path
        if self.path.endswith('/' + tag):
            self.path = self.path[:-(len(tag) + 1)]
    def data(self, data):           # Called for each chunk of text.
        if data:
            print '... data called ...'
            print data, 'length -', len(data)
    def close(self):                # Called when all data has been parsed.
        return self

It produces the following output:

>>> Entering - /a 
... data called ... 

length - 1 
... data called ... 
    length - 2 
>>> Entering - /a/b 
... data called ... 

length - 1 
... data called ... 
    length - 2 
<<< Leaving - /a/b 
... data called ... 

length - 1 
... data called ... 
    length - 2 
>>> Entering - /a/b 
... data called ... 

length - 1 
... data called ... 
    length - 4 
>>> Entering - /a/b/c 
... data called ... 

length - 1 
... data called ... 
     length - 6 
>>> Entering - /a/b/c/d 
... data called ... 

length - 1 
... data called ... 
     length - 6 
<<< Leaving - /a/b/c/d 
... data called ... 

length - 1 
... data called ... 
    length - 4 
<<< Leaving - /a/b/c 
... data called ... 

length - 1 
... data called ... 
    length - 2 
<<< Leaving - /a/b 
... data called ... 

length - 1 
<<< Leaving - /a 
<__main__.MaxDepth instance at 0x10e7dd5a8>

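My questions are:

When is the data() method called?
Why is it called twice before each start tag?
I could not find API documentation with more details about the data method. Where can I find a Javadoc-style API reference for classes like XMLParser?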

Source

2012-06-11 bsr

If your use case does not need event-based parsing, '.parse()' is easier to use: http://www.doughellmann.com/PyMOTW/xml/etree/ElementTree/parse.html. Otherwise, his example of watching events while parsing may help: http://www.doughellmann.com/PyMOTW/xml/etree/ElementTree/parse.html#watching-events-while-parsing – ninMonkey
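
To make the distinction in this comment concrete, here is a minimal sketch, written for Python 3, that contrasts whole-tree parsing with event-based parsing. The sample document and variable names are invented for illustration, and `ET.fromstring` stands in for `.parse()` so that the snippet does not need a file on disk.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the question.
XML = "<a>\n  <b>hello</b>\n  <b>world</b>\n</a>"

# Whole-tree parsing: no callbacks; each element's text is already assembled.
root = ET.fromstring(XML)
texts = [b.text for b in root.findall("b")]
print(texts)

# Event-based parsing: iterparse yields (event, element) pairs as it reads.
source = io.BytesIO(XML.encode("utf-8"))
events = [(event, elem.tag)
          for event, elem in ET.iterparse(source, events=("start", "end"))]
print(events)
```

With the tree-based calls there is no data() callback to think about at all; the chunking only becomes visible when you supply your own parser target.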

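If you modify the data method like this:

def data(self, data):
    if data:
        print '... data called ...'
        print repr(data), 'length -', len(data)

you will see why the data method is called several times: it is called once for each line of the text between the tags, so the newline and the indentation that follows it arrive as separate calls: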

>>> Entering - /a
... data called ...
'\n' length - 1
... data called ...
'  ' length - 2
>>> Entering - /a/b
... data called ...
'\n' length - 1
... data called ...
'  ' length - 2
<<< Leaving - /a/b
... data called ...
'\n' length - 1
... data called ...
'  ' length - 2
>>> Entering - /a/b
... data called ...
'\n' length - 1
... data called ...
'    ' length - 4
# ... etc ...

The XMLParser class is based on the Expat parser.
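
The same chunking can be observed by using Expat directly through the standard `xml.parsers.expat` module. The following is a minimal Python 3 sketch; the sample document and the helper `collect_chunks` are invented for illustration. It also tries the `buffer_text` attribute, which asks the parser to buffer text and avoid multiple handler calls where possible. The exact chunk boundaries are up to Expat, so the sketch only relies on the concatenated text being the same either way.

```python
import xml.parsers.expat

# Hypothetical sample document, not taken from the question.
XML = "<a>\n  <b>one\ntwo</b>\n</a>"

def collect_chunks(buffer_text):
    """Return every string Expat passes to the character data handler."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = buffer_text   # True: buffer text, fewer handler calls
    parser.CharacterDataHandler = chunks.append
    parser.Parse(XML, True)            # True marks this as the final input
    return chunks

raw = collect_chunks(False)
merged = collect_chunks(True)
print(raw)      # typically several small pieces
print(merged)   # the same text in no more pieces than above
```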

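In my experience, any streaming XML parser delivers text data as a series of chunks, and you have to concatenate all data events yourself until you reach the next start-tag or end-tag event. Parsers often split chunks at whitespace boundaries, but that is not guaranteed.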
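
One way to follow that advice is to buffer the chunks in the target and only act on them when a tag event arrives. The following is a minimal Python 3 sketch under that assumption; the class name `TextCollector`, its attributes and the sample document are invented for illustration and are not part of the question's code.

```python
from xml.etree.ElementTree import XMLParser

class TextCollector:
    """Parser target that joins data() chunks into one string per text run."""

    def __init__(self):
        self._chunks = []   # pieces received since the last tag event
        self.texts = []     # completed, non-blank text runs

    def _flush(self):
        text = "".join(self._chunks)
        self._chunks = []
        if text.strip():            # skip whitespace-only runs (indentation)
            self.texts.append(text)

    def start(self, tag, attrib):   # a start tag ends the current text run
        self._flush()

    def end(self, tag):             # so does an end tag
        self._flush()

    def data(self, data):           # may be called many times per text run
        self._chunks.append(data)

    def close(self):
        self._flush()
        return self.texts

# Hypothetical sample document with text that spans two lines.
SAMPLE = "<a>\n  <b>line one\nline two</b>\n  <b>tail</b>\n</a>"
parser = XMLParser(target=TextCollector())
parser.feed(SAMPLE)
result = parser.close()             # returns whatever the target's close() returns
print(result)                       # ['line one\nline two', 'tail']
```

Because the text is joined only at tag boundaries, the result does not depend on where the parser happens to split the chunks.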

Source

2012-06-11 16:05:08
