2011-11-04 48 views
2

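Pretty printing assertEqual() for HTML strings

I want to compare two strings that contain HTML in a Python unittest.

Is there a way to output the result in a human-friendly (diff-like) form?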

+1

Since version 1.4, Django has assertHTMLEqual: http://docs.djangoproject.com/en/dev/topics/testing/#django.test.SimpleTestCase.assertHTMLEqual – guettli

Answers

0

I (the one who asked this question) now use BeautifulSoup:

def assertEqualHTML(string1, string2, file1='', file2=''):
    u'''
    Compare two unicode strings containing HTML.
    A human friendly diff goes to logging.error() if they
    are not equal, and an exception gets raised.
    '''
    # Python 2 code: relies on the unicode type and BeautifulSoup 3.
    from BeautifulSoup import BeautifulSoup as bs
    import difflib
    import logging

    def short(mystr, limit=20):
        # Truncate long strings so the error message stays readable.
        return mystr[:limit]

    pretty = []
    for mystr, name in [(string1, file1), (string2, file2)]:
        if not isinstance(mystr, unicode):
            raise Exception(u'string is not unicode: %r %s' % (short(mystr), name))
        # prettify() puts every tag on its own line, so a line diff is readable.
        pretty.append(bs(mystr).prettify())
    if pretty[0] != pretty[1]:
        for line in difflib.unified_diff(pretty[0].splitlines(), pretty[1].splitlines(),
                                         fromfile=file1, tofile=file2, lineterm=''):
            logging.error(line)
        raise Exception('Not equal %s %s' % (file1, file2))

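
For readers on Python 3 without BeautifulSoup, the same idea (normalize both documents to one token per line, then run a line diff) can be sketched with the standard library only. This swaps BeautifulSoup's prettify() for a small html.parser subclass, so the output format differs from the answer above; the names VOID_TAGS, prettify and html_diff are made up for this illustration, and attribute order is not normalized.

```python
import difflib
from html.parser import HTMLParser

# Elements that never have a closing tag (assumed list, not exhaustive).
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class _Prettifier(HTMLParser):
    """Collect one indented line per tag or text node."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []
        self.depth = 0

    def _emit(self, text):
        self.lines.append("  " * self.depth + text)

    def handle_starttag(self, tag, attrs):
        # Reuse the raw tag text, collapsing internal whitespace.
        self._emit(" ".join(self.get_starttag_text().split()))
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        self._emit(" ".join(self.get_starttag_text().split()))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:
            self.depth = max(self.depth - 1, 0)
        self._emit("</%s>" % tag)

    def handle_data(self, data):
        text = " ".join(data.split())
        if text:  # skip whitespace-only text between tags
            self._emit(text)


def prettify(html):
    parser = _Prettifier()
    parser.feed(html)
    parser.close()
    return parser.lines


def html_diff(html1, html2, file1="first", file2="second"):
    """Return unified-diff lines; an empty list means no difference."""
    return list(difflib.unified_diff(prettify(html1), prettify(html2),
                                     fromfile=file1, tofile=file2, lineterm=""))
```

An empty result from html_diff() means the two documents are equal after whitespace normalization; otherwise the lines can be logged or joined into an assertion message.
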
1

Maybe this is a rather "verbose" solution. You can add a new "equality function" for a user-defined type (e.g. HTMLString), which you have to define first:

class HTMLString(str): 
    pass 

Now you have to define a type equality function:

def assertHTMLStringEqual(first, second, msg=None):
    # unittest calls type equality functions with a msg keyword argument.
    if first != second:
        message = ... # TODO here: format your message, e.g a diff
        raise AssertionError(message)

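All you have to do is format the message to your liking. You can also use a method of your specific TestCase as the type equality function. That gives you more help with formatting the message, because unittest.TestCase already does a lot of this work.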
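
As one possible way to fill in the message, here is a sketch that builds it from a unified diff of the two strings. The helper name html_diff_message and the "first"/"second" labels are assumptions made for this example, not part of unittest.

```python
import difflib


def html_diff_message(first, second, fromfile="first", tofile="second"):
    """Build a failure message containing a unified diff of the two strings."""
    diff = difflib.unified_diff(first.splitlines(), second.splitlines(),
                                fromfile=fromfile, tofile=tofile, lineterm="")
    return "HTML strings differ:\n" + "\n".join(diff)


def assertHTMLStringEqual(first, second, msg=None):
    # A caller-supplied msg wins; otherwise fall back to the diff.
    if first != second:
        raise AssertionError(msg or html_diff_message(first, second))
```

A line diff only helps if the HTML spans several lines, so it may be worth pretty-printing both strings before comparing them.
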

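Now you have to register this equality function with your unittest.TestCase:

...
def __init__(self, *args, **kwargs):
    # TestCase.__init__ receives the test method name, so forward all arguments.
    unittest.TestCase.__init__(self, *args, **kwargs)
    self.addTypeEqualityFunc(HTMLString, assertHTMLStringEqual)

The same, with a method of the class:

...
def __init__(self, *args, **kwargs):
    unittest.TestCase.__init__(self, *args, **kwargs)
    # A string is looked up as a method name on the test case instance.
    self.addTypeEqualityFunc(HTMLString, 'assertHTMLStringEqual')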

Now you can use it in your tests:

def test_something(self): 
    htmlstring1 = HTMLString(...) 
    htmlstring2 = HTMLString(...) 
    self.assertEqual(htmlstring1, htmlstring2) 

This should work with Python 2.7.
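
Putting the pieces together under Python 3, a minimal end-to-end sketch could look like the following. HTMLTestCase and run_demo are hypothetical names; registering in setUp() and delegating to assertMultiLineEqual() are choices made for this example rather than something the answer prescribes.

```python
import io
import unittest


class HTMLString(str):
    """Marker type so assertEqual() can dispatch on it."""


class HTMLTestCase(unittest.TestCase):
    def setUp(self):
        # Registered per instance; assertEqual() uses it only when both
        # arguments are exactly of type HTMLString.
        self.addTypeEqualityFunc(HTMLString, self.assertHTMLStringEqual)

    def assertHTMLStringEqual(self, first, second, msg=None):
        # Let unittest build the line-by-line diff message.
        self.assertMultiLineEqual(first, second, msg)


def run_demo():
    """Run one deliberately failing test and return (result, output)."""

    class Demo(HTMLTestCase):
        def test_differs(self):
            self.assertEqual(HTMLString("<p>\n a\n</p>\n"),
                             HTMLString("<p>\n b\n</p>\n"))

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(Demo)
    stream = io.StringIO()
    result = unittest.TextTestRunner(stream=stream).run(suite)
    return result, stream.getvalue()
```

Calling run_demo() reports one failure whose message contains the differing lines marked with - and +.
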

2

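A few years ago I submitted a patch to do this. The patch was rejected, but you can still view it on the python bug list.

I doubt you would want to hack your unittest.py to apply the patch (if it even still works after all this time), but here is the function for reducing two strings to a manageable size while still keeping at least part of what differs. As long as you don't need the complete differences, this might be what you want: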

def shortdiff(x, y):
    '''shortdiff(x,y)

    Compare strings x and y and display differences.
    If the strings are too long, shorten them to fit
    in one line, while still keeping at least some difference.
    '''
    import difflib
    LINELEN = 79

    def limit(s):
        # Truncate to LINELEN characters, marking the cut with an ellipsis.
        if len(s) > LINELEN:
            return s[:LINELEN-3] + '...'
        return s

    def firstdiff(s, t):
        # Compare in 1000-character chunks, then find the exact index of the
        # first differing character. Returns None if the strings are equal.
        span = 1000
        for pos in range(0, max(len(s), len(t)), span):
            if s[pos:pos+span] != t[pos:pos+span]:
                for index in range(pos, pos+span):
                    if s[index:index+1] != t[index:index+1]:
                        return index

    left = LINELEN // 4  # integer division, so slicing also works on Python 3
    index = firstdiff(x, y)
    if index is not None and index > left + 7:
        # Keep the start of each string and jump ahead to the first difference.
        x = x[:left] + '...' + x[index-4:index+LINELEN]
        y = y[:left] + '...' + y[index-4:index+LINELEN]
    else:
        x, y = x[:LINELEN+1], y[:LINELEN+1]
        left = 0

    cruncher = difflib.SequenceMatcher(None)
    xtags = ytags = ""
    cruncher.set_seqs(x, y)
    # Marker characters written under x and y for each kind of edit.
    editchars = {'replace': ('^', '^'),
                 'delete': ('-', ''),
                 'insert': ('', '+'),
                 'equal': (' ', ' ')}
    for tag, xi1, xi2, yj1, yj2 in cruncher.get_opcodes():
        lx, ly = xi2 - xi1, yj2 - yj1
        edits = editchars[tag]
        xtags += edits[0] * lx
        ytags += edits[1] * ly

    # Include ellipsis in edits line.
    if left:
        xtags = xtags[:left] + '...' + xtags[left+3:]
        ytags = ytags[:left] + '...' + ytags[left+3:]

    diffs = [x, xtags, y, ytags]
    if max([len(s) for s in diffs]) < LINELEN:
        return '\n'.join(diffs)

    diffs = [limit(s) for s in diffs]
    return '\n'.join(diffs)

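
To see what the marker lines look like in isolation, the opcode loop can be pulled out into a small helper. This is only an illustrative extract under the assumption of short inputs; edit_markers is a made-up name and the function does none of the shortening that shortdiff() performs.

```python
import difflib


def edit_markers(x, y):
    """Return marker strings to print under x and y respectively."""
    marks = {"replace": ("^", "^"),
             "delete": ("-", ""),
             "insert": ("", "+"),
             "equal": (" ", " ")}
    xtags = ytags = ""
    matcher = difflib.SequenceMatcher(None, x, y)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        xtags += marks[tag][0] * (i2 - i1)
        ytags += marks[tag][1] * (j2 - j1)
    return xtags, ytags


print(edit_markers("abc", "abd"))   # ('  ^', '  ^')
print(edit_markers("abc", "abcd"))  # ('   ', '   +')
```

Each marker string has the same length as the string it annotates, so printing them on alternating lines keeps the markers aligned with the characters they describe.
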
2

A simple approach is to strip whitespace from the HTML and split it into a list. Python 2.7's unittest (or the backported unittest2) then gives a human-readable diff between the lists.

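import re

def split_html(html):
    # One list item per line, with surrounding whitespace removed.
    return re.split(r'\s*\n\s*', html.strip())

def test_render_html(self):
    # Method of a unittest.TestCase; render_html() is the code under test.
    expected = ['<div>', '...', '</div>']
    got = split_html(render_html())
    self.assertEqual(expected, got)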

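When I write tests for code that already works, I usually start with expected = [], insert self.maxDiff = None before the assertion, and let the test fail once. The expected list can then be copied and pasted from the test output.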
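
That workflow can be tried outside a test run with a throwaway TestCase instance. The sketch below assumes Python 3, where unittest.TestCase() can be instantiated without a test method; list_diff_message is a hypothetical helper name.

```python
import re
import unittest


def split_html(html):
    return re.split(r"\s*\n\s*", html.strip())


def list_diff_message(expected, got):
    """Return the failure message assertEqual() would produce, or ''."""
    case = unittest.TestCase()
    case.maxDiff = None  # do not truncate long diffs
    try:
        case.assertEqual(expected, got)
    except AssertionError as exc:
        return str(exc)
    return ""


# Start with an empty expectation and read the real list off the message.
print(list_diff_message([], split_html("<div>\n  <p>Hi</p>\n</div>")))
```

The printed message includes the full list that was produced, which can then be pasted back into the test as the expected value.
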

You may need to adjust how the whitespace is stripped, depending on what your HTML looks like.
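
As an example of such an adjustment, the variant below also breaks the string between adjacent tags, so HTML rendered on a single line still yields one item per tag. It is a sketch under the assumption that whitespace between tags is insignificant for the markup being tested; split_html_tags is a made-up name.

```python
import re


def split_html_tags(html):
    """Split on newlines and between adjacent tags, dropping empty items."""
    # Force a line break wherever one tag directly follows another.
    separated = re.sub(r">\s*<", ">\n<", html.strip())
    return [line for line in re.split(r"\s*\n\s*", separated) if line]


print(split_html_tags("<div><p>Hi</p>\n  </div>"))
# ['<div>', '<p>Hi</p>', '</div>']
```

Text that sits between a tag and its closing tag stays on the same item, which keeps short elements readable in the list diff.
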