2016-09-21 100 views

Unit testing XML encoding issue

Consider the following class:

using System.IO;
using System.Text;
using System.Xml;

public class SampleXmlGenerator
{
    public byte[] GenerateDocumentBytes()
    {
        byte[] fileBytes;
        using (var xmlStream = new MemoryStream())
        {
            using (var myWriter = new XmlTextWriter(xmlStream, Encoding.GetEncoding("UTF-8")))
            {
                myWriter.Formatting = Formatting.Indented;
                myWriter.Indentation = 4;
                myWriter.IndentChar = ' ';
                myWriter.WriteStartDocument();
                myWriter.WriteStartElement("foo");
                myWriter.WriteString("bar");
                myWriter.WriteEndElement(); // end foo

                myWriter.Flush();

                fileBytes = xmlStream.ToArray();
            }
        }

        return fileBytes;
    }
}

With the following unit test:

using System;
using System.Text;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class TestSampleXmlGenerator
{
    [TestMethod]
    public void TextEmptyDocument()
    {
        var actualBytes = new SampleXmlGenerator().GenerateDocumentBytes();
        var actualUtf8String = Encoding.UTF8.GetString(actualBytes);
        Console.Out.WriteLine("// actualUtf8String");
        Console.Out.WriteLine(actualUtf8String);

        var actualDefaultString = Encoding.Default.GetString(actualBytes);
        Console.Out.WriteLine("// actualDefaultString");
        Console.Out.WriteLine(actualDefaultString);

        // The line break inside the verbatim string is part of the expected content.
        var expectedString = @"<?xml version=""1.0"" encoding=""utf-8""?>
<foo>bar</foo>";
        var expectedBytes = Encoding.UTF8.GetBytes(expectedString);

//      var expectedBytes = Encoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(@"<?xml version=""1.0"" encoding=""utf-8""?>
//<foo>bar</foo>"));
//      var expectedString = Encoding.UTF8.GetString(expectedBytes);

        Console.Out.WriteLine("// expectedString");
        Console.Out.WriteLine(expectedString);

        Assert.AreEqual(expectedBytes.Length, actualBytes.Length);
        //Assert.AreEqual(expectedString, actualUtf8String);
    }
}

And finally the output:

Assert.AreEqual failed. Expected:<54>. Actual:<57>. 

// actualUtf8String 
<?xml version="1.0" encoding="utf-8"?> 
<foo>bar</foo> 

// actualDefaultString 
<?xml version="1.0" encoding="utf-8"?> 
<foo>bar</foo> 

// expectedString 
<?xml version="1.0" encoding="utf-8"?> 
<foo>bar</foo> 

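expectedString and actualUtf8String look the same, but they are not.

actualDefaultString shows 3 extra characters at the beginning.

So what gives? How do I go about testing/comparing the generated XML? What should I be doing differently?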
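For readers puzzling over the 54 versus 57 mismatch, here is a minimal sketch of where three extra bytes can come from. It assumes the difference is the UTF-8 byte order mark (BOM), which is what the comments and the answer in this thread point to. The class name BomDemo is made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BomDemo
{
    public static void Main()
    {
        // The UTF-8 byte order mark is the three bytes EF BB BF.
        byte[] preamble = Encoding.UTF8.GetPreamble();
        Console.WriteLine(BitConverter.ToString(preamble)); // EF-BB-BF

        // Encoding.GetBytes never emits the preamble, so bytes built from a
        // string are three bytes shorter than the same text prefixed with a BOM.
        byte[] withoutBom = Encoding.UTF8.GetBytes("<foo>bar</foo>");
        byte[] withBom = preamble.Concat(withoutBom).ToArray();
        Console.WriteLine(withBom.Length - withoutBom.Length); // 3
    }
}
```

Encoding.UTF8.GetString does not strip a BOM either: it decodes it to the invisible character U+FEFF, which would explain why the two strings print identically yet differ.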

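+1

There is a constructor, https://msdn.microsoft.com/en-us/library/s064f8w2(v=vs.110).aspx, that eliminates the byte order mark.

+0

Thanks Martin, but I don't want to remove the BOM from the document, just ignore it during testing. Your link did point me in the right direction as to what to look for, though. – CrnaStena

+0

Why not create a test function that checks for the BOM and returns the rest as a string?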
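The two suggestions in the comments above could look roughly like the sketch below. The names BomHelpers, StripUtf8Bom and GenerateWithoutBom are hypothetical and not part of the original post. The first helper ignores a leading BOM on the test side only. The second shows the constructor route, new UTF8Encoding(false), which changes the generated document itself (not what the asker wanted, but useful for comparison).

```csharp
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;

public static class BomHelpers
{
    // Hypothetical test helper: returns the bytes without a leading UTF-8 BOM, if any.
    public static byte[] StripUtf8Bom(byte[] bytes)
    {
        byte[] bom = Encoding.UTF8.GetPreamble();
        bool hasBom = bytes.Length >= bom.Length
            && bytes.Take(bom.Length).SequenceEqual(bom);
        return hasBom ? bytes.Skip(bom.Length).ToArray() : bytes;
    }

    // Writes the same document as the question, but with an encoding
    // instance that is told not to emit the UTF-8 identifier (BOM).
    public static byte[] GenerateWithoutBom()
    {
        using (var stream = new MemoryStream())
        using (var writer = new XmlTextWriter(stream, new UTF8Encoding(false)))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("foo");
            writer.WriteString("bar");
            writer.WriteEndElement();
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

In the test from the question, something like BomHelpers.StripUtf8Bom(actualBytes) could then be compared against expectedBytes while the production code keeps writing the BOM.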

Answer


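Based on the suggestions from Martin and Keith, and with some additional research, I ended up stripping the BOM from the generated XML bytes inside the unit test, in the following way (based on an SO article):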

var xmlBytes = new SampleXmlGenerator().GenerateDocumentBytes();
var newXmlDoc = new XmlDocument { PreserveWhitespace = true };
newXmlDoc.Load(new MemoryStream(xmlBytes));
var actualBytes = Encoding.UTF8.GetBytes(newXmlDoc.OuterXml);

Now the unit test passes!
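As an alternative to comparing bytes, the XML could be compared structurally. The sketch below is not the approach used in this answer. It swaps in LINQ to XML (XDocument.Load and XNode.DeepEquals), and the names XmlCompare and AreEquivalent are made up for illustration. Note that with the default load options insignificant whitespace is dropped, so this would not verify indentation.

```csharp
using System.IO;
using System.Xml.Linq;

public static class XmlCompare
{
    // Hypothetical helper: true when both inputs parse to equivalent XML trees.
    public static bool AreEquivalent(byte[] actualBytes, string expectedXml)
    {
        XDocument actual;
        using (var stream = new MemoryStream(actualBytes))
        {
            // The underlying reader detects and consumes a BOM on its own.
            actual = XDocument.Load(stream);
        }

        XDocument expected = XDocument.Parse(expectedXml);
        return XNode.DeepEquals(actual, expected);
    }
}
```

In the test from the question this might be used as Assert.IsTrue(XmlCompare.AreEquivalent(actualBytes, expectedString)), which leaves the BOM in the generated document and simply does not care about it.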