2011-09-30 151 views
2

iTextSharp HTML to PDF: preserving white space

I am using FreeTextBox.dll to get user input and storing that information as HTML in the database. A sample of the user's input is below:

                                                                     133 Peachtree St NE
                                                                     Atlanta,  GA 30303
                                                                     404-652-7777

                                                                     Cindy Cooley
                                                                     www.somecompany.com
                                                                     Product Stewardship Mgr

                                                                    9/9/2011
Deidre's Company
123 Test St
Atlanta, GA 30303

Test test.

 

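I want HTMLWorker to preserve the white space the user entered, but it strips it out. Is there a way to keep the user's white space? Below is an example of how I am creating my PDF document.

Public Shared Sub CreatePreviewPDF(ByVal vsHTML As String, ByVal vsFileName As String)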

 Dim output As New MemoryStream() 
     Dim oDocument As New Document(PageSize.LETTER) 
     Dim writer As PdfWriter = PdfWriter.GetInstance(oDocument, output) 
     Dim oFont As New Font(Font.FontFamily.TIMES_ROMAN, 8, Font.NORMAL, BaseColor.BLACK) 

     Using output 
      Using writer 
       Using oDocument 
        oDocument.Open() 
        Using sr As New StringReader(vsHTML) 
         Using worker As New html.simpleparser.HTMLWorker(oDocument) 

          worker.StartDocument() 
          worker.SetInsidePRE(True) 
          worker.Parse(sr) 
          worker.EndDocument() 
          worker.Close() 
          oDocument.Close() 

         End Using 
        End Using 

        HttpContext.Current.Response.ContentType = "application/pdf" 
        HttpContext.Current.Response.AddHeader("Content-Disposition", String.Format("attachment;filename={0}.pdf", vsFileName)) 
        HttpContext.Current.Response.BinaryWrite(output.ToArray()) 
        HttpContext.Current.Response.End() 

       End Using 
      End Using 
      output.Close() 
     End Using 


    End Sub 
+0

Just to help you out - this may be wrong, but if you retag this as Visual Basic, you may get more help. – element119

Answers

0

Thanks, everyone, for the help. I was able to find a small workaround by doing the following:

vsHTML.Replace(" ", "&nbsp;&nbsp;").Replace(Chr(9), "&nbsp;&nbsp;&nbsp;&nbsp;").Replace(Chr(160), "&nbsp;").Replace(vbCrLf, "<br />") 

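The actual code does not display properly here, but the first Replace substitutes &nbsp; for spaces, and the Chr(160) Replace substitutes &nbsp; for the non-breaking space character.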
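Since the snippet did not survive posting intact, here is a minimal sketch of the same idea, not the poster's exact code. It assumes the input is plain text rather than markup, and that only runs of two or more spaces need protecting, so single spaces (and tags such as `<br />`) are left alone. The names `EncodeWhitespace` and `ExpandRun` are hypothetical helpers, not part of iTextSharp or FreeTextBox.

```vb
Imports System
Imports System.Text
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module WhitespaceSketch

    ' Hypothetical helper: turns collapsible white space into entities
    ' before the string is handed to an HTML-to-PDF converter.
    Function EncodeWhitespace(ByVal input As String) As String
        ' A tab becomes four non-breaking spaces.
        Dim result As String = input.Replace(vbTab, "&nbsp;&nbsp;&nbsp;&nbsp;")
        ' U+00A0 is the literal non-breaking space character.
        result = result.Replace(ChrW(160).ToString(), "&nbsp;")
        ' Assumption: only runs of two or more spaces are rewritten.
        result = Regex.Replace(result, " {2,}", AddressOf ExpandRun)
        ' Line breaks go last so the space inside "<br />" is untouched.
        Return result.Replace(vbCrLf, "<br />")
    End Function

    ' Emits one &nbsp; per space in the matched run.
    Private Function ExpandRun(ByVal m As Match) As String
        Return New StringBuilder().Insert(0, "&nbsp;", m.Length).ToString()
    End Function

    Sub Main()
        ' Prints: A&nbsp;&nbsp;&nbsp;&nbsp;B&nbsp;&nbsp;&nbsp;C<br />D
        Console.WriteLine(EncodeWhitespace("A" & vbTab & "B   C" & vbCrLf & "D"))
    End Sub

End Module
```

Because String.Replace returns a new string, the result has to be assigned or passed on; calling it as a bare statement changes nothing.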

0

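I would suggest using wkhtmltopdf instead of iText. wkhtmltopdf outputs the HTML exactly as WebKit (Google Chrome, Safari) renders it, rather than relying on iText's conversion. It is just a binary that you can call. That said, I would check the HTML to make sure the user's input contains paragraphs and/or line breaks. They may be getting stripped out before the conversion.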
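As a rough illustration of calling the binary, the sketch below shells out to wkhtmltopdf with an input HTML file and an output PDF path. The function name `ConvertHtmlFile` and all three paths are assumptions for the example; it also assumes the HTML has already been written to disk and that a zero exit code means success.

```vb
Imports System
Imports System.Diagnostics

Module WkHtmlToPdfSketch

    ' Hypothetical wrapper: runs "wkhtmltopdf <input> <output>" and
    ' reports whether the process exited with code 0.
    Function ConvertHtmlFile(ByVal exePath As String,
                             ByVal htmlPath As String,
                             ByVal pdfPath As String) As Boolean
        Dim psi As New ProcessStartInfo()
        psi.FileName = exePath
        ' Quote both paths so spaces in file names survive.
        psi.Arguments = String.Format("""{0}"" ""{1}""", htmlPath, pdfPath)
        psi.UseShellExecute = False
        psi.CreateNoWindow = True

        Using p As Process = Process.Start(psi)
            p.WaitForExit()
            Return p.ExitCode = 0
        End Using
    End Function

End Module
```

Note that a browser engine also collapses runs of ordinary spaces, so the input would still need &nbsp; entities or a pre-formatted block for the spacing to survive.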

+0

Thanks. We decided to go with http://www.html-to-pdf.net/ExpertPDF-HtmlToPdf-Converter.aspx. It works well. – user973754

1

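There is a glitch in iText and iTextSharp, but you can fix it fairly easily if you don't mind downloading the source and recompiling it. You need to change two files. Every change I made is commented inline in the code. Line numbers are based on the 5.1.2.0 code, rev 240.

The first change is in iTextSharp.text.html.HtmlUtilities.cs. Find the function EliminateWhiteSpace at line 249 and change it to: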

public static String EliminateWhiteSpace(String content) { 
     // multiple spaces are reduced to one, 
     // newlines are treated as spaces, 
     // tabs, carriage returns are ignored. 
     StringBuilder buf = new StringBuilder(); 
     int len = content.Length; 
     char character; 
     bool newline = false; 
     bool space = false;//Detect whether we have written at least one space already 
     for (int i = 0; i < len; i++) { 
      switch (character = content[i]) { 
      case ' ': 
       if (!newline && !space) {//If we are not at a new line AND ALSO did not just append a space 
        buf.Append(character); 
        space = true; //flag that we just wrote a space 
       } 
       break; 
      case '\n': 
       if (i > 0) { 
        newline = true; 
        buf.Append(' '); 
       } 
       break; 
      case '\r': 
       break; 
      case '\t': 
       break; 
      default: 
       newline = false; 
       space = false; //reset flag 
       buf.Append(character); 
       break; 
      } 
     } 
     return buf.ToString(); 
    } 

The second change is in iTextSharp.text.xml.simpleparser.SimpleXMLParser.cs. In the function Go, which starts at line 185, change line 248 to:

if (html /*&& nowhite*/) {//removed the nowhite check from here because that should be handled by the HTML parser later, not the XML parser