2016-12-05

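Trouble implementing iTextSharp HTML-to-PDF conversion

Edit: The answer below is the perfect solution (the streams were being closed in the wrong order). In the end I went with the open-source alternative of PreMailer.Net + HtmlAgilityPack + wkHTMLtoPDF, because it fits my needs better.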

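I am trying to implement HTML-to-PDF conversion with iTextSharp in C#, including resolving relative URIs for links and images. To try it out, I have a very basic implementation of the "Changing the default configuration" example (http://demo.itextsupport.com/xmlworker/itextdoc/flatsite.html), ported from Java to C#. However, when I feed a sample HTML file (which I have tested) into my program, the PDF it creates contains only the following when viewed in a text editor: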

%PDF-1.4 
%âãÏÓ 

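This looks wrong. In addition, the MemoryStream has only a few bytes in it. Is there a problem with my iTextSharp implementation, or am I using streams or some other C# construct incorrectly?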

using System.IO; 
using System.Text; 
using iTextSharp.text; 
using iTextSharp.text.pdf; 
using iTextSharp.tool.xml.html; 
using iTextSharp.tool.xml.pipeline.html; 
using iTextSharp.tool.xml; 
using iTextSharp.tool.xml.parser; 
using iTextSharp.tool.xml.pipeline.css; 
using iTextSharp.tool.xml.pipeline.end; 

class Program
{
    static void Main(string[] args)
    {
        FontFactory.RegisterDirectories();
        var document = new Document();
        var memoryStream = new MemoryStream();
        var pdfWriter = PdfWriter.GetInstance(document, memoryStream);
        document.Open();

        var htmlContext = new HtmlPipelineContext(null);
        htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
        htmlContext.SetImageProvider(new ImageProvider());
        htmlContext.SetLinkProvider(new LinkProvider());
        htmlContext.CharSet(Encoding.UTF8);

        var cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(true);
        var pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(document, pdfWriter)));
        var xmlWorker = new XMLWorker(pipeline, true);
        var xmlParser = new XMLParser(xmlWorker);

        var inputFileStream = new FileStream("testHTML.html", FileMode.Open);
        xmlParser.Parse(inputFileStream);
        inputFileStream.Close();

        memoryStream.Position = 0;
        pdfWriter.CloseStream = false;

        var outputFileStream = new FileStream("testOutput.pdf", FileMode.Create, FileAccess.Write);
        memoryStream.WriteTo(outputFileStream);

        outputFileStream.Close();
        document.Close();
    }
}

class ImageProvider : AbstractImageProvider
{
    public override string GetImageRootPath()
    {
        return "testDir/";
    }
}

class LinkProvider : ILinkProvider
{
    public string GetLinkRoot()
    {
        return "http://www.examplesite.com/testdir/";
    }
}

Thank you very much for your time and help!

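    memoryStream.WriteTo(outputFileStream);

    outputFileStream.Close();
    document.Close();

However, iText only finishes writing the PDF when the document is closed; in particular, that is when it flushes the content of the current (last) page and adds the cross-references, etc.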

I don't see you writing anything to 'pdfWriter'. What do you expect it to print?

I intended to print the contents of the HTML as a PDF.

But you open 'testHTML.html' and never do anything with the data.

Answer

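You are grabbing the contents of the memory stream before closing the iText document, so the cross-references etc. have not been written to it yet.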
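A quick way to tell whether a PDF was captured before it was finished is to look for the end-of-file marker: a complete PDF file starts with `%PDF-` and ends with `%%EOF`, which comes after the cross-reference data. The following is only a rough sketch of such a check. `PdfSanityCheck` and `LooksFinished` are hypothetical names, not part of iTextSharp, and the check is a heuristic rather than real PDF validation.

```csharp
using System;
using System.Text;

static class PdfSanityCheck
{
    // Hypothetical helper: returns true if the bytes start with the PDF
    // header and contain the "%%EOF" marker near the end.
    public static bool LooksFinished(byte[] pdfBytes)
    {
        if (pdfBytes == null || pdfBytes.Length < 5)
        {
            return false;
        }

        // ASCII decoding is enough here: non-ASCII bytes become '?',
        // which cannot be mistaken for either marker.
        string head = Encoding.ASCII.GetString(pdfBytes, 0, 5);

        // Only inspect the last 1024 bytes (or the whole array if shorter).
        int tailLength = Math.Min(pdfBytes.Length, 1024);
        string tail = Encoding.ASCII.GetString(
            pdfBytes, pdfBytes.Length - tailLength, tailLength);

        return head == "%PDF-" && tail.Contains("%%EOF");
    }
}
```

Calling `PdfSanityCheck.LooksFinished(memoryStream.ToArray())` before and after `document.Close()` should show the difference: the two-line output quoted in the question has the header but no `%%EOF`.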

So change your code from this

    memoryStream.Position = 0;
    pdfWriter.CloseStream = false;

    var outputFileStream = new FileStream("testOutput.pdf", FileMode.Create, FileAccess.Write);
    memoryStream.WriteTo(outputFileStream);

    outputFileStream.Close();
    document.Close();

to this

    pdfWriter.CloseStream = false;
    document.Close();

    var outputFileStream = new FileStream("testOutput.pdf", FileMode.Create, FileAccess.Write);
    memoryStream.Position = 0;
    memoryStream.WriteTo(outputFileStream);
    outputFileStream.Close();
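For reference, the corrected ordering can be wrapped into a small helper along these lines. This is a minimal sketch under a few assumptions, not a drop-in replacement for the question's code: `HtmlToPdf` and `ConvertFile` are hypothetical names, the input is assumed to be well-formed XHTML, and it uses the default `XMLWorkerHelper.ParseXHtml` pipeline rather than the custom image and link providers from the question.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

static class HtmlToPdf
{
    // Hypothetical helper: converts an XHTML file to PDF bytes.
    public static byte[] ConvertFile(string htmlPath)
    {
        using (var memoryStream = new MemoryStream())
        {
            var document = new Document();
            var pdfWriter = PdfWriter.GetInstance(document, memoryStream);

            // Keep the MemoryStream open when the document is closed.
            pdfWriter.CloseStream = false;
            document.Open();

            using (var reader = new StreamReader(htmlPath))
            {
                // Default XML Worker configuration (an assumption; the
                // question builds its own pipeline instead).
                XMLWorkerHelper.GetInstance().ParseXHtml(pdfWriter, document, reader);
            }

            // Closing the document is what finishes the PDF, so it must
            // happen before the bytes are read.
            document.Close();

            return memoryStream.ToArray();
        }
    }
}
```

A caller could then write the result with `File.WriteAllBytes("testOutput.pdf", HtmlToPdf.ConvertFile("testHTML.html"));`. Returning a byte array keeps all stream handling in one place, so the bytes cannot be read before the document is closed.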