iText: styles ignored when parsing HTML to PDF

I followed this link: How to export html page to pdf format?

My snippet:

    String str = "<html><head><body><div style=\"width:100%;height:100%;\"><h3 style=\"margin-left:5px;margin-top:40px\">First</h3><div style=\"margin-left:15px;margin-top:15px\"><title></title><p>sdasdasd shshshshdffgdfgd</p></div><h3 style=\"margin-left:5px;margin-top:40px\">The dream</h3><div style=\"margin-left:15px;margin-top:15px\"></div></div></body></head></html>";
    String fileNameWithPath = "/Users/cecco/Desktop/pdf2.pdf"; 


    com.itextpdf.text.Document document = 
      new com.itextpdf.text.Document(com.itextpdf.text.PageSize.A4); 
    FileOutputStream fos = new FileOutputStream(fileNameWithPath); 
    com.itextpdf.text.pdf.PdfWriter pdfWriter = 
      com.itextpdf.text.pdf.PdfWriter.getInstance(document, fos); 

    document.open(); 

    document.addAuthor("Myself"); 
    document.addSubject("My Subject"); 
    document.addCreationDate(); 
    document.addTitle("My Title"); 

    com.itextpdf.text.html.simpleparser.HTMLWorker htmlWorker = 
      new com.itextpdf.text.html.simpleparser.HTMLWorker(document); 
    htmlWorker.parse(new StringReader(str.toString())); 

    document.close(); 
    fos.close();

This works fine.

But the style attributes on the h3 and div tags are not taken into account.

However, if I paste my HTML into http://htmledit.squarefree.com/ everything renders correctly.

How can I solve this problem?

2012-12-11 CeccoCQ

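iText is not the best HTML parser, but you can use Flying-Saucer instead. Flying Saucer is built on top of iText but has a capable XML / (X)HTML parser. In short: Flying Saucer is ideal if you want HTML -> PDF.

Here is how to generate a PDF from a string: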

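import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import org.xhtmlrenderer.pdf.ITextRenderer;

/*
 * Note: I put some text in the title tag and fixed the head tag
 * (the whole body tag was inside the head).
 */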
String str = "<html><head></head><body><div style=\"width:100%;height:100%;\"><h3 style=\"margin-left:5px;margin-top:40px\">First</h3><div style=\"margin-left:15px;margin-top:15px\"><title>t</title><p>sdasdasd shshshshdffgdfgd</p></div><h3 style=\"margin-left:5px;margin-top:40px\">The dream</h3><div style=\"margin-left:15px;margin-top:15px\"></div></div></body></html>"; 

OutputStream os = new FileOutputStream(new File("example.pdf")); 

ITextRenderer renderer = new ITextRenderer(); 
renderer.setDocumentFromString(str); 
renderer.layout(); 
renderer.createPDF(os); 

os.close();

But: FS only supports valid HTML/XHTML/XML, so make sure yours is.
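
To make the "valid XML" requirement concrete, here is a minimal sketch of a check you could run before handing a string to the renderer. It uses only the JDK's own XML parser; the class name WellFormedCheck and the method isWellFormed are made up for this illustration and are not part of Flying Saucer or iText.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; not part of Flying Saucer or iText.
public class WellFormedCheck {

    // Returns true if the markup parses as well-formed XML with the JDK parser.
    public static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (Exception e) {
            // Malformed markup surfaces as a SAXException; any failure counts as "not usable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Same document twice; the second one leaves <br> unclosed.
        String closed = "<html><head><title>t</title></head><body><p>text<br/></p></body></html>";
        String unclosed = "<html><head><title>t</title></head><body><p>text<br></p></body></html>";
        System.out.println(isWellFormed(closed));   // true
        System.out.println(isWellFormed(unclosed)); // false
    }
}
```

Keep in mind this only tests well-formedness (every tag closed and properly nested). It will not catch structural mistakes such as a body nested inside the head, so it is a first filter rather than a full validation.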

2013-02-04 17:45:30 ollo

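Switching to Flying Saucer and using it as shown in this answer solved all my HTML-to-PDF parsing problems. As ollo points out, you should first "clean up" the string so that it is actually valid HTML. I used Jsoup to parse the HTML for that.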