I'm working on a method that takes a String of HTML and returns an analogous javax.swing.text.html.HTMLDocument. What is the most efficient way of doing this?

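The way I'm currently doing this is to use a SAX parser to parse the HTML string. I keep track of when I hit an open tag (for example, <i>). When I hit the corresponding close tag (for example, </i>), I apply the italic style to the characters I've hit in between.

This certainly works, but it isn't fast enough. Is there a faster way of doing this?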

Source

2011-07-14 Paul Reiners

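Try using the HTMLEditorKit class. It supports parsing HTML content that can be read directly from a String (e.g. through a StringReader). There seems to be an article about how to do this.

Edit: To give an example, basically I think this could be done like so (after the code is executed, htmlDoc should contain the loaded document...):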

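import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;
Reader stringReader = new StringReader(string);
HTMLEditorKit htmlKit = new HTMLEditorKit();
HTMLDocument htmlDoc = (HTMLDocument) htmlKit.createDefaultDocument();
HTMLEditorKit.Parser parser = new ParserDelegator();
parser.parse(stringReader, htmlDoc.getReader(0), true);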

Source

2011-07-14 18:28:17 mouser

This looks right, but it doesn't seem to work. Consider this test case: public void testMakeHTMLDocument() throws Exception { final String html = "\n" + "\n" + "\n" + "My first heading\n" + "\n" + "My first paragraph.\n" + "\n" + "\n" + ""; final HTMLDocument htmlDocument = MyHTMLDocumentLoader.makeHTMLDocument(html); htmlDocument.dump(System.out); } –

It dumps this: <body name=body >

<content name=content > [0,1][ ] <bidi level bidiLevel=0 > [0,1][ ] –

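I'm a bit afraid this is down to HTMLEditorKit's weak HTML support; according to the javadoc, "The default support is provided by this class, which supports HTML version 3.2 (with some extensions), and is migrating toward version 4.0" - I'm afraid you will need to handle the tags manually in the callback - not sure whether that is any better than your original approach :( – mouser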
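One possible explanation for the empty dump, offered as an assumption rather than a confirmed diagnosis: the callback returned by htmlDoc.getReader(0) buffers the parsed elements, and they only reach the document once flush() is called on it. A minimal sketch of the ParserDelegator approach with that extra call follows; the class name FlushingLoader and the sample markup are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, named here only for illustration.
class FlushingLoader {

    /** Parses the given HTML into a fresh HTMLDocument. */
    static HTMLDocument makeHTMLDocument(String html)
            throws IOException, BadLocationException {
        HTMLEditorKit htmlKit = new HTMLEditorKit();
        HTMLDocument htmlDoc = (HTMLDocument) htmlKit.createDefaultDocument();

        // Keep a reference to the callback so it can be flushed afterwards.
        HTMLEditorKit.ParserCallback callback = htmlDoc.getReader(0);
        try (Reader reader = new StringReader(html)) {
            // true = ignore charset directives; the input is already a String.
            new ParserDelegator().parse(reader, callback, true);
        }

        // Assumption: the reader buffers parsed elements until flush() is
        // called, so without this line a small document stays empty.
        callback.flush();
        return htmlDoc;
    }

    public static void main(String[] args) throws Exception {
        HTMLDocument doc = makeHTMLDocument(
                "<html><body><h1>Title</h1><p>Some <i>text</i>.</p></body></html>");
        doc.dump(System.out);
    }
}
```

If the dump now shows the heading and paragraph elements, the missing flush() was the culprit; if not, the parser's limited HTML support mentioned above remains a candidate.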

You could try the HTMLDocument.setOuterHTML method. Just add an arbitrary element and then replace it with your HTML string.

Source

2011-07-14 18:33:07 nfechner

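Just don't forget: 'For this to work correctly, the document must have an HTMLEditorKit.Parser set. This will be the case if the document was created from an HTMLEditorKit via the createDefaultDocument method.' – mouser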

Agree with mouser, but with a small correction:

Reader stringReader = new StringReader(string);
HTMLEditorKit htmlKit = new HTMLEditorKit();
HTMLDocument htmlDoc = (HTMLDocument) htmlKit.createDefaultDocument();
// read() parses the HTML and inserts it into htmlDoc at offset 0
htmlKit.read(stringReader, htmlDoc, 0);

Source

2011-07-15 07:17:44 StanislavL
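For anyone wanting to try the htmlKit.read variant end to end, here is a self-contained sketch. The class name HtmlDocumentLoader and the sample markup are hypothetical, and setting the "IgnoreCharsetDirective" property is an optional extra not mentioned in the answer; the assumption is that it keeps a charset declaration in a <meta> tag from interrupting the read, since the input is already a decoded String.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper class, named here only for illustration.
class HtmlDocumentLoader {

    /** Builds an HTMLDocument from a String of HTML via HTMLEditorKit.read. */
    static HTMLDocument makeHTMLDocument(String html)
            throws IOException, BadLocationException {
        HTMLEditorKit htmlKit = new HTMLEditorKit();
        HTMLDocument htmlDoc = (HTMLDocument) htmlKit.createDefaultDocument();

        // Optional (assumption): skip charset directives found in the markup.
        htmlDoc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);

        try (Reader stringReader = new StringReader(html)) {
            // Parses the HTML and inserts the result at offset 0.
            htmlKit.read(stringReader, htmlDoc, 0);
        }
        return htmlDoc;
    }

    public static void main(String[] args) throws Exception {
        HTMLDocument doc = makeHTMLDocument(
                "<html><body><h1>Title</h1><p>Some <i>text</i>.</p></body></html>");
        // Print the plain text held by the document, then its element tree.
        System.out.println(doc.getText(0, doc.getLength()));
        doc.dump(System.out);
    }
}
```

Letting the kit drive the parse keeps the caller free of parser and callback details, which is probably why this version is the shorter of the two.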
