2017-02-09 43 views
0

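Convert emoji to HTML decimal codes or Unicode hex codes in Java

I am trying to use Java to convert a text file containing emoji into a file in which each emoji is replaced by its HTML entity code or hex code. For example: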

I/P:<div id="thread" style="white-space: pre-wrap;"><div>⚽️

Expected O/P:<div id="thread" style="white-space: pre-wrap;"><div>😀😀😃🍎🍏⚽️🏀

In the output above, '😀' should be changed to the corresponding HTML entity code '&#128512;'.
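The mapping described above is just the Unicode code point of the character written in decimal (or hex). A minimal sketch of that relationship, assuming Java 8 or later; the class and method names (`CodePointDemo`, `decimalEntity`, `hexEntity`) are made up for illustration:

```java
import java.util.Locale;

final class CodePointDemo {
    /** Decimal numeric character reference for the first code point of s. */
    static String decimalEntity(String s) {
        // codePointAt combines a surrogate pair into one code point
        return "&#" + s.codePointAt(0) + ";";
    }

    /** Hexadecimal numeric character reference for the first code point of s. */
    static String hexEntity(String s) {
        String hex = Integer.toHexString(s.codePointAt(0)).toUpperCase(Locale.ROOT);
        return "&#x" + hex + ";";
    }

    public static void main(String[] args) {
        String grinning = "\uD83D\uDE00"; // U+1F600 stored as a surrogate pair
        System.out.println(decimalEntity(grinning)); // &#128512;
        System.out.println(hexEntity(grinning));     // &#x1F600;
    }
}
```

Note that `charAt(0)` would return only the high surrogate (55357) here, which is why the sketch uses `codePointAt`.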

The detailed HTML entity codes and hex codes are listed here: http://character-code.com/emoticons-html-codes.php

The sample code I tried is below:

try {
    File file = new File("/inFile.txt");
    str = FileUtils.readFileToString(file, "ISO-8859-1");
    System.out.println(new String(str.getBytes(), "UTF-8"));
    String results = StringEscapeUtils.escapeHtml4(str);
    System.out.println(results);
} catch (IOException e) {
    e.printStackTrace();
}
+1

So your code does something, you don't show us the code, and then you ask why the code doesn't work? *Really?!?!?* – Andreas

+0

Added the sample code I tried. –

+1

Are you sure the file is encoded in ISO-8859-1? That seems... unlikely. – dnault
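The concern in the comment above can be checked with a small sketch. It assumes the input file is actually UTF-8 (the usual encoding for text containing emoji, but not confirmed in the question); `DecodeDemo` and its methods are hypothetical names:

```java
import java.nio.charset.StandardCharsets;

final class DecodeDemo {
    /** Decodes the bytes as UTF-8. */
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Decodes the same bytes as ISO-8859-1: every byte becomes one char. */
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+1F600 is four bytes in UTF-8: F0 9F 98 80
        byte[] utf8 = "\uD83D\uDE00".getBytes(StandardCharsets.UTF_8);

        String right = decodeUtf8(utf8);
        String wrong = decodeLatin1(utf8);

        System.out.println(right.codePointCount(0, right.length())); // 1
        System.out.println(wrong.length());                          // 4
    }
}
```

Decoded as ISO-8859-1, the single emoji turns into four unrelated characters, so any later escaping step never sees the code point 128512.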

Answers

0
I found a workaround:
import java.io.File;
import java.io.OutputStreamWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

public static void htmlDecimalCodeGenerator() {
    DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
    domFactory.setValidating(false);
    File inputFile = new File("/inputFile.xml");

    try {
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        Document doc = builder.parse(inputFile);

        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer transformer = tf.newTransformer();

        /*
        If OMIT_XML_DECLARATION is not set, the following XML declaration
        is added at the beginning of the file:
        <?xml version='1.0' encoding='UTF-32'?>
        */
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        /*
        When the output method is "xml", the version value specifies the
        version of XML to be used for outputting the result tree. The default
        value for the xml output method is 1.0. When the output method is
        "html", the version value indicates the version of the HTML.
        The default value for the html output method is 4.0, which specifies
        that the result should be output as HTML conforming to the HTML 4.0
        Recommendation [HTML]. If the output method is "text", the version
        property is ignored.
        */
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");

        /*
        INDENT specifies whether the Transformer may add additional
        whitespace when outputting the result tree; the value must be
        yes or no.
        */
        transformer.setOutputProperty(OutputKeys.INDENT, "no");

        /*
        Declaring ISO-8859-1 as the output encoding is the point of the
        workaround: characters that this encoding cannot represent, such
        as emoji, are written as numeric character references.
        */
        transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");

        // transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

        // To write to a file instead of the console, wrap
        // new FileOutputStream(new File("/outputFile.xml")) in place of System.out.
        transformer.transform(new DOMSource(doc),
                new StreamResult(new OutputStreamWriter(System.out, "UTF-8")));
    } catch (Exception e) {
        e.printStackTrace();
    }
}

}
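If the input is not well-formed XML, the same replacement can be done directly on the text. The following is a minimal sketch rather than part of the workaround above; it assumes Java 8 or later, and `EmojiEscaper` and `toDecimalEntities` are made-up names. It treats every code point above 0x7F as something to escape, which is an assumption; a narrower range test could be used to target emoji only.

```java
final class EmojiEscaper {
    /**
     * Replaces every code point above 0x7F with a decimal numeric
     * character reference and leaves ASCII (including markup) untouched.
     */
    static String toDecimalEntities(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // codePoints() yields whole code points, so surrogate pairs stay together
        text.codePoints().forEach(cp -> {
            if (cp > 0x7F) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // Grinning face, soccer ball, variation selector-16
        String input = "<div>\uD83D\uDE00\u26BD\uFE0F</div>";
        System.out.println(toDecimalEntities(input));
        // prints <div>&#128512;&#9917;&#65039;</div>
    }
}
```

To apply it to a file, the text would first be decoded as UTF-8, for example with `new String(Files.readAllBytes(path), StandardCharsets.UTF_8)`, and the result written back out with any ASCII-compatible encoding.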