Read InputStream as UTF-8

I am trying to read a text/plain file over the internet, line by line. The code I have right now is:

URL url = new URL("http://kuehldesign.net/test.txt");
// No charset is passed here, so the platform default encoding is used
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
LinkedList<String> lines = new LinkedList<>();
String readLine;

while ((readLine = in.readLine()) != null) {
    lines.add(readLine);
}

for (String line : lines) {
    out.println("> " + line);
}

The file, test.txt, contains ¡Hélló!, which I am using to test the encoding.

When I look at the OutputStream (out), I see > ¬°H√©ll√≥!. I don't believe this is a problem with the OutputStream, since I can do out.println("é"); without any problem.

Any ideas for reading from the InputStream as UTF-8? Thanks!


2011-02-11 Chris Kuehl

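The HTTP protocol specifies the encoding. Why aren't you using a library API to handle it? You shouldn't be guessing at the encoding like this. I don't mean to be negative: you're doing fine! I just wonder whether there isn't an easier way. – tchrist 2011-02-11 01:25:51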

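Unfortunately, I won't have access to the server that serves the "text/plain" file, and it does not serve the file with a UTF-8 encoding. I'm not aware of any good network libraries; any suggestions? – 2011-02-11 01:39:19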

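Looking at the [documentation](http://download.oracle.com/javase/6/docs/api/java/net/URL.html), it looks as though you do have to specify the encoding yourself. I'm surprised they hand you a byte stream! You can get at the underlying [URLConnection](http://download.oracle.com/javase/6/docs/api/java/net/URLConnection.html), check the Content-Encoding on it, and then open the InputStreamReader with the correct argument. A quick look at the source didn't turn up anything that does this for you, which seems lame and error-prone, so I may be missing something. – tchrist 2011-02-11 01:48:29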
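
The suggestion in the comment above, asking the connection which encoding the server declared, could be sketched roughly as follows. Note that in HTTP the character set is normally carried as the `charset` parameter of the Content-Type header, whereas Content-Encoding describes transformations such as gzip. The class name `DeclaredCharset`, the helpers `charsetFrom` and `open`, and the UTF-8 fallback are assumptions made for illustration; they are not part of the original thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DeclaredCharset {

    // Hypothetical helper: extracts the charset parameter from a Content-Type
    // value such as "text/plain; charset=ISO-8859-1". Returns the fallback when
    // the header is missing, has no charset, or names an unknown charset.
    static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive check for a "charset=" prefix (8 characters)
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // illegal or unsupported charset name
                }
            }
        }
        return fallback;
    }

    // Opens the URL and decodes the body with the declared charset,
    // falling back to UTF-8 (an assumption) when none is declared.
    static BufferedReader open(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        Charset cs = charsetFrom(conn.getContentType(), StandardCharsets.UTF_8);
        return new BufferedReader(new InputStreamReader(conn.getInputStream(), cs));
    }
}
```

In the situation described in the question, where the server does not declare UTF-8, a helper like this would simply use whatever fallback the caller chooses, so the caller still has to know the real encoding of the file.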

146

Solved my own problem. This line:

BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));

needs to be:

BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));

or, since Java 7:

BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), StandardCharsets.UTF_8));
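
For completeness, here is a minimal sketch of the whole read loop with the charset passed explicitly and the reader closed via try-with-resources. The class name `Utf8Lines` and the helper `readLines` are hypothetical, and a `ByteArrayInputStream` stands in for `url.openStream()` so the example runs without network access.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class Utf8Lines {

    // Hypothetical helper: reads every line from the stream using the given charset.
    static List<String> readLines(InputStream in, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader and the wrapped stream
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for url.openStream(): the UTF-8 bytes of the question's test string,
        // written with Unicode escapes so the source file encoding does not matter
        byte[] utf8 = "\u00a1H\u00e9ll\u00f3!\nsecond line".getBytes(StandardCharsets.UTF_8);
        for (String line : readLines(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            System.out.println("> " + line);
        }
    }
}
```

Passing a `Charset` constant rather than the string "UTF-8" avoids the checked `UnsupportedEncodingException` that the string-based constructor declares.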


2011-02-11 01:18:41

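// Needs java.io.* and java.nio.charset.StandardCharsets; filename is defined elsewhere
StringBuilder file = new StringBuilder();

// try-with-resources closes the reader (and the underlying stream) automatically
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream(filename), StandardCharsets.UTF_8), 8192)) {
    String str;
    while ((str = br.readLine()) != null) {
        // readLine() strips the line terminator, so lines are joined without separators
        file.append(str);
    }
} catch (IOException e) {
    e.printStackTrace(); // do not swallow I/O errors silently
}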

Try this. :-)
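
For a local file like the one in this answer, a shorter alternative is possible with `java.nio.file.Files` (Java 7 and later; `String.join` needs Java 8). This is only a sketch: the class name `ReadLocalFile`, the helper `readUtf8`, and the temporary file are made up for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class ReadLocalFile {

    // Hypothetical helper: returns the whole file as one string, decoded as UTF-8.
    // Unlike a bare readLine() loop, line breaks are kept (normalized to "\n").
    static String readUtf8(Path path) throws IOException {
        List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);
        return String.join("\n", lines);
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-8 file to a temporary location, then read it back
        Path tmp = Files.createTempFile("utf8-demo", ".txt");
        Files.write(tmp, "\u00a1H\u00e9ll\u00f3!\nbye".getBytes(StandardCharsets.UTF_8));
        System.out.println(readUtf8(tmp));
        Files.delete(tmp);
    }
}
```

One difference worth knowing: `Files.readAllLines` throws a `MalformedInputException` when the bytes are not valid in the given charset, whereas `InputStreamReader` silently substitutes a replacement character.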


2013-06-17 10:42:10 Rohith

I ran into a case where every special character came out garbled. To solve this, I tried using the encoding ISO-8859-1:

BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream("txtPath"), "ISO-8859-1"));
String line;

while ((line = br.readLine()) != null) {
    // process each decoded line here
}

I hope this helps anyone who sees this post.
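
Whether UTF-8 or ISO-8859-1 is the right argument depends on how the bytes were actually written; ISO-8859-1 only helps when the file really is in that encoding. The sketch below illustrates the mismatch by encoding the question's test string with one charset and decoding it with another. The class name `CharsetMismatch` and the helper `recode` are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatch {

    // Encodes text with one charset and decodes the bytes with another,
    // which is what happens when a reader is given the wrong charset.
    static String recode(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        // The test string from the question, written with Unicode escapes
        String original = "\u00a1H\u00e9ll\u00f3!";
        // UTF-8 bytes read as ISO-8859-1: each accented character turns into two characters
        System.out.println(recode(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
        // Matching charsets round-trip cleanly
        System.out.println(recode(original, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1));
    }
}
```

The garbled output in the question follows the same pattern: the file's UTF-8 bytes were decoded with a different single-byte default charset, so each two-byte character showed up as two unrelated symbols.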


2018-03-02 14:32:04
