How to count occurrences of Polish characters in a .txt file

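I have to prepare a .txt file and count the occurrences of each alphabetic character in it. I found a very nice piece of code, but unfortunately it doesn't work for Polish characters such as ą, ę, ć, ó, ż, ź. Even though I put them in the array, for some reason they are not found in the .txt file, so the output is 0.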

Does anyone know why? Maybe I should count them in a different way, with a "Switch" or something similar. Before anyone asks: yes, the .txt file is saved as UTF-8 :)

public static void main(String[] args) throws FileNotFoundException {
    int ch;
    BufferedReader reader;
    try {
        int counter = 0;

        for (char a : "AĄĆĘÓBCDEFGHIJKLMNOPQRSTUVWXYZ".toCharArray()) {
            reader = new BufferedReader(new FileReader("C:\\Users\\User\\Desktop\\pan.txt"));
            char toSearch = a;
            counter = 0;

            try {
                while ((ch = reader.read()) != -1) {
                    if (a == Character.toUpperCase((char) ch)) {
                        counter++;
                    }
                }

            } catch (IOException e) {
                System.out.println("Error");
                e.printStackTrace();
            }
            System.out.println(toSearch + " occurs " + counter);

        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
}

Source

2017-05-28 Wojciech Miśta

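If the test file is UTF-8 encoded, why don't you read it as UTF-8 instead of using the platform's default character encoding? Have you done any basic debugging, such as printing (or inspecting in a debugger) each character you read, and printing (or inspecting in a debugger) its uppercase value?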
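A quick way to run the check suggested in this comment is to print the code points that come out of the decoder. The sketch below is only an illustration: the class name CharsetDebug and the helper describe are made up here, and ISO-8859-1 stands in for whatever non-UTF-8 default charset the platform might have. The Polish letter is written as a Unicode escape so the result does not depend on the encoding of the source file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDebug {
    /**
     * Encodes the text as UTF-8 (as the .txt file on disk would be), decodes
     * those bytes with the given charset, and lists the resulting chars.
     */
    static String describe(String text, Charset charset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        String decoded = new String(utf8Bytes, charset);
        StringBuilder sb = new StringBuilder();
        for (char c : decoded.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String aOgonek = "\u0105"; // the Polish letter a with ogonek
        System.out.println("read as UTF-8:      " + describe(aOgonek, StandardCharsets.UTF_8));
        System.out.println("read as ISO-8859-1: " + describe(aOgonek, StandardCharsets.ISO_8859_1));
    }
}
```

Decoded as UTF-8, the letter comes back as the single char U+0105. Decoded as ISO-8859-1, its two bytes (C4 85) turn into two unrelated chars, so a comparison against the letter can never match.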

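See [Counting the number of each character in a string](https://codereview.stackexchange.com/q/44186/88267) or [Count occurrences of each unique character](https://stackoverflow.com/q/4112111/5221149) for approaches that don't scan the whole file multiple times. – Andreas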
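A single-pass count in the spirit of that comment might look like the sketch below. It is not taken from either linked post; the class name SinglePassCounter and the method countLetters are hypothetical, and the input is a StringReader only to keep the example self-contained (a UTF-8 file reader would be passed in the same way).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;

class SinglePassCounter {
    /** Reads the input once and counts every letter, ignoring case. */
    static Map<Character, Integer> countLetters(Reader reader) throws IOException {
        Map<Character, Integer> counts = new TreeMap<>();
        int ch;
        while ((ch = reader.read()) != -1) {
            char upper = Character.toUpperCase((char) ch);
            if (Character.isLetter(upper)) {
                // merge() inserts 1 for a new key, otherwise adds 1 to the old count
                counts.merge(upper, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Sample text containing Polish letters, written with Unicode escapes
        String sample = "Za\u017C\u00F3\u0142\u0107 g\u0119\u015Bl\u0105 ja\u017A\u0144";
        for (Map.Entry<Character, Integer> e : countLetters(new StringReader(sample)).entrySet()) {
            System.out.println(e.getKey() + " occurs " + e.getValue());
        }
    }
}
```

The map is a TreeMap so the output comes out in code-point order; a HashMap would work equally well if ordering does not matter.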

@JBNizet Short version of the answer: the teacher told us to do it this way, and I guess she didn't expect it not to work. Aaaaand nope, but using "InputStreamReader" helped.

It looks like your problem is related to encoding and the system's default charset.

Try changing the reader variable to this:

InputStreamReader reader = new InputStreamReader(new FileInputStream("C:\\Users\\User\\Desktop\\pan.txt"), "UTF-8");
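Put together with the loop from the question, that change could look roughly like the following. This is a sketch under a few assumptions: the class name Utf8LetterCount and the method countLetter are invented for illustration, the file path is taken from the command line instead of being hard-coded, and StandardCharsets.UTF_8 is used in place of the "UTF-8" string (both select the same charset).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class Utf8LetterCount {
    /** Counts how often the given uppercase letter occurs in a UTF-8 file, ignoring case. */
    static int countLetter(Path file, char letter) throws IOException {
        // The explicit charset is the actual fix; try-with-resources closes the reader
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8))) {
            int counter = 0;
            int ch;
            while ((ch = reader.read()) != -1) {
                if (letter == Character.toUpperCase((char) ch)) {
                    counter++;
                }
            }
            return counter;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Utf8LetterCount <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        // Same alphabet as in the question, with the Polish letters as Unicode escapes
        String alphabet = "A\u0104\u0106\u0118\u00D3BCDEFGHIJKLMNOPQRSTUVWXYZ";
        for (char letter : alphabet.toCharArray()) {
            System.out.println(letter + " occurs " + countLetter(file, letter));
        }
    }
}
```

Like the original, this reopens the file once per letter, which is fine for a small exercise file but wasteful for large inputs.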

Source

2017-05-28 14:48:20 Neonailol

Thanks! It works!

Try this: I suggest using NIO. I rewrote your code with NIO, RandomAccessFile and MappedByteBuffer, which is fast:

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class FileReadNio
{
    public static void main(String[] args) throws IOException
    {
        Map<Character, Integer> charCountMap = new HashMap<>();

        // try-with-resources closes the channel and the file, even on error
        try (RandomAccessFile rndFile = new RandomAccessFile("c:\\test123.txt", "r");
             FileChannel inChannel = rndFile.getChannel())
        {
            MappedByteBuffer buffer = inChannel.map(FileChannel.MapMode.READ_ONLY, 0, inChannel.size());
            buffer.load();

            // Decode the mapped bytes as UTF-8 before counting. Casting each
            // byte to char would split multi-byte characters such as ą or ę.
            CharBuffer chars = StandardCharsets.UTF_8.decode(buffer);
            while (chars.hasRemaining())
            {
                // Insert 1 for a new character, otherwise add 1 to its count
                charCountMap.merge(chars.get(), 1, Integer::sum);
            }
        }

        for (Map.Entry<Character, Integer> characterIntegerEntry : charCountMap.entrySet())
        {
            System.out.printf("char: %s :: count=%d", characterIntegerEntry.getKey(), characterIntegerEntry.getValue());
            System.out.println();
        }
    }
}

Source

2017-05-28 15:19:12
