2017-07-31 30 views
-1

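I need help writing code that splits lines into words so that I can then run a spell check on them. How do I split the lines read by a BufferedReader into words?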

public static void main(String [] args) throws IOException { 
    Stem myStem = new Stem(); 

    BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(new FileInputStream("C:\\Users\\lamrh\\IdeaProjects\\untitled1\\src\\bigON\\data.txt"))); 

    //String currentWord = String.valueOf(bufferedReader.readLine()); 
    Scanner scanner = new Scanner(bufferedReader.readLine()); 
    //byte[] data = new byte [currentWord.length()]; 
    String[] splitLines; 
    //splitLines = splitLines.split(" "); 


    String line; 
    while((line = bufferedReader.readLine()) !=null ){ 
     //splitLines = line.split(" "); 
     String currentWord1 = formatWordGhizou (line); 
     System.out.println(""+ line+""+ ":"+ currentWord1); 

    } 
    bufferedReader.close(); 


} 

The output I get is:

سْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم 

سْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم ِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم ِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم ِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم ِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم ِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم ِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم

It should print word by word, not line by line. Any help is appreciated, thanks.

+0

Could you provide the source of the function `formatWordGhizou()`? –

+0

[Why is "Can someone help me?" not an actual question?](http://meta.stackoverflow.com/q/284236) – EJoshuaS

+0

The question is whether there is any way to split the lines that the BufferedReader has already read into words –

Answers

-1
// format the word by removing any punctuation, diacritics and non-letter characters 
private static String formatWordGhizou (String currentWord) 
{ 
    StringBuffer modifiedWord = new StringBuffer (); 


    // remove any diacritics (short vowels) 
    if (removeDiacritics(currentWord, modifiedWord)) 
    { 
     currentWord = modifiedWord.toString (); 
    } 

    // remove any punctuation from the word 
    if (removePunctuation(currentWord, modifiedWord)) 
    { 
     currentWord = modifiedWord.toString () ; 
    } 

    // there could also be characters that aren't letters which should be removed 
    if (removeNonLetter (currentWord, modifiedWord)) 
    { 
     currentWord = modifiedWord.toString (); 
    } 

    // check for strange words 
    if(!checkStrangeWords (currentWord)) 
     // check for stopwords 
     if(!checkStopwords (currentWord)) 
      currentWord = stemWord (currentWord); 

    return currentWord; 
} 

//----------------- 
0

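Inside the while loop, concatenate the lines into a single string, then use a regular expression to fill the string array splitLines, and finally iterate over splitLines and print each element to standard output, as shown below (adapted from a helpful tutorial at this link):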

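StringBuilder lines = new StringBuilder(); 

while ((line = bufferedReader.readLine()) != null) { 
    // append a space so the last word of one line is not fused with the first word of the next 
    lines.append(line).append(' '); 
} 

// split on runs of whitespace; trim() avoids an empty first element 
String[] splitLines = lines.toString().trim().split("\\s+"); 

for (String word : splitLines) { 
    System.out.println(word); 
} 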
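
If the file is large, or each word needs to be processed as soon as it is read, the split can also be done per line instead of on one big concatenated string. The following is only a minimal sketch of that idea: the class name `WordSplitter`, the method `splitWords`, and the in-memory sample text are assumptions made for illustration and are not part of the original question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class WordSplitter {

    // Hypothetical helper: reads every line from the reader and returns the words in order.
    public static List<String> splitWords(BufferedReader reader) throws IOException {
        List<String> words = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            // trim() avoids an empty first token when the line starts with whitespace
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) { // a blank line yields a single empty token
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample text stands in for data.txt (an assumption for this demo).
        String sample = "first second\n\n  third   fourth  \n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            for (String word : splitWords(reader)) {
                // In the question's code, formatWordGhizou(word) would be called here.
                System.out.println(word);
            }
        }
    }
}
```

With this shape, each word can be passed to `formatWordGhizou` individually rather than the whole line. Note also that the `new Scanner(bufferedReader.readLine())` call in the question consumes the first line of the file before the loop starts, so that line never reaches the loop.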