2015-11-01 32 views
0

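Need to find duplicates in a text file by comparing the 1st and 5th strings from every line

As part of a project I'm working on, I'd like to clean up a file I generate by removing duplicate line entries. However, these duplicates often do not occur near each other. I came up with a method in Java that basically finds the duplicates in the file: I store the two strings in two ArrayLists and iterate over them. It does not work, though, because the nested for loops enter the duplicate condition many times.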

I need an integrated solution for this, preferably in Java. Any ideas?

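import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.StringTokenizer;

public class duplicates { 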
     static BufferedReader reader = null; 
     static BufferedWriter writer = null; 
     static String currentLine; 

     public static void main(String[] args) throws IOException { 
      int count=0,linecount=0; 
      String fe = null,fie = null,pe=null; 
      File file = new File("E:\\Book.txt"); 

      ArrayList<String> list1=new ArrayList<String>(); 
      ArrayList<String> list2=new ArrayList<String>(); 

      reader = new BufferedReader(new FileReader(file)); 

      while((currentLine = reader.readLine()) != null) 
      { 
       StringTokenizer st = new StringTokenizer(currentLine,"/"); //splits data into strings 
       while (st.hasMoreElements()) { 
        count++; 
        fe=(String) st.nextElement(); 
        //System.out.print(fe+"/// "); 

        //System.out.println("count="+count); 
        if(count==1){           //stores 1st string 
         pe=fe; 
         // System.out.println("first element "+fe); 
        } 
        else if(count==5){ 
         fie=fe;            //stores 5th string 
         // System.out.println("fifth element "+fie); 
        } 
       } 
       count=0; 

       if(linecount>0){ 
        for(String s1:list1) 
        { 
         for(String s2:list2){ 
          if(pe.equals(s1)&&fie.equals(s2)){        //checking condition 
           System.out.println("duplicate found"); 
           //System.out.println(s1+ " "+s2); 
          }   
         } 
        } 
       }      
       list1.add(pe); 
       list2.add(fie); 
       linecount++; 
      } 
     } 
    } 

Input:

/book1/_cwc/B737/customer/Special_Reports/ 
/Airbook/_cwc/A330-200/customer/02_Watchlists/ 
/book1/_cwc/B737/customer/Special_Reports/ 
/jangeer/_cwc/Crj_200/customer/plots/ 
/Airbook/_cwc/A330-200/customer/02_Watchlists/ 
/jangeer/_cwc/Crj_200/customer/06_Performance_Summaries/ 
/jangeer/_cwc/Crj_200/customer/02_Watchlists/ 
/jangeer/_cwc/Crj_200/customer/01_Highlights/ 
/jangeer/_cwc/ERJ170/customer/01_Highlights/ 

Output:

/book1/_cwc/B737/customer/Special_Reports/ 
/Airbook/_cwc/A330-200/customer/02_Watchlists/ 
/jangeer/_cwc/Crj_200/customer/plots/ 
/jangeer/_cwc/Crj_200/customer/06_Performance_Summaries/ 
/jangeer/_cwc/Crj_200/customer/02_Watchlists/ 
/jangeer/_cwc/Crj_200/customer/01_Highlights/ 
+0

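What is not working? I see that the duplicate '/Airbook/...' from the input was eliminated in the output. –

+0

Here I'm not trying to remove duplicates based on a single string; I'm checking the combination of the 1st and 5th strings, i.e. the pair (1st string, 5th string) should not be duplicated. Example: for the first line, "book1" is the 1st string and "Special_Reports" is the 5th string, so the third line should be removed. – kish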

Answers

1

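Use a Set<String> instead of an ArrayList<String>.

A Set does not allow duplicates, so if you simply add every line to it and then read the elements back out, you will have all the distinct strings.

Performance-wise, it is also faster than your nested for loops.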
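
To make the Set idea fit the question's actual requirement (the 1st/5th pair rather than the whole line), one option is to store a composite key per line. Note that the nested loops in the question test `pe` against every entry of `list1` and `fie` against every entry of `list2` independently, so a line can be flagged even when no single earlier line contains both values. The following is only a sketch, not the answerer's code: `FieldKeyDedup`, `keyOf`, and `dedupe` are hypothetical names, and keying short lines by their full text is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.StringTokenizer;

class FieldKeyDedup {

    // Builds the comparison key from the 1st and 5th "/"-separated tokens.
    // Assumption: a line with fewer than five tokens is keyed by its full text.
    static String keyOf(String line) {
        StringTokenizer st = new StringTokenizer(line, "/");
        String first = null;
        String fifth = null;
        for (int i = 1; st.hasMoreTokens(); i++) {
            String token = st.nextToken();
            if (i == 1) first = token;
            if (i == 5) fifth = token;
        }
        if (first == null || fifth == null) {
            return line;
        }
        // "\n" cannot occur inside a single line, so the joined key is unambiguous.
        return first + "\n" + fifth;
    }

    // Keeps the first line seen for each key; later lines with the same key are dropped.
    static List<String> dedupe(List<String> lines) {
        Set<String> seen = new HashSet<String>();
        List<String> kept = new ArrayList<String>();
        for (String line : lines) {
            if (seen.add(keyOf(line))) { // add() returns false when the key is already present
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "/book1/_cwc/B737/customer/Special_Reports/",
            "/Airbook/_cwc/A330-200/customer/02_Watchlists/",
            "/book1/_cwc/B737/customer/Special_Reports/",
            "/jangeer/_cwc/Crj_200/customer/plots/",
            "/Airbook/_cwc/A330-200/customer/02_Watchlists/",
            "/jangeer/_cwc/Crj_200/customer/06_Performance_Summaries/",
            "/jangeer/_cwc/Crj_200/customer/02_Watchlists/",
            "/jangeer/_cwc/Crj_200/customer/01_Highlights/",
            "/jangeer/_cwc/ERJ170/customer/01_Highlights/");
        for (String line : dedupe(input)) {
            System.out.println(line);
        }
    }
}
```

Each line costs one hash lookup instead of a scan over everything read so far, and the separate list keeps the surviving lines in file order. With the sample input, the last line is dropped because it shares "jangeer" and "01_Highlights" with the line before it.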

+0

You should use a SortedSet, because a plain Set does not guarantee that the original order is preserved. – Sneh

+0

@Sneh - Does the order matter? We are only removing dups ... –

+1

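But the output given in the question is in order, which is why I suggested it. You could point out that the order will be lost. – Sneh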
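
For anyone following the ordering discussion, here is a small sketch of how two standard Set implementations behave. A `TreeSet` (a `SortedSet`) returns elements in sorted order, which is not necessarily the order of the file, while a `LinkedHashSet` returns them in insertion order. The class and method names (`SetOrderDemo`, `insertionOrder`, `sortedOrder`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

class SetOrderDemo {

    // LinkedHashSet iterates in insertion order: the first occurrence of each line wins.
    static List<String> insertionOrder(List<String> lines) {
        return new ArrayList<String>(new LinkedHashSet<String>(lines));
    }

    // TreeSet iterates in natural (lexicographic) String order, not file order.
    static List<String> sortedOrder(List<String> lines) {
        return new ArrayList<String>(new TreeSet<String>(lines));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("/book1/", "/Airbook/", "/book1/", "/jangeer/");
        System.out.println(insertionOrder(lines)); // [/book1/, /Airbook/, /jangeer/]
        System.out.println(sortedOrder(lines));    // [/Airbook/, /book1/, /jangeer/]
    }
}
```

A plain `HashSet` makes no promise about iteration order at all, so if the output must follow the input order, `LinkedHashSet` is probably the closest fit.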

1
public static void removeDups() { 
     String[] input = new String[] { // Let's say the whole file has been read into this String array
       "/book1/_cwc/B737/customer/Special_Reports/", 
       "/Airbook/_cwc/A330-200/customer/02_Watchlists/", 
       "/book1/_cwc/B737/customer/Special_Reports/", 
       "/jangeer/_cwc/Crj_200/customer/plots/", 
       "/Airbook/_cwc/A330-200/customer/02_Watchlists/", 
       "/jangeer/_cwc/Crj_200/customer/06_Performance_Summaries/", 
       "/jangeer/_cwc/Crj_200/customer/02_Watchlists/", 
       "/jangeer/_cwc/Crj_200/customer/01_Highlights/", 
       "/jangeer/_cwc/ERJ170/customer/01_Highlights/" 
     }; 
     ArrayList<String> outPut = new ArrayList<>(); // Holds the output, i.e. the distinct lines.
     Arrays.stream(input).distinct().forEach(outPut::add); // Java 8: distinct() keeps the first occurrence of each line, in encounter order.
     outPut.forEach(System.out::println); // Printed here only as an example; write the output back to the file using your own implementation.
    } 

The output when I ran this method is:

/book1/_cwc/B737/customer/Special_Reports/ 
/Airbook/_cwc/A330-200/customer/02_Watchlists/ 
/jangeer/_cwc/Crj_200/customer/plots/ 
/jangeer/_cwc/Crj_200/customer/06_Performance_Summaries/ 
/jangeer/_cwc/Crj_200/customer/02_Watchlists/ 
/jangeer/_cwc/Crj_200/customer/01_Highlights/ 
/jangeer/_cwc/ERJ170/customer/01_Highlights/ 

Edit

Answer without Java 8:

public static void removeDups() { 
     String[] input = new String[] { 
       "/book1/_cwc/B737/customer/Special_Reports/", 
       "/Airbook/_cwc/A330-200/customer/02_Watchlists/", 
       "/book1/_cwc/B737/customer/Special_Reports/", 
       "/jangeer/_cwc/Crj_200/customer/plots/", 
       "/Airbook/_cwc/A330-200/customer/02_Watchlists/", 
       "/jangeer/_cwc/Crj_200/customer/06_Performance_Summaries/", 
       "/jangeer/_cwc/Crj_200/customer/02_Watchlists/", 
       "/jangeer/_cwc/Crj_200/customer/01_Highlights/", 
       "/jangeer/_cwc/ERJ170/customer/01_Highlights/" 
     }; 

     LinkedHashSet<String> output = new LinkedHashSet<String>(Arrays.asList(input)); //output is your set of unique strings in preserved order 

    } 
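
Both snippets above start from a hard-coded array and leave the file handling to the reader. Below is a rough sketch of that missing step, combining the `LinkedHashSet` approach with `java.nio.file.Files` (available since Java 7). `FileDedup` and `removeDuplicateLines` are hypothetical names, and UTF-8 encoding is an assumption about the file. Like the answer, it compares whole lines, not the 1st/5th pair.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class FileDedup {

    // Reads every line of 'in', drops exact duplicates while keeping first-seen order,
    // writes the result to 'out', and returns how many lines were removed.
    static int removeDuplicateLines(Path in, Path out) throws IOException {
        List<String> lines = Files.readAllLines(in, StandardCharsets.UTF_8); // assumes UTF-8
        Set<String> unique = new LinkedHashSet<String>(lines);
        Files.write(out, unique, StandardCharsets.UTF_8);
        return lines.size() - unique.size();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java FileDedup <input-file> <output-file>");
            return;
        }
        int removed = removeDuplicateLines(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Removed " + removed + " duplicate line(s)");
    }
}
```

`readAllLines` loads the whole file into memory, which should be fine for a file of path entries like the one shown but may not suit very large inputs.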
+0

Can you provide a non-Java 8 answer? –
