2013-08-16 121 views

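Reading data from an API

I wrote a function that reads some data from an external API. It reads a file from disk and calls the API once for each line. I want to optimize my code for large files (35,000 records). Could you suggest how to do that?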

Here is my code.

public void readCSVFile() {
    try {
        br = new BufferedReader(new FileReader(getFileName()));

        while ((line = br.readLine()) != null) {
            String[] splitLine = line.split(cvsSplitBy);

            String campaign = splitLine[0];
            String adGroup = splitLine[1];
            String url = splitLine[2];
            long searchCount = getSearchCount(url);

            StringBuilder sb = new StringBuilder();
            sb.append(campaign + ",");
            sb.append(adGroup + ",");
            sb.append(searchCount + ",");
            writeToFile(sb, getNewFileName());
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

private long getSearchCount(String url) {
    long recordCount = 0;
    try {
        DefaultHttpClient httpClient = new DefaultHttpClient();

        HttpGet getRequest = new HttpGet(
                "api.com/querysearch?q="
                        + url);
        getRequest.addHeader("accept", "application/json");

        HttpResponse response = httpClient.execute(getRequest);

        if (response.getStatusLine().getStatusCode() != 200) {
            throw new RuntimeException("Failed : HTTP error code : "
                    + response.getStatusLine().getStatusCode());
        }

        BufferedReader br = new BufferedReader(new InputStreamReader(
                (response.getEntity().getContent())));

        String output;

        while ((output = br.readLine()) != null) {
            try {
                JSONObject json = (JSONObject) new JSONParser()
                        .parse(output);
                JSONObject result = (JSONObject) json.get("result");
                recordCount = (long) result.get("count");
                System.out.println(url + "=" + recordCount);
            } catch (Exception e) {
                System.out.println(e.getMessage());
            }
        }

        httpClient.getConnectionManager().shutdown();

    } catch (Exception e) {
        e.printStackTrace();
    }
    return recordCount;
}

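Your bottleneck is almost certainly going to be the HTTP calls, so that is what I would optimize. If possible, avoid closing the connection after every request, or fetch the results in bulk.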
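To illustrate the "do not close the connection" idea, here is a minimal sketch that creates one HTTP client for the whole run instead of one per line. It is only a sketch under assumptions: it uses the JDK's built-in java.net.http.HttpClient (Java 11+) rather than the Apache DefaultHttpClient from the question, the class name SearchClient and its methods are hypothetical, and the base URL passed to it is a placeholder. It also percent-encodes the q parameter, which the original code does not do.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: one instance is created once and shared by all lookups.
class SearchClient {
    // A single client for the whole run, so connections to the host can be reused.
    private final HttpClient client = HttpClient.newHttpClient();
    // Assumed to be a full URL with scheme, e.g. "http://api.com/querysearch".
    private final String baseUrl;

    SearchClient(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    // Builds the GET request; the value read from the file is percent-encoded.
    HttpRequest buildRequest(String url) {
        String q = URLEncoder.encode(url, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "?q=" + q))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Sends the request and returns the raw JSON body; parsing is left to the caller.
    String fetchBody(String url) throws IOException, InterruptedException {
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("HTTP error code: " + response.statusCode());
        }
        return response.body();
    }
}
```

With something like this, getSearchCount would call fetchBody on a shared SearchClient instead of constructing and shutting down a client for every line.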

That is the problem: I have to call this API with a GET parameter taken from the file. – Duleendra

Answers

1

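Because remote calls are much slower than local disk access, you need to parallelize or batch the remote calls in some way. If you cannot batch calls to the remote API but it allows multiple concurrent reads, you may want to use something like a thread pool to make the remote calls: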

public void readCSVFile() {
    // exception handling ignored for space
    br = new BufferedReader(new FileReader(getFileName()));
    List<Future<String>> futures = new ArrayList<Future<String>>();
    ExecutorService pool = Executors.newFixedThreadPool(5);

    // submit one remote lookup per line; at most 5 run at the same time
    while ((line = br.readLine()) != null) {
        final String[] splitLine = line.split(cvsSplitBy);
        futures.add(pool.submit(new Callable<String>() {
            public String call() {
                long searchCount = getSearchCount(splitLine[2]);
                return new StringBuilder()
                        .append(splitLine[0] + ",")
                        .append(splitLine[1] + ",")
                        .append(searchCount + ",")
                        .toString();
            }
        }));
    }

    // collect the results in submission order, so output lines match input lines
    for (Future<String> fs : futures) {
        writeToFile(fs.get(), getNewFileName());
    }

    pool.shutdown();
}

Ideally, you'd really want to do a single batch read from the remote API, if possible.
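For anyone who wants to try the thread-pool approach without the rest of the class, here is a self-contained sketch of the same pattern. The names ParallelLookup and process are hypothetical, and the remote call is passed in as a function so that a stub can stand in for getSearchCount; exception handling is reduced to rethrowing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

class ParallelLookup {
    // Runs one lookup per row on a fixed-size pool and returns the
    // output lines in the same order as the input rows.
    static List<String> process(List<String> rows, ToLongFunction<String> lookup, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String row : rows) {
                String[] cols = row.split(",");
                // The lambda is a Callable<String>; cols[2] is the URL column.
                futures.add(pool.submit(() ->
                        cols[0] + "," + cols[1] + "," + lookup.applyAsLong(cols[2]) + ","));
            }
            List<String> out = new ArrayList<>();
            for (Future<String> f : futures) {
                out.add(f.get()); // blocks until that row's lookup has finished
            }
            return out;
        } finally {
            pool.shutdown(); // always release the worker threads
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> rows = List.of("c1,g1,http://a.example", "c2,g2,http://bb.example");
        // String::length is only a stand-in for the real remote call.
        System.out.println(process(rows, String::length, 5));
    }
}
```

Reading the futures back in submission order keeps the output file aligned with the input file even though the lookups finish in arbitrary order. The pool size of 5 is just a starting point; the right value depends on how many concurrent requests the API tolerates.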

Thanks for the suggestion. By the way, I cannot do a single batch read, but multiple concurrent reads are allowed. – Duleendra

Hi DPM, I tried your solution and it works. :) – Duleendra