2017-08-13 48 views

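Writing files using ExecutorService and Callable in Java does not work

I am writing a routine that reads a list of URLs from a file, fetches the content of each URL with JSoup, looks for certain patterns, and writes the results to output files (one file per analyzed URL).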

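I have a WebPageAnalyzerTask (which implements Callable). For now it returns null, but it will eventually return an object holding the processing results (still to do):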

public WebPageAnalyzerTask(String targetUrl, Pattern searchPattern) { 
    this.targetUrl = targetUrl; 
    this.searchPattern = searchPattern; 
} 

@Override 
public WebPageAnalysisTaskResult call() throws Exception { 
    long startTime = System.nanoTime(); 
    String htmlContent = this.getHtmlContentFromUrl(); 
    List<String> resultContent = this.getAnalysisResults(htmlContent); 

    try (BufferedWriter bw = Files.newBufferedWriter(Paths.get("c:/output", UUID.randomUUID().toString() + ".txt"), 
      StandardCharsets.UTF_8, StandardOpenOption.WRITE)) { 
     bw.write(parseListToLine(resultContent)); 
    } 

    long endTime = System.nanoTime(); 
    return null; 
} 

I write the file using NIO and try-with-resources.

The code that uses the task is as follows:

/** 
* Starts the analysis of the Web Pages retrieved from the input text file using the provided pattern. 
*/ 
public void startAnalysis() { 
    List<String> urlsToBeProcessed = null; 

    try (Stream<String> stream = Files.lines(Paths.get(this.inputPath))) { 

     urlsToBeProcessed = stream.collect(Collectors.toList()); 

     if (urlsToBeProcessed != null && urlsToBeProcessed.size() > 0) { 
      List<Callable<WebPageAnalysisTaskResult>> pageAnalysisTasks = this 
        .buildPageAnalysisTasksList(urlsToBeProcessed); 
      ExecutorService executor = Executors.newFixedThreadPool(THREAD_POOL_SIZE); 
      List<Future<WebPageAnalysisTaskResult>> results = executor.invokeAll(pageAnalysisTasks); 
      executor.shutdown(); 
     } else { 
      throw new NoContentToProcessException(); 
     } 

    } catch (Exception e) { 
     e.printStackTrace(); 
    } 
} 

/** 
* Builds a list of tasks in which each task will be filled with data required for the analysis processing. 
* @param urlsToBeProcessed The list of URLs to be processed. 
* @return A list of tasks that must be handled by an executor service for asynchronous processing. 
*/ 
private List<Callable<WebPageAnalysisTaskResult>> buildPageAnalysisTasksList(List<String> urlsToBeProcessed) { 
    List<Callable<WebPageAnalysisTaskResult>> tasks = new ArrayList<>(); 
    UrlValidator urlValidator = new UrlValidator(ALLOWED_URL_SCHEMES); 

    urlsToBeProcessed.forEach(urlAddress -> { 
     if (urlValidator.isValid(urlAddress)) { 
      tasks.add(new WebPageAnalyzerTask(urlAddress, this.targetPattern)); 
     } 
    }); 

    return tasks; 
} 

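The file holding the list of URLs is read once. A task is created for each URL, and the ExecutorService runs the tasks, which analyze the pages and write the result files asynchronously.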

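Right now the input file is read, and the HTML content of each URL is analyzed and stored in a string. However, the task does not write the output file, so I wonder what could be going wrong there.

Can someone tell me whether I am missing something?

Thanks in advance.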

You are writing the file with 'java.io.BufferedWriter', not NIO. – EJP

Answers

1

Maybe you are getting an exception in the following try block:

try (BufferedWriter bw = Files.newBufferedWriter(Paths.get("c:/output", UUID.randomUUID().toString() + ".txt"), 
     StandardCharsets.UTF_8, StandardOpenOption.WRITE)) { 
    bw.write(parseListToLine(resultContent)); 
} 

Try adding a catch block to it and printing the exception, so that if something does go wrong you can see the cause:

catch (IOException e) { 
    // Replace with logger or some kind of error handling in production code 
    e.printStackTrace(); 
} 

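My bad! I completely forgot the catch block. After adding it, I found that an IOException is thrown when trying to write the file into a directory that does not exist. Thanks! –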
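Following up on the cause found in the comment above, here is a minimal sketch of one way to make the write robust. It assumes the output directory may be missing and creates it first. The class name ResultFileWriter and the method writeResult are hypothetical and not part of the original code. Note also that StandardOpenOption.WRITE on its own opens an existing file and does not create one, so the sketch passes CREATE_NEW as well.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.UUID;

class ResultFileWriter {

    // Hypothetical helper: writes one line of text to a new, uniquely named file.
    static Path writeResult(Path outputDir, String line) throws IOException {
        // Creates the directory and any missing parents; does nothing if it already exists.
        Files.createDirectories(outputDir);

        Path target = outputDir.resolve(UUID.randomUUID() + ".txt");

        // CREATE_NEW creates the file; WRITE alone would only open a file that already exists.
        try (BufferedWriter bw = Files.newBufferedWriter(target, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            bw.write(line);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for "c:/output" so the sketch runs anywhere.
        Path base = Files.createTempDirectory("analysis");
        Path written = writeResult(base.resolve("output"), "example result line");
        System.out.println("Wrote " + written);
    }
}
```

Inside call(), the same two steps (createDirectories, then a writer opened with a create option) would replace the original try block. Whether the directory should be created per task or once before the tasks are submitted is a design choice; doing it once up front avoids repeating the check in every task.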

1

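Because the error occurs while the task runs inside the call() method of the WebPageAnalyzerTask class, you should check the results returned by List<Future<WebPageAnalysisTaskResult>> results = executor.invokeAll(pageAnalysisTasks); to find out what went wrong while the tasks were running.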

for (Future<WebPageAnalysisTaskResult> future : results) { 
    try { 
        // get() rethrows an exception thrown inside call(), wrapped in an ExecutionException 
        future.get(); 
    } catch (InterruptedException e) { 
        e.printStackTrace(); 
    } catch (ExecutionException e) { 
        e.printStackTrace(); 
    } 
}
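
To illustrate why nothing was printed in the original code, here is a small self-contained sketch. An exception thrown inside call() is stored in the Future and only surfaces when get() is called. The class FutureFailureDemo, the method collectFailures and the two sample tasks are made up for the demonstration and are not part of the question's code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureFailureDemo {

    // Runs all tasks and returns one description per task that failed.
    static List<String> collectFailures(List<Callable<String>> tasks) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        List<String> failures = new ArrayList<>();
        try {
            // invokeAll blocks until every task has completed, normally or by throwing.
            List<Future<String>> results = executor.invokeAll(tasks);
            for (Future<String> future : results) {
                try {
                    future.get();
                } catch (ExecutionException e) {
                    // getCause() is the exception that was thrown inside call().
                    failures.add(e.getCause().toString());
                }
            }
        } finally {
            executor.shutdown();
        }
        return failures;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(() -> "ok");
        // Simulates the failed write reported in the question.
        tasks.add(() -> { throw new IOException("output directory does not exist"); });

        collectFailures(tasks).forEach(System.out::println);
    }
}
```

Without the get() call the second task fails silently, which matches the behavior described in the question: invokeAll itself returns normally even when individual tasks throw.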