2014-03-24

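Extracting data with jsoup in Java

I am trying to run this code, but my program fails with a NullPointerException. I have used try and catch, but I don't know how to get rid of the problem. Here is the code: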

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import java.net.*;
import java.io.*;
import java.lang.NullPointerException;

public class WikiScraper {

    public static void main(String[] args) throws IOException
    {
        scrapeTopic("/wiki/Python");
    }

    public static void scrapeTopic(String url){
        String html = getUrl("http://www.wikipedia.org/"+url);
        Document doc = Jsoup.parse(html);

        String contentText = doc.select("#mw-content-text>p").first().text();
        System.out.println(contentText);
        System.out.println("The url was malformed!");
    }

    public static String getUrl(String url){
        URL urlObj = null;
        try{
            urlObj = new URL(url);
        }
        catch(MalformedURLException e){
            System.out.println("The url was malformed!");
            return "";
        }
        URLConnection urlCon = null;
        BufferedReader in = null;
        String outputText = "";
        try{
            urlCon = urlObj.openConnection();
            in = new BufferedReader(new InputStreamReader(urlCon.getInputStream()));
            String line = "";
            while((line = in.readLine()) != null){
                outputText += line;
            }
            in.close();
        }catch(IOException e){
            System.out.println("There was an error connecting to the URL");
            return "";
        }
        return outputText;
    }
}

The error shown is:

There was an error connecting to the URL 
Exception in thread "main" java.lang.NullPointerException 
    at hello.WikiScraper.scrapeTopic(WikiScraper.java:17) 
    at hello.WikiScraper.main(WikiScraper.java:11) 

Please remove the '<br>' tags from your code and indent each line by 4 spaces so that it is formatted correctly.

Answers

You have

public static String getUrl(String url){ 
    // ... 
    return ""; 
} 

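which ends up returning an empty string here, because the connection fails (note the "There was an error connecting to the URL" line in your output). Parsing an empty string gives a document with no #mw-content-text element, so select(...).first() returns null, and calling text() on that null throws the NullPointerException.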

Try

Document doc = Jsoup.connect("http://example.com/").get(); 

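for example. Jsoup.connect(...).get() downloads and parses the page in one step and throws an IOException if the request fails, so the failure is not hidden behind an empty string.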
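If you would rather keep your own download helper, here is a minimal sketch of one that lets the IOException propagate instead of returning "". The names PageFetcher, fetch and readAll are hypothetical and not part of your code or of jsoup. It uses only the standard library and assumes the page is UTF-8. The caller can then tell "download failed" apart from "page had no matching paragraph".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, shown only as an illustration.
class PageFetcher {

    // Reads everything from the reader, appending a line break after each line.
    // try-with-resources closes the reader even if reading fails.
    static String readAll(Reader reader) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader in = new BufferedReader(reader)) {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    // Downloads the page body. MalformedURLException is a subclass of
    // IOException, so a bad URL and a failed connection both reach the caller.
    // Assumption: the response is decoded as UTF-8.
    static String fetch(String url) throws IOException {
        URL urlObj = new URL(url);
        return readAll(new InputStreamReader(urlObj.openStream(), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java PageFetcher <url>");
            return;
        }
        try {
            String html = fetch(args[0]);
            System.out.println(html.length() + " characters downloaded");
        } catch (IOException e) {
            // The failure is reported here instead of being turned into "".
            System.out.println("Could not download the page: " + e);
        }
    }
}
```

Whichever way you fetch the page, it is still worth checking the result of select(...).first() for null before calling text() on it, since a page without a matching paragraph would fail the same way.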