2016-07-04 66 views

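I am using the JSOUP package to fetch search results and pick out a specific title, such as a Facebook title. The code below prints the title of each result. From those titles I want to select the Facebook URL. How can I split out a word using a Java regular expression?

Program: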

package googlesearch; 

import java.io.IOException; 
import java.net.URLDecoder; 
import java.net.URLEncoder; 
import java.util.regex.Matcher; 
import java.util.regex.Pattern; 

import org.jsoup.Jsoup; 
import org.jsoup.nodes.Element; 
import org.jsoup.select.Elements; 

public class SearchRegexDiv {
    private static String REGEX = ".?[facebook]";

    public static void main(String[] args) throws IOException {

        Pattern p = Pattern.compile(REGEX);
        String google = "http://www.google.com/search?q=";
        //String search = "stackoverflow";
        String search = "hortonworks";
        String charset = "UTF-8";
        String userAgent = "ExampleBot 1.0 (+http://example.com/bot)"; // Change this to your company's name and bot homepage!

        Elements links = Jsoup.connect(google + URLEncoder.encode(search, charset)).userAgent(userAgent).get().select(".g>.r>a");

        for (Element link : links) {
            String title = link.text();
            String url = link.absUrl("href"); // Google returns URLs in format "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>".
            url = URLDecoder.decode(url.substring(url.indexOf('=') + 1, url.indexOf('&')), "UTF-8");

            if (!url.startsWith("http")) {
                continue; // Ads/news/etc.
            }

            //.?facebook
            if (title.matches(REGEX)) {
                System.out.println("Done");
                title.substring(title.lastIndexOf(" ") + 1); //split the String
                //(example.substring(example.lastIndexOf(" ") + 1));
            }
            System.out.println("Title: " + title);

            System.out.println("URL: " + url);
        }
    }
}

OUTPUT:

Title: Hortonworks - Facebook logo URL: https://www.facebook.com/hortonworks/

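The output gives me a list of titles and URLs in the format above.

I want to match the titles that contain the word Facebook and split each match into two strings, like this:

String social_media = "facebook";

String org = "hortonworks";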

Java is not JavaScript; I removed the tag. – mplungjan

Maybe I am missing something, but what does this have to do with Perl? I removed the perl tag. –

Maybe a Perl regex guru would be useful :) – mplungjan

Answers

Use this code to split your String on more than one delimiter character.

Here is a demo that splits a string on several delimiter characters at once:

String word = "https://www.facebook.com/hortonworks/";
// Split on '/' or '.'; inside a character class the dot is a literal dot.
String[] array = word.split("[/.]");
for (String each1 : array) {
    System.out.println(each1);
}

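The output puts each token on its own line. The blank line after https: is the empty string between the two slashes of "//"; the trailing empty string after the final slash is dropped by split:

https:

www
facebook
com
hortonworks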
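
To tie this back to the question, here is a minimal sketch of one possible way to get both strings. The pattern ".?[facebook]" in the question is an optional character followed by a single character from the set f, a, c, e, b, o, k, and String.matches only succeeds when the whole string matches, so a full title never passes that test. The sketch searches the title with Matcher.find instead and pulls the two values out of the URL with capture groups. The class name FacebookTitleSplit, the method split, and both patterns are assumptions made for illustration; they are not part of the original code, and the URL pattern only covers simple addresses of the form shown in the output above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FacebookTitleSplit {
    // Finds the word "facebook" anywhere in a title, ignoring case.
    private static final Pattern FACEBOOK_WORD =
            Pattern.compile("\\bfacebook\\b", Pattern.CASE_INSENSITIVE);

    // Group 1: site name (host without "www." and without the suffix).
    // Group 2: first path segment, assumed here to be the organisation.
    private static final Pattern SITE_AND_ORG =
            Pattern.compile("^https?://(?:www\\.)?([^./]+)\\.[^/]+/([^/?#]+)");

    /** Returns {socialMedia, org}, or null when the title or URL does not fit. */
    static String[] split(String title, String url) {
        if (!FACEBOOK_WORD.matcher(title).find()) {
            return null; // title does not mention Facebook
        }
        Matcher m = SITE_AND_ORG.matcher(url);
        if (!m.find()) {
            return null; // URL is not of the assumed shape
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = split("Hortonworks - Facebook logo",
                "https://www.facebook.com/hortonworks/");
        if (parts != null) {
            System.out.println("social_media = " + parts[0]); // social_media = facebook
            System.out.println("org = " + parts[1]);          // org = hortonworks
        }
    }
}
```

Inside the loop from the question, the call would be split(title, url) for each result. A host with a different subdomain (for example a mobile one) would put that subdomain in the first group, so the pattern would need adjusting for such cases.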
Thanks @Kashyap, it helped me. –
