C# Regex.Match problem

2010-02-16 48 views 0 likes

I'm having trouble with a regex match while extracting data from an HTML page. Here is what I have so far, but plainText stays empty:

private void Scrape() 
{ 
    // create variables 
    string html; 
    string plainText; 

    // download page source 
    // sample URL: http://freekeywords.wordtracker.com/?seed=test&adult_filter=remove_offensive&suggest=Hit+Me"; 
    html = webBrowser1.Document.Body.InnerText; 

    // scrape keywords 
    plainText = Regex.Match(html, @"class='k'[^x]display: none""", RegexOptions.IgnoreCase).Groups[1].Value; 

    //plainText = Regex.Replace(plainText, @"\,", Environment.NewLine); 
    //plainText = Regex.Replace(plainText, @"""", ""); 

    this.richTextBox1.Text = html; 
}

Source

2010-02-16 Sanju

Is there a good reason to parse the HTML with a regular expression rather than an HTML parser?

Answers

You are trying to read the value of the group at index 1, but your regular expression doesn't contain any groups. Use Groups[0], or simply Match.Value.
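To make the difference concrete, here is a minimal sketch. The sample string and the two patterns are assumptions for illustration, since the real page markup isn't shown in the question. It compares a pattern without parentheses, where only group 0 exists, with one that adds a capture group.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Hypothetical sample input; the actual page source is not shown in the question.
    public const string Sample = "<td class='k'>test keyword</td>";

    // No parentheses in the pattern, so only group 0 (the whole match) exists.
    public static Match WithoutGroup(string input) =>
        Regex.Match(input, @"class='k'>[^<]*<", RegexOptions.IgnoreCase);

    // Parentheses add capture group 1 around the text of interest.
    public static Match WithGroup(string input) =>
        Regex.Match(input, @"class='k'>([^<]*)<", RegexOptions.IgnoreCase);

    public static void Main()
    {
        Match noGroup = WithoutGroup(Sample);
        Console.WriteLine(noGroup.Value);             // whole match
        Console.WriteLine(noGroup.Groups[0].Value);   // same text as noGroup.Value
        Console.WriteLine(noGroup.Groups[1].Success); // False: group 1 does not exist
        Console.WriteLine(noGroup.Groups[1].Value.Length); // 0: empty string, no exception

        Match withGroup = WithGroup(Sample);
        Console.WriteLine(withGroup.Groups[1].Value); // only the captured text
    }
}
```

Asking for a group that doesn't exist doesn't throw. It returns an unsuccessful group whose Value is an empty string, which would explain why plainText ends up empty without any error.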

Source

2010-02-16 10:22:45 necrostaz

By the way, I doubt your HTML really contains a fragment like class='k'[not x]display: none" – necrostaz
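The point of that comment is that [^x] matches exactly one character that isn't x, not an arbitrary run of text. A small sketch follows, assuming made-up markup (the real page isn't shown); the lazy `.*?` variant is just one possible alternative, not necessarily what the asker needs.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NegatedClassDemo
{
    // The pattern from the question: [^x] consumes exactly one non-x character.
    public static bool SingleChar(string input) =>
        Regex.IsMatch(input, @"class='k'[^x]display: none""", RegexOptions.IgnoreCase);

    // Illustrative alternative: .*? lazily consumes any run of characters on the line.
    public static bool AnyRun(string input) =>
        Regex.IsMatch(input, @"class='k'.*?display: none""", RegexOptions.IgnoreCase);

    public static void Main()
    {
        // Hypothetical markup with several characters between the two literals.
        string html = "<span class='k' style=\"display: none\">";

        Console.WriteLine(SingleChar(html)); // False: more than one character in between
        Console.WriteLine(AnyRun(html));     // True

        // The original pattern only matches when exactly one character separates them.
        Console.WriteLine(SingleChar("class='k' display: none\"")); // True
    }
}
```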
