How do I really split a string into a string array in C# without losing parts of it?

What I have

string ImageRegPattern = @"http://[\w\.\/]*\.jpg|http://[\w\.\/]*\.png|http://[\w\.\/]*\.gif"; 
string a ="http://www.dsa.com/asd/jpg/good.jpgThis is a good dayhttp://www.a.com/b.pngWe are the Best friendshttp://www.c.com";

What I want

string[] s; 
s[0] = "http://www.dsa.com/asd/jpg/good.jpg"; 
s[1] = "This is a good day"; 
s[2] = "http://www.a.com/b.png"; 
s[3] = "We are the Best friendshttp://www.c.com";

BONUS:
It would be even better if the trailing URL could also be split off as shown below, but it is fine if it can't.

s[3] = "We are the Best friends"; 
s[4] = "http://www.c.com";

The problem
I tried to split the string with the code below:

string[] s= Regex.Split(sourceString, ImageRegPattern, RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);

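But the result is not good: Split seems to throw away every substring that matches ImageRegPattern, and I want those kept. I checked the Regex page on MSDN and could not find a method that fits my need. So how do I do this?

Source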

2013-05-29 Albert Gao

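I don't think there is any general solution for splitting that string (you could of course work out some method to do it, but it would be very specific). You get nothing back from Regex because it splits on the matches. Personally I would change the format of the string, unless there is a good reason not to add a delimiter to it. – evanmcdonnal

Given a comma-separated list, 'Regex.Split("1,2,3", ",")' returns the array '["1", "2", "3"]'. The pattern you supply defines the delimiters, not what you want to keep. 'Regex.Split' is not what you want here. You are trying to keep both the text *and* the delimiters, which is not what 'Split' does. –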
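
One documented behavior worth knowing alongside the comment above: when the pattern contains a capturing group, .NET's Regex.Split includes the captured text in the returned array. The sketch below wraps a pattern in one capturing group to get an inclusive split. The class and method names are made up for illustration, and it assumes the pattern passed in has no capturing groups of its own (true for the ImageRegPattern in the question).

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any answer in this thread.
public static class CaptureSplitDemo
{
    // Plain split: the delimiters are consumed and do not appear in the result.
    public static string[] Plain(string input, string pattern)
    {
        return Regex.Split(input, pattern);
    }

    // Wrapping the whole pattern in (...) makes Regex.Split emit each
    // delimiter as its own element. Assumes `pattern` contains no other
    // capturing groups, otherwise their captures would be emitted too.
    public static string[] KeepDelimiters(string input, string pattern)
    {
        return Regex.Split(input, "(" + pattern + ")", RegexOptions.IgnoreCase)
                    .Where(part => part.Length > 0) // a match at index 0 yields a leading ""
                    .ToArray();
    }
}
```

With the string and pattern from the question, KeepDelimiters should give the four elements listed under "What I want"; the trailing http://www.c.com stays attached to the last text part because the pattern only matches image URLs.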

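You need something like this method, which first finds all the matches and then collects them into a list together with the unmatched strings between them.

Update: added handling for the case where no match is found.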

using System.Collections.Generic;
using System.Text.RegularExpressions;

private static IEnumerable<string> InclusiveSplit(string source, string pattern)
{
    List<string> parts = new List<string>();
    int currIndex = 0;

    // First, find all the matches. These are your separators.
    MatchCollection matches = Regex.Matches(
        source, pattern,
        RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);

    // If there are no matches, there's nothing to split, so just return a
    // collection with just the source string in it.
    if (matches.Count < 1)
    {
        parts.Add(source);
    }
    else
    {
        foreach (Match match in matches)
        {
            // If the match begins after our current index, we need to add the
            // portion of the source string between the last match and the
            // current match.
            if (match.Index > currIndex)
            {
                parts.Add(source.Substring(currIndex, match.Index - currIndex));
            }

            // Add the matched value, of course, to make the split inclusive.
            parts.Add(match.Value);

            // Update the current index so we know if the next match has an
            // unmatched substring before it.
            currIndex = match.Index + match.Length;
        }

        // Finally, check if there is a bit of unmatched string at the end of
        // the source string.
        if (currIndex < source.Length)
        {
            parts.Add(source.Substring(currIndex));
        }
    }

    return parts;
}

For your example input, the output looks like this:

[0] "http://www.dsa.com/asd/jpg/good.jpg" 
[1] "This is a good day" 
[2] "http://www.a.com/b.png" 
[3] "We are the Best friendshttp://www.c.com"

Source

2013-05-29 19:01:06 FishBasketGordo

Thank you very much! –

I think you need a multi-step process: first insert a delimiter around each match, then split on that delimiter with String.Split:

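// Wrap every image URL in '|' so it can be split off afterwards.
string resultString = Regex.Replace(rawString, @"(http://.*?/\w+\.(jpg|png|gif))", "|$1|", RegexOptions.IgnoreCase);
// A URL at the very start would otherwise produce an empty first element.
if (resultString.StartsWith("|"))
    resultString = resultString.Substring(1);
string[] parts = resultString.Split('|');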

Source

2013-05-29 18:59:53

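The most obvious answer here is of course not to use split at all, but to match the image patterns and retrieve them. That said, using split is not impossible.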

string ImageRegPattern = @"(?=http://[\w./]*?\.(?:jpg|png|gif))|(?<=\.(?:jpg|png|gif))";

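This matches at any point in the string that is either followed by an image URL or preceded by .jpg, .gif or .png.

I really don't recommend doing this; I'm just saying you can.

Source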
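
A minimal sketch of how such a lookaround pattern could be fed to Regex.Split. The class name is hypothetical, and the pattern here uses non-capturing groups, an adjustment made on the assumption that .NET's Regex.Split would otherwise copy captured text into the result. Because every split point is zero-width, nothing is consumed; empty pieces (for example when the input starts with a URL) are filtered out.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper for illustration only.
public static class LookaroundSplitDemo
{
    // Zero-width split points: just before an image URL,
    // or just after an image file extension.
    private const string SplitPoints =
        @"(?=http://[\w./]*?\.(?:jpg|png|gif))|(?<=\.(?:jpg|png|gif))";

    public static string[] Split(string source)
    {
        return Regex.Split(source, SplitPoints)
                    .Where(part => part.Length > 0) // drop empty pieces at the edges
                    .ToArray();
    }
}
```

Note that the lookbehind splits after any ".jpg", ".png" or ".gif", not only after one that ends a URL, so ordinary text containing such an extension would also be cut there.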

2013-05-29 18:59:53 melwil

One does not simply underestimate the power of regex:

(.*?)([A-Z][\w\s]+(?=http|$))

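Explanation:

(.*?): group matching everything until an uppercase letter is found; this group holds the URL
(: start of group
- [A-Z]: match one uppercase letter
- [\w\s]+: match word characters (letters, 0-9, _) or whitespace (\n, \r, \t, \f, space) one or more times
- (?=http|$): lookahead, checks that what follows is http or the end of the line
- ): close the group (this is where you will find the text)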

Online demo

Note: this solution matches the string rather than splitting it.

Source
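
A sketch of how the two groups might be collected in C#, assuming the pattern above is used unchanged. The class and method names are hypothetical. The pattern depends on each text segment starting with an uppercase letter and on the URLs containing none, and it does not match a trailing URL that has no text after it, so the sketch appends whatever remains after the last match as a separate element.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class MatchBasedSplit
{
    public static List<string> Split(string source)
    {
        var parts = new List<string>();
        int end = 0; // index just past the last match

        foreach (Match m in Regex.Matches(source, @"(.*?)([A-Z][\w\s]+(?=http|$))"))
        {
            // Group 1: whatever precedes the text (the URL in the example input).
            if (m.Groups[1].Length > 0)
                parts.Add(m.Groups[1].Value);

            // Group 2: the text segment that starts with an uppercase letter.
            parts.Add(m.Groups[2].Value);

            end = m.Index + m.Length;
        }

        // Anything after the last match (e.g. a final URL) is not covered
        // by the pattern, so keep it as its own element.
        if (end < source.Length)
            parts.Add(source.Substring(end));

        return parts;
    }
}
```

For the example string this should also cover the BONUS case from the question, with "We are the Best friends" and "http://www.c.com" as separate elements.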

2013-05-29 19:15:54 HamZa
