2013-06-20 29 views
0

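Regex: matching and extracting with complex conditions

I want to write a regular expression that matches these conditions:

  • A maximum of 8000 characters (any character, including "\r\n")
  • At most 10 lines (separated by \r\n)
  • Extract only the first 4 lines from the matched text

I can't find a good way to do this... :/

Thanks!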

+0

There are ways to do this with a regular expression, but none of them are **good** ways.

+0

Which language/tool are you using? Also, you want to extract the first 4 lines: what should happen if there are fewer than 4 lines? – ridgerunner

Answers

1

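A regular expression is not what you need. Regular expressions are meant for matching a pattern, not for checking a length. If you keep the data in a string, use myString.length <= 8000 (with the correct syntax for your language). For the line count, count the number of \r\n sequences in the string (this can be done iteratively). To get the first four lines, find the 4th \r\n and use a substring method to take everything before it.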
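The steps in this answer can be sketched without any regular expression. The following is a minimal Python sketch, not the answerer's own code: the function name `first_lines_if_valid` and its parameters are made up for illustration. It assumes that lines are separated strictly by "\r\n", that a trailing "\r\n" starts one more (empty) line, and that inputs with fewer than 4 lines are returned whole, since the question does not say what should happen in that case.

```python
def first_lines_if_valid(text, max_chars=8000, max_lines=10, keep=4):
    """Return the first `keep` lines of `text`, or None if a limit is exceeded.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Length check: every character counts, including "\r" and "\n".
    if len(text) > max_chars:
        return None

    # N separators means N + 1 lines (assumption: a trailing "\r\n"
    # starts one more, empty, line).
    lines = text.split("\r\n")
    if len(lines) > max_lines:
        return None

    # Slicing is safe when there are fewer than `keep` lines.
    return "\r\n".join(lines[:keep])
```

For example, `first_lines_if_valid("a\r\nb\r\nc\r\nd\r\ne")` keeps only the part up to and including `d`, while an 11-line input yields `None`.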

+0

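-1 Misinformation. A repetition expression in a regex such as '.{0,8000}' lets you match between 0 and 8000 characters. If you want to match an exact number of characters, you can use '.{8000}'. But don't take my word for it, you can read more about it at http://www.regular-expressions.info/repeat.html.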
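To illustrate the bounded repetition mentioned in this comment, here is a small Python sketch. The names `AT_MOST_8000` and `within_limit` are invented for the example. It assumes line breaks should count toward the limit, so it enables `re.DOTALL` to let `.` match them.

```python
import re

# re.DOTALL lets "." match "\r" and "\n" as well, so line breaks count
# toward the 8000-character limit (an assumption based on the question).
AT_MOST_8000 = re.compile(r".{0,8000}", re.DOTALL)


def within_limit(text):
    # fullmatch succeeds only if the entire string fits the pattern.
    return AT_MOST_8000.fullmatch(text) is not None
```

Here `within_limit("x" * 8000)` is true and `within_limit("x" * 8001)` is false. This checks the length only; it says nothing about the number of lines.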

1

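Description

This expression does the following:

  • Validates that the input string is between zero and 8000 characters long
  • Validates that the text has at most 10 newline-delimited lines
  • Then captures the first 4 newline-delimited lines of text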

\A(?=.{0,8000}\Z)(?=(?:^.*?(?:\r|\n|\Z)){0,10}\Z)(?:^.*?[\r\n\Z]+){0,4}

This requires the options m (multiline) and s (dot matches all characters).


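Expanded

  • \A anchors the match to the start of the string; this kind of anchoring allows the s option to be used, which lets . match carriage-return and line-feed characters
  • (?=.{0,8000}\Z) looks ahead and validates that there are between zero and 8000 characters
  • (?=(?:^.*?(?:\r|\n|\Z)){0,10}\Z) looks ahead and validates that there are no more than 10 newline-delimited lines
  • (?:^.*?[\r\n\Z]+){0,4} matches the first 4 lines of text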
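The same three-part structure (length lookahead, line-count lookahead, then the first lines) can be tried in Python. The sketch below is an adaptation under stated assumptions, not the expression from this answer. The names `LINE`, `PATTERN` and `first_four_lines` are made up for the example. It accepts "\r\n", "\r" or "\n" as a line break, and a trailing line break ends the last line instead of starting a new one. It makes two deliberate changes:

  • It uses `[^\r\n]*` for "the rest of the line" instead of `.*?` with the s option. With that option on, `.` can also run across line breaks, so the line-count check is worth testing against an 11-line input in your engine.
  • It keeps `\Z` outside of any character class, because inside a class it does not act as an end-of-string anchor.

```python
import re

# One line: any run of non-line-break characters, followed by a line break
# ("\r\n", "\r" or "\n") or by the end of the text.
LINE = r"[^\r\n]*(?:\r\n?|\n|\Z)"

PATTERN = re.compile("".join([
    r"\A",                          # start of the text
    r"(?=[\s\S]{0,8000}\Z)",        # lookahead: at most 8000 characters in total
    r"(?=(?:" + LINE + r"){0,10}\Z)",  # lookahead: at most 10 lines in total
    r"((?:" + LINE + r"){0,4})",    # group 1: the first (up to) 4 lines
]))


def first_four_lines(text):
    """Return the first (up to) 4 lines, or None if a limit is exceeded."""
    match = PATTERN.match(text)
    return match.group(1) if match else None
```

As in the PHP output further down, the extracted text keeps the line break that ends the fourth line. Inputs with fewer than 4 lines are returned whole, which is an assumption, since the question leaves that case open.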

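PHP Code Example:

No language was specified, so I am including this PHP example to show how it works, along with sample output.

Input Text

This test input is a newline-delimited string of 8 lines. It contains only 1779 characters.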

Far far away, behind the word mountains, far from the countries Vokalia and Consonantia, there live the blind texts. Separated they live in Bookmarksgrove right at the coast of the Semantics, a large language ocean. A small 
river named Duden flows by their place and supplies it with the necessary regelialia. It is a paradisematic country, in which roasted parts of sentences fly into your mouth. Even the all-powerful Pointing has no control about 
the blind texts it is an almost unorthographic life One day however a small line of blind text by the name of Lorem Ipsum decided to leave for the far World of Grammar. The Big Oxmox advised her not to do so, because there were 
thousands of bad Commas, wild Question Marks and devious Semikoli, but the Little Blind Text didn’t listen. She packed her seven versalia, put her initial into the belt and made herself on the way. When she reached the first hills of 
the Italic Mountains, she had a last view back on the skyline of her hometown Bookmarksgrove, the headline of Alphabet Village and the subline of her own road, the Line Lane. Pityful a rethoric question ran over her cheek, then 
she continued her way. On her way she met a copy. The copy warned the Little Blind Text, that where it came from it would have been rewritten a thousand times and everything that was left from its origin would be the word "and" 
and the Little Blind Text should turn around and return to its own, safe country. But nothing the copy said could convince her and so it didn’t take long until a few insidious Copy Writers ambushed her, made her drunk with Longe 
and Parole and dragged her into their agency, where they abused her for their projects again and again. And if she hasn’t been rewritten, then they are still using her. 

Code

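<?php
// Replace this placeholder with the text to validate.
$sourcestring="your source string";
// Flags: i = case-insensitive, m = multiline (^ matches at line starts), s = dot matches line breaks.
preg_match('/\A(?=.{0,8000}\Z)(?=(?:^.*?(?:\r|\n|\Z)){0,10}\Z)(?:^.*?[\r\n\Z]+){0,4}/ims',$sourcestring,$matches);
// $matches[0] holds the extracted lines; <pre> preserves the line breaks in a browser.
echo "<pre>".print_r($matches,true);
?>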

Matches

$matches Array: 
(
    [0] => Far far away, behind the word mountains, far from the countries Vokalia and Consonantia, there live the blind texts. Separated they live in Bookmarksgrove right at the coast of the Semantics, a large language ocean. A small 
river named Duden flows by their place and supplies it with the necessary regelialia. It is a paradisematic country, in which roasted parts of sentences fly into your mouth. Even the all-powerful Pointing has no control about 
the blind texts it is an almost unorthographic life One day however a small line of blind text by the name of Lorem Ipsum decided to leave for the far World of Grammar. The Big Oxmox advised her not to do so, because there were 
thousands of bad Commas, wild Question Marks and devious Semikoli, but the Little Blind Text didn’t listen. She packed her seven versalia, put her initial into the belt and made herself on the way. When she reached the first hills of 

)