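Find certain/specific line breaks while ignoring others

I have some SRT data that comes back with \r and \n tags as line breaks in the middle of each sentence. How can I find only the \r and \n tags in the middle of the text/sentence, and not the ones that represent the other line breaks?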

Example source:

18 
00:00:50,040 --> 00:00:51,890 
All the women gather 
at the hair salon, 

19 
00:00:52,080 --> 00:00:56,210 
all the mothers and daughters 
and they dye their hair orange.

Desired output:

18 
00:00:50,040 --> 00:00:51,890 
All the women gather at the hair salon, 

19 
00:00:52,080 --> 00:00:56,210 
all the mothers and daughters and they dye their hair orange.

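I'm absolutely terrible at regex, but my best guess (to no avail) was something like

var reg = /[\d\r][a-zA-z0-9\s+]+[\r]/

and then doing a split() to remove the \r in the middle of one of the values. I'm sure this isn't even close to the right way, so... stackoverflow! :)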

Source

2012-10-22 Jason

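http://regexpal.com/ is your friend, as well! –

What if the sentence is 'the women with orange hair'? How would you tell the line break after a '3' apart from the line break after the segment number (or whatever it is)? Can we assume that each block is always a number on a line by itself, then the 00:00:52,080 --> 00:00:56,210 bit on a line by itself, then one or more lines of text (which is where the line breaks need to be removed), then a blank line? – nnnnnn

Exactly! That's why it's so tricky. But yes, we can assume the cue line (i.e. "18") will always be on its own line, and the time range will always be on one line. The only thing that may span two lines is the text. Does that help??? – Jason

This will match the line breaks you want to get rid of, capture the characters immediately before and after them, and put those two back with a space in between: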

var regex = /([a-z,.;:'"])(?:\r\n?|\n)([a-z])/gi; 
str = str.replace(regex, '$1 $2');

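A few things about the regular expression. I use the modifiers i and g to make it case-insensitive and to find all line breaks in the string instead of stopping after the first one. It also assumes that a removable line break can occur after a letter, comma, period, semicolon, colon, or single or double quote, and before another letter. As @nnnnnn mentioned in the comments above, this will not cover all possible sentences, but it should at least not choke on most punctuation. The line break has to be a single line break, but it is platform-independent (it can be \r, \n or \r\n). I capture the character before the line break and the letter after it (with parentheses), so I can use $1 and $2 in the replacement string to access them. That's basically it.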
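To see how this pattern behaves, here is a small sketch that runs it over a made-up sample with mixed \r\n and \n line endings. The sample string and the variable names `sample`, `joined` and `edgeCase` are assumptions for illustration only. It also shows the case raised in the comments: a text line ending in a digit is left alone, because digits are not in the first character class.

```javascript
// Same pattern and replacement as in the answer above.
var regex = /([a-z,.;:'"])(?:\r\n?|\n)([a-z])/gi;

// Hypothetical sample: first cue uses \r\n, second cue uses \n.
var sample =
  "18\r\n" +
  "00:00:50,040 --> 00:00:51,890\r\n" +
  "All the women gather\r\n" +
  "at the hair salon,\r\n" +
  "\r\n" +
  "19\n" +
  "00:00:52,080 --> 00:00:56,210\n" +
  "all the mothers and daughters\n" +
  "and they dye their hair orange.";

// Only breaks that sit between an allowed character and a letter are replaced.
var joined = sample.replace(regex, '$1 $2');
console.log(joined);

// Edge case: the first line ends in a digit, so this break is NOT removed.
var edgeCase = "I counted 3\nwomen".replace(regex, '$1 $2');
console.log(JSON.stringify(edgeCase));
```

The breaks after the cue number and after the time range survive because both lines end in a digit, which is the same reason the edge case is not joined.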

Source

2012-10-22 20:57:05

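OK, first of all: thank you. That stuff is serious voodoo to me. It's SUPER close, but it looks like it kills the character after the break (off by one). For example, "19" now looks like this: 00:00:52,080 --> 00:00:56,210 all the mothers and daughters nd they dye their hair orange. (See how it's missing the "a", but still has the break?) – Jason

Oh, sorry. I added a capture group later and forgot to adjust the replacement. I'll edit the answer in a second! –

This regex should do the trick: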

/(\d+\r\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\r)([^\r]+)\r([^\r]+)(\r|$)/g

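To make this work for more lines (it has to be a fixed number), just add more ([^\r]+)\r groups. (Remember to also add the extra $ references to the replacement, like this for 3 lines: '$1$2 $3 $4\r'.)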
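As an illustration of that extension, here is a sketch of the three-line variant: one extra ([^\r]+)\r group and one extra reference in the replacement. The sample cue and the names `threeLines`, `cue` and `merged` are assumptions, and the input is assumed to use \r as its only line break, as the pattern requires.

```javascript
// Three-text-line variant: the original pattern plus one more ([^\r]+)\r group.
var threeLines = /(\d+\r\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\r)([^\r]+)\r([^\r]+)\r([^\r]+)(\r|$)/g;

// Hypothetical cue with three lines of text, separated by \r only.
var cue =
  "18\r" +
  "00:00:50,040 --> 00:00:51,890\r" +
  "All the women gather\r" +
  "at the hair salon,\r" +
  "and they just talk";

// $1 is the cue number plus time range; $2, $3 and $4 are the text lines.
var merged = cue.replace(threeLines, '$1$2 $3 $4\r');
console.log(JSON.stringify(merged));
```

Note that the replacement always ends with \r, so a cue at the very end of the input gains a trailing \r.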

Usage

mystring = mystring.replace(/(\d+\r\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\r)([^\r]+)\r([^\r]+)(\r|$)/g, '$1$2 $3\r');

Limitations

This will not work if there are more than 2 lines of text.

Example 1

Works fine!

Input:

18 
00:00:50,040 --> 00:00:51,890 
All the women gather 
at the hair salon, 

19 
00:00:52,080 --> 00:00:56,210 
all the mothers and daughters 
and they dye their hair orange.

Output:

18 
00:00:50,040 --> 00:00:51,890 
All the women gather at the hair salon, 

19 
00:00:52,080 --> 00:00:56,210 
all the mothers and daughters and they dye their hair orange.

Example 2

Does not work; more than 2 lines

Input:

18 
00:00:50,040 --> 00:00:51,890 
All the women gather 
at the hair salon, 
and they just talk 

19 
00:00:52,080 --> 00:00:56,210 
all the mothers and daughters 
and they dye their hair orange. 
Except for Maria who dyes it pink.

Output:

18 
00:00:50,040 --> 00:00:51,890 
All the women gather at the hair salon, 
and they just talk 

19 
00:00:52,080 --> 00:00:56,210 
all the mothers and daughters and they dye their hair orange. 
Except for Maria who dyes it pink.
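
One possible way around the fixed line count, sketched here under the block structure agreed on in the comments (a number line, a time-range line, one or more text lines, then a blank line), is to split the data into cues and join everything after the second line. This is a different technique from the regexes above, not part of either answer; `joinCueText` is a hypothetical helper name, and the sketch normalizes all line breaks to \n and trims trailing spaces.

```javascript
// Sketch: join every text line of each cue, however many there are.
// joinCueText is a hypothetical name used only for this illustration.
function joinCueText(srt) {
  return srt
    .replace(/\r\n?/g, '\n')                 // normalize \r\n and lone \r to \n
    .split(/\n\s*\n/)                        // blank lines separate cues
    .filter(function (block) { return block.trim() !== ''; })
    .map(function (block) {
      var lines = block.split('\n')
        .map(function (line) { return line.trim(); })
        .filter(function (line) { return line !== ''; });
      // Fewer than three lines means there is no text to join.
      if (lines.length < 3) return lines.join('\n');
      // lines[0] is the cue number, lines[1] the time range, the rest is text.
      return lines[0] + '\n' + lines[1] + '\n' + lines.slice(2).join(' ');
    })
    .join('\n\n');
}

// The input from Example 2, written with \r\n line breaks.
var example2 =
  "18\r\n00:00:50,040 --> 00:00:51,890\r\n" +
  "All the women gather\r\nat the hair salon,\r\nand they just talk\r\n\r\n" +
  "19\r\n00:00:52,080 --> 00:00:56,210\r\n" +
  "all the mothers and daughters\r\nand they dye their hair orange.\r\n" +
  "Except for Maria who dyes it pink.";

var fixed = joinCueText(example2);
console.log(fixed);
```

Because the split relies on position within the cue rather than on which characters surround the break, a text line that ends in a digit or starts with punctuation is joined as well.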

Source

2012-10-22 21:13:44 h2ooooooo
