What does this regex mean, and why?

sed "s/\(^[a-z,0-9]*\)\(.*\)\( [a-z,0-9]*$\)/\1\2 \1/g" desired_file_name

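I would appreciate it even if you only explained it in words, or explained parts of it structurally, as in s\alphanumerical_at_start\something\alphanumerical_at_end\something_else\global.

Can someone explain what this means and why, and are all regexes this... horrible?

I know it will replace the first lowercase alphanumeric word with the last one. But can you explain what is going on here? What are all the /\( and \(.*\) and everything else?

I am just lost.

Edit: Here is what I have worked out so far: (^[a-z0-9]*) means starting with a through z and 0 through 9; and [a-z,0-9]*$ is the same, but for the last word (but does [0-9,a-z] mean just the first 2 characters, the first character, or the whole word?). Also: what does * or \(.*\) even mean?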


2012-06-05 Kalec

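*"Why, and are all regexes this... horrible"* cannot be your serious question. Things do not automatically become horrible just because they escape your understanding. – Tomalak

Lol @ "your level of understanding". No, of course not (I am here to learn, after all). They are horrible because... well, look at it. It is 100% repulsive. For example, I think Python code looks amazing (because it is so easy to read), while C++ is less appealing (though not that bad). This, on the other hand... *shrugs*. But if you have no useful comment, please don't comment in the first place, thanks :) – Kalec

I don't find regular expressions repulsive. I find them elegant and beautiful. –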

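This is a sed search and replace, which has the form s/search/replace/flags. The only flag here is g, which means the search/replace is global, so it applies to every match on a line rather than only the first one.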
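To see what the g flag changes, here is a minimal sketch. The input line a-b-c and the variable names are made up for illustration and are not from the question:

```shell
# Without g: only the first match on the line is replaced.
first_only=$(printf 'a-b-c\n' | sed 's/-/+/')

# With g: every match on the line is replaced.
every_match=$(printf 'a-b-c\n' | sed 's/-/+/g')

printf '%s\n' "$first_only"    # a+b-c
printf '%s\n' "$every_match"   # a+b+c
```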

First, here is the regex it searches for:

\(^[a-z,0-9]*\)\(.*\)\( [a-z,0-9]*$\)

Or in a more readable format:

\(   # start capture group 1 
^   # match at the beginning of the line 
    [a-z,0-9]*  # zero or more alphanumeric or comma characters (lowercase only) 
\)    # end capture group 1 
\(   # start capture group 2 
    .*    # zero or more of any character (except for newlines) 
\)    # end capture group 2 
\(   # start capture group 3 
    [ ]   # literal ' ' character (I added brackets for clarity) 
    [a-z,0-9]*  # zero or more alphanumeric or comma characters (lowercase only) 
    $    # match at the end of the line 
\)    # end capture group 3
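One way to check a breakdown like this is to print each capture group inside brackets. This is only a sketch: it assumes the pattern contains the literal space at the start of group 3, as described above, and the sample line foo bar baz and the variable names are invented for illustration.

```shell
# The search pattern, with the literal space at the start of group 3.
pattern='\(^[a-z,0-9]*\)\(.*\)\( [a-z,0-9]*$\)'

# Replace the line with its three groups, each wrapped in brackets.
groups=$(printf 'foo bar baz\n' | sed "s/$pattern/[\1][\2][\3]/")

printf '%s\n' "$groups"   # [foo][ bar][ baz]
```

Group 2 takes as much as it can while still leaving a space and a final word for group 3, which is why it ends up holding only the middle of the line.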

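Here is what it is replaced with:

\1\2 \1

This replaces the entire line (because of the ^ and $ anchors in the regex) with the contents of capture group 1, followed by the contents of capture group 2, followed by a space, followed by the contents of capture group 1.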
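A small sketch of the whole substitution in action. It reads from a pipe instead of desired_file_name, and assumes the pattern includes the literal space in group 3. The sample lines and variable names are made up for illustration:

```shell
# The full sed script, single-quoted so the shell leaves it untouched.
script='s/\(^[a-z,0-9]*\)\(.*\)\( [a-z,0-9]*$\)/\1\2 \1/g'

# A line with several words: the last word is replaced by a copy of the first.
matched=$(printf 'foo bar baz\n' | sed "$script")

# A line with no space cannot satisfy group 3, so it passes through unchanged.
unmatched=$(printf 'foo\n' | sed "$script")

printf '%s\n' "$matched"     # foo bar foo
printf '%s\n' "$unmatched"   # foo
```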


2012-06-05 16:59:07

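Can you explain why we need to escape the parentheses? I know we need to, but I don't know why, so I'm asking. – nhahtdh

The [ ] for ' ' was a great help. I just need one thing clarified: does [0-9,a-z] stand for a whole word matching those criteria, up to the first whitespace? – Kalec

[0-9,a-z] matches only a single character. You need [0-9,a-z]* for any number of characters, up to the space – solidau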
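The difference between the bare bracket expression and the starred one can be sketched like this (the input abc def and the variable names are invented for illustration):

```shell
# A bracket expression on its own matches exactly one character.
one_char=$(printf 'abc def\n' | sed 's/[0-9,a-z]/X/')

# With *, it matches a run of such characters, stopping at the space.
whole_run=$(printf 'abc def\n' | sed 's/[0-9,a-z]*/X/')

printf '%s\n' "$one_char"    # Xbc def
printf '%s\n' "$whole_run"   # X def
```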

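\(^[a-z,0-9]*\) - alphanumerics or commas at the start of the line (group 1)
\(.*\) - any characters (group 2)
\( [a-z,0-9]*$\) - a space followed by zero or more alphanumerics or commas [I guess the comma is just a mistake], up to the end of the line
\1\2 \1 - replace with (group 1)(group 2) space (group 1)
g - everywhere in the input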
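Inside a bracket expression a comma is an ordinary character, not a separator, so it really is matched. A quick sketch, with a made-up input line and hypothetical variable names:

```shell
# With the comma in the class, the leading run includes the comma.
with_comma=$(printf 'a,b c\n' | sed 's/^[a-z,0-9]*/X/')

# Without it, the run stops at the comma.
without_comma=$(printf 'a,b c\n' | sed 's/^[a-z0-9]*/X/')

printf '%s\n' "$with_comma"      # X c
printf '%s\n' "$without_comma"   # X,b c
```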


2012-06-05 16:52:00

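\1\2 \1 I still don't understand. What is replaced with what? And '\(.*\)' is the most confusing part for me – Kalec

'\1' in a regex is a back-reference to a capture group: /(\w+)0\1/ matches any word that has the pattern 'part0part' – gaussblurinc

@Alexander: Several backslashes and asterisks did not show up in the question because the OP did not use StackOverflow's excellent code formatting feature. Check again, and please use code formatting yourself. –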

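Regular expressions are a way of describing regular grammars. They do this in a very concise and efficient way, which makes them look complicated.

They are also structured and decodable.

First, there is the sed call.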

sed "{operation}/{expression}/{replacement}/{modifiers}" {argument}

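Notes

The parts of the sed expression are separated by slashes. This means you cannot have an unescaped forward slash in {expression} or {replacement}.
Unlike most other regex flavors, sed uses plain parentheses to match literal parentheses and escaped parentheses to define capture groups.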
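Both notes can be tried out directly. The following is a sketch using sed's default syntax; the sample inputs and variable names are made up for illustration:

```shell
# Plain parentheses match literal parentheses.
literal=$(printf 'f(x)\n' | sed 's/(x)/[x]/')

# Escaped parentheses define capture groups, here swapped in the replacement.
swapped=$(printf 'ab\n' | sed 's/\(a\)\(b\)/\2\1/')

# A forward slash inside the expression or replacement must be escaped.
path=$(printf '/usr/bin\n' | sed 's/\/usr/\/opt/')

printf '%s\n' "$literal"   # f[x]
printf '%s\n' "$swapped"   # ba
printf '%s\n' "$path"      # /opt/bin
```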

{operation} happens to be s, for substitute.

The {expression} is \(^[a-z,0-9]*\)\(.*\)\( [a-z,0-9]*$\), which breaks down as

 
\(   # start capture group 1 
^   # match the start of the string 
    [a-z,0-9]* # match characters a-z and 0-9 and a comma (!), zero or more times (*) 
\)    # end capture group 1 
\(   # start capture group 2 
    .*   # match any character (.), zero or more times (*) 
\)    # end capture group 2 
\(   # start capture group 3 
       # match a space 
    [a-z,0-9]* # match characters a-z and 0-9 and a comma (!) 
    $   # match the end of the string 
\)    # end capture group 3

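Think for a second about how much code (and time) it would take you to write something that does the same thing, and how little space the regex needs. That is why it is hard to read: it is very compressed.

{replacement} is \1\2 \1. \n is called a back-reference, where n is the number of the capture group. So this inserts the contents of group 1 and group 2, then a space, then the contents of group 1 again.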
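A smaller sketch of back-references in a replacement, showing that a group can be inserted more than once. The pattern, the input key value and the variable name are made up for illustration:

```shell
# Group 1 captures the first word, group 2 the second.
# The replacement writes group 1, group 2, and then group 1 again.
reused=$(printf 'key value\n' | sed 's/\([a-z]*\) \([a-z]*\)/\1 \2 \1/')

printf '%s\n' "$reused"   # key value key
```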

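The {modifiers} part is the g flag, which makes the regex apply as often as possible. In this particular case it does not make much sense, since the regex above can only match once per line.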
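A quick way to check this is to run the substitution with and without g and compare the results. This is a sketch that assumes the pattern includes the literal space in group 3; the sample line and variable names are invented:

```shell
# The same substitution, without the trailing g.
script='s/\(^[a-z,0-9]*\)\(.*\)\( [a-z,0-9]*$\)/\1\2 \1/'

without_g=$(printf 'one two three\n' | sed "$script")
with_g=$(printf 'one two three\n' | sed "${script}g")

# The pattern is anchored at both ends, so both runs give the same line.
printf '%s\n' "$without_g"   # one two one
printf '%s\n' "$with_g"      # one two one
```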


2012-06-05 17:01:46 Tomalak

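If 'operation' were something other than substitute, the syntax would be different, and there would be no 'expression' and 'replacement'. So I would start the explanation with 's/{expression}/{replacement}/{modifiers}'. Anyway, nice answer. –

@Lev: True, I was just trying to separate the individual parts as much as possible. – Tomalak

A bit of a digression, but to support Tomalak's point about the code and time needed to write what a regex can do: this problem (http://uva.onlinejudge.org/external/100/10058.html) can be solved in 20 LOC of (clean) Java code with the help of regex, but takes much more time and code in plain C. – nhahtdh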

s/\(^[a-z,0-9]*\)\(.*\)\( [a-z,0-9]*$\)/\1\2 \1/g

s -> substitute
/ -> begin of regex
\( -> begin of the first field (accessed as \1 later)
^ -> from the beginning of the line in the data
[a-z,0-9] -> list of characters to match: lowercase a through z, comma, and 0 through 9
* -> zero or more times
\) -> end of \1 field
\( -> begin of \2
.* -> . means any character. .* means any character zero or more times
\) -> end of \2
\( [a-z,0-9]*$ -> begin of \3, followed by a space, followed by zero or more a-z, comma, 0-9, up to the end of the line
\) -> end of \3 field
/ -> end of regex to replace

/ -> begin of the replacement text
\1\2 \1 -> first field followed by second field followed by a space and again the first field
/ -> end of the replacement text

g -> globally


2012-06-05 17:03:22 Anil
