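Split an array based on entries in another array
I'm working on some JavaScript that breaks a RegExp down into its basic components and gives a short explanation of what each one does.
My general idea is to split the input string (the RegExp) on the entries of another array.
My current code: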
function interpret(regex){
    // Read the pattern and flags from the RegExp itself; splitting the
    // stringified form on "/" breaks when the pattern contains a slash.
    var body = regex.source;
    var flags = regex.flags;
    var classes = [".","\w","\d","\s","\W","\D","\S","[","]"];
    var classdefs = ["any non-newline character","any word (digit or letter)","any digit (characters 0-9)","any whitespace character","any non-word (non-digit and non-letter)","any non-digit (not characters 0-9)","any non-whitespace character","open matchset","close matchset"];
    var quantifiers = ["*","+","?",
        /{(\d+)}/g, // a{n}
        /{(\d+),}/g, // a{n,}
        /{(\d+),(\d+)}/g, // a{n,m}
        /[+*?]\?/g // a<quant>? - lazy quantification
    ];
    var quantDefs = ["repeated 0 or more times","repeated 1 or more times","repeated once or not at all","repeated exactly $1 time","repeated $1 or more times","repeated between $1 and $2 times","matched lazily (as few times as possible)"];
    var escaped = ["\t","\n","\r","\.","\*","\\","\^","\?","\|"];
    var escapedDefs = ["a tab","a linefeed","a carriage return","a period","an asterisk","a backslash","a caret","a question mark","a vertical bar"];
    // code to split body based on entries in classes, quantifiers, and escaped.
}
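The "$1" and "$2" placeholders in quantDefs suggest the descriptions are meant to be filled in with String.prototype.replace. A minimal sketch of that lookup, assuming the parallel arrays are paired by index; describeQuantifier is a hypothetical helper name, and the patterns are anchored and non-global here (an assumption, not the original code) so that they test a single token:

```javascript
// Hypothetical helper: look up the description for one quantifier token.
// The two arrays are paired by index, as in the question.
var quantifiers = ["*", "+", "?",
    /^\{(\d+)\}$/,       // a{n}
    /^\{(\d+),\}$/,      // a{n,}
    /^\{(\d+),(\d+)\}$/  // a{n,m}
];
var quantDefs = [
    "repeated 0 or more times",
    "repeated 1 or more times",
    "repeated once or not at all",
    "repeated exactly $1 time",
    "repeated $1 or more times",
    "repeated between $1 and $2 times"
];

function describeQuantifier(token) {
    for (var i = 0; i < quantifiers.length; i++) {
        var q = quantifiers[i];
        if (typeof q === "string") {
            if (q === token) return quantDefs[i];
        } else if (q.test(token)) {
            // replace() substitutes $1/$2 with the captured numbers.
            return token.replace(q, quantDefs[i]);
        }
    }
    return null; // not a known quantifier
}

console.log(describeQuantifier("{2,5}")); // "repeated between 2 and 5 times"
```

The patterns are kept non-global on purpose: a RegExp with the g flag remembers lastIndex between test() calls, which can make repeated lookups fail unexpectedly.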
Ideally, the function (let's call it splitR) would return output like this:
> splitR("hello",["he","l"]);
["he", "l", "l", "o"]
> splitR("hello",["he"]);
["he", "llo"]
> splitR("hello",["he","o"]);
["he", "ll", "o"]
> splitR("5 is the square root of 25",[/\d+/g,/\w{3,}/g,"of"]);
["5", " is ", "the", " ", "square", " ", "root", " ", "of", " ", "25"]
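To be clear, splitR is meant to be used inside the interpret function, which takes a RegExp and breaks it into its basic components; for example, \d+[0-9]\w*? should be split into ["\d", "+", "[", "0-9", "]", "\w", "*", "?"]. These components are defined separately in the other arrays, using a mix of RegExps (e.g. /{(\d+)}/g to find a{n}) and strings (e.g. ".").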
Really, what I'm struggling with is defining splitR. Any help is appreciated!
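One possible way to get the outputs listed in the question is to merge all the delimiters into a single alternation and walk the matches, keeping both the matches and the text between them. This is only a sketch under a few assumptions: escapeRegExp is a hypothetical helper, flags on any RegExp delimiters are ignored, matching is leftmost-first with earlier array entries winning ties, and delimiters that contain backreferences are not supported because their group numbers shift once the sources are combined.

```javascript
// Hypothetical helper: escape a literal string for use inside a RegExp.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Sketch of splitR: split str on every delimiter, keeping the delimiters.
function splitR(str, delims) {
    var sources = delims.map(function (d) {
        var src = d instanceof RegExp ? d.source : escapeRegExp(d);
        return "(?:" + src + ")";
    });
    var combined = new RegExp(sources.join("|"), "g");
    var out = [];
    var last = 0;
    var m;
    while ((m = combined.exec(str)) !== null) {
        if (m[0].length === 0) { // skip empty matches to avoid looping forever
            combined.lastIndex++;
            continue;
        }
        if (m.index > last) out.push(str.slice(last, m.index)); // text between matches
        out.push(m[0]);                                         // the delimiter itself
        last = m.index + m[0].length;
    }
    if (last < str.length) out.push(str.slice(last)); // trailing text
    return out;
}

console.log(splitR("hello", ["he", "l"]));
// ["he", "l", "l", "o"]
console.log(splitR("\\d+[0-9]\\w*?", ["\\d", "+", "[", "]", "\\w", "*", "?"]));
// the eight components: \d  +  [  0-9  ]  \w  *  ?
```

Note the doubled backslashes in the second call: they are needed so the strings actually contain a backslash. Using exec in a loop rather than String.prototype.split also avoids the extra array entries that split adds when a delimiter pattern contains capture groups, such as /{(\d+)}/.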
One thing to point out: in your classes array, a string like "\w" will become just "w". If you want to keep the backslash in the string, you need "\\w". – jfriend00
@jfriend00 Ah, thanks! –
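The point raised in the comments is easy to check in a console. A small sketch; the variable names are only for illustration:

```javascript
// In a string literal, an unrecognised escape such as "\w" drops the backslash.
var single = "\w";    // the one-character string "w"
var doubled = "\\w";  // two characters: a backslash followed by "w"
console.log(single.length, doubled.length); // 1 2

// String.raw (ES2015+) keeps backslashes exactly as typed.
var raw = String.raw`\w`;
console.log(raw === doubled); // true
```

The same applies to the entries in the escaped array: "\t" in a string literal is an actual tab character, so matching the two-character sequence as it appears in a pattern's source needs "\\t".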