Extracting semi-structured information from a string in JavaScript

-4

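"[Paris:location] and [Lyon:location] are in France"

I need to extract all the tagged parts from it ("Paris:location" and "Lyon:location").

I tried this code, which uses a regular expression (RegExp):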

var regexEntity = new RegExp('\[.+:.+\]', 'g'); 

var text = '[Paris:location] and [Lyon:location] are in France'; 
while ((match = regexEntity.exec(text))) { 
    console.log(match); 
}

However, this is the output I get, as if it were only detecting the colon:

[ ':', 
    index: 6, 
    input: '[Paris:location] and [Lyon:location] are in France' ] 
[ ':', 
    index: 26, 
    input: '[Paris:location] and [Lyon:location] are in France' ]

Is there something wrong with my regular expression? Would you use a different approach to get this information?

Source

2016-09-02 Guido García

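First, if you are going to use the constructor, you must write `var regexEntity = new RegExp('\\[.+:.+\\]', 'g');`. With regex literal notation this problem does not exist. Note that `'\[.+:.+\]'` is the same string as `'[.+:.+]'`, a character class that matches exactly one character: `.`, `+` or `:`. Next, `.+` is a greedy subpattern; you can use the lazy `.+?` instead. Then you can add capturing groups. –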

That is why I avoid the `RegExp` constructor. Whenever you build a regex with the RegExp constructor, log the regex before using it. – Tushar
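A small sketch of the logging advice above, assuming a Node.js or browser console. The variable names `fromString`, `fromEscapedString` and `fromLiteral` are illustrative only. Printing each regex's `source` makes the lost backslashes visible:

```javascript
// In a string literal, '\[' is just '[', so the backslashes never reach RegExp.
var fromString = new RegExp('\[.+:.+\]', 'g');
// Doubling the backslashes keeps them in the string passed to the constructor.
var fromEscapedString = new RegExp('\\[.+:.+\\]', 'g');
// A regex literal needs no extra escaping.
var fromLiteral = /\[.+:.+\]/g;

console.log(fromString.source);        // [.+:.+]   (a character class)
console.log(fromEscapedString.source); // \[.+:.+\]
console.log(fromLiteral.source);       // \[.+:.+\]
```

The first pattern is a character class, which is why the question's loop reports only the ':' characters at index 6 and 26.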

May I know why this question has 5 downvotes? I don't know the reason, and I would like to avoid repeating the same mistake. Thanks. –

.+ is greedy, so you need to use the lazy version of it: .+?

Then it is as simple as this:

var text = '[Paris:location] and [Lyon:location] are in France'; 
console.log(text.match(/\[.+?:.+?\]/g));
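If the name and the label are needed separately, the capturing groups mentioned in the comments can be added to the same lazy pattern. This is only a sketch: `extractEntities` and the `name`/`label` keys are hypothetical names, not part of the answer above.

```javascript
// Hypothetical helper: returns [{ name, label }, ...] for every [name:label] tag.
function extractEntities(text) {
    // Lazy quantifiers stop at the first ':' and at the first ']'.
    var regex = /\[(.+?):(.+?)\]/g;
    var entities = [];
    var match;
    while ((match = regex.exec(text)) !== null) {
        entities.push({ name: match[1], label: match[2] });
    }
    return entities;
}

var text = '[Paris:location] and [Lyon:location] are in France';
console.log(extractEntities(text));
// For comparison, the greedy pattern spans both tags in a single match.
console.log(text.match(/\[.+:.+\]/g));
```

With the sample string, the helper yields two entries (Paris and Lyon, both labelled location), while the greedy pattern returns the single match '[Paris:location] and [Lyon:location]'.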

Source

2016-09-02 13:19:14

You could use a regular expression with a non-greedy search and a positive lookahead.

var regex = /\[(.*?)(?=:location)/gi,
    string = '"[Paris:location] and [Lyon:location] are in France"',
    match;

while ((match = regex.exec(string)) !== null) {
    console.log(match[1]);
}
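The pattern above only finds tags whose label is exactly "location". As a rough sketch of the same lookahead idea for arbitrary labels, the variant below swaps the lazy `.*?` for negated character classes. The sample string with its `city` label and the variable names `tagName`, `sample`, `names` and `m` are assumptions made for illustration.

```javascript
// Assumed generalisation: accept any label, not only "location".
var tagName = /\[([^:\]]+)(?=:[^\]]+\])/g;
// Made-up sample input with two different labels.
var sample = '[Paris:location] and [Lyon:city] are in France';
var names = [];
var m;
while ((m = tagName.exec(sample)) !== null) {
    // m[0] is '[' plus the name; the ':label]' part is checked but not consumed.
    names.push(m[1]);
}
console.log(names); // [ 'Paris', 'Lyon' ]
```

Because the lookahead consumes nothing, the label itself is not captured; when the label is needed too, a plain capturing group is the simpler choice.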

Source

2016-09-02 13:17:47
