Regex: getting content from a URL

I want to extract "the-game" from URLs like the ones below using a regular expression:

http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/another-one/
http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/
http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/

2010-04-22 FarazShuja

Which language are you using? – 2010-04-22 20:04:37

I want to use it here: http://www.movabletype.org/documentation/appendices/modifiers/regex-replace.html – FarazShuja 2010-04-22 20:22:57

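// Skip four "text followed by a slash" chunks, then capture up to the next slash.
var myregexp = /^(?:[^\/]*\/){4}([^\/]+)/;
var match = myregexp.exec(subject); // subject holds the URL string
var result;
if (match != null) {
    result = match[1];
} else {
    result = "";
}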

This matches the text between the fourth and fifth slashes and stores it in the variable result.
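For comparison, here is a minimal Python sketch of the same idea. It assumes the URLs always have the scheme://host/language/segment/ shape shown in the question; the function name segment_after_fourth_slash is made up for illustration.

```python
import re

# Skip four "text followed by a slash" chunks, then capture up to the next slash.
FOURTH_TO_FIFTH = re.compile(r'^(?:[^/]*/){4}([^/]+)')


def segment_after_fourth_slash(url):
    """Return the text between the 4th and 5th slash, or '' if there is none."""
    match = FOURTH_TO_FIFTH.match(url)
    return match.group(1) if match else ''


urls = [
    'http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/another-one/',
    'http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/',
    'http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/',
]

for url in urls:
    print(segment_after_fourth_slash(url))
```

Counting slashes from the left means the two slashes in "http://" are the first and second, the slash after the host name is the third, and the slash after "en" is the fourth.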


2010-04-22 20:58:33

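Nice... I was thinking the same thing, but I didn't write it up as an answer – dlamotte 2010-04-22 22:23:03

Reading from the left, I am just looking for any text between the fourth and fifth slash (/). – FarazShuja 2010-04-23 05:25:49

Ah, you beat me to it with the update! Amazing how much a little clarification of the requirements helps :) – BenV 2010-04-23 14:22:32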

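Which parts of the URL can vary, and which parts are fixed? The following regex always matches whatever sits between "/en/" and the next slash, which is "the-game" in your examples.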

(?<=/en/).*?(?=/)

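This one matches the contents of the second pair of slashes in any URL containing "webdev", assuming the first pair of slashes contains a 2- or 3-character language code.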

(?<=.*?webdev.*?/.{2,3}/).*?(?=/)

Hopefully you can adapt these examples to do what you are looking for.
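As a rough illustration of how the two patterns behave, here is a sketch in Python. One caveat: Python's re module only accepts fixed-width lookbehinds, so the first pattern works as written while the second is rejected at compile time. The capture-group rewrite below is an assumed equivalent, not part of the original answer.

```python
import re

url = 'http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/'

# First pattern: the lookbehind "/en/" has a fixed width, so re accepts it.
after_en = re.search(r'(?<=/en/).*?(?=/)', url)
print(after_en.group(0))

# Second pattern: the lookbehind has a variable width, which re refuses to compile.
try:
    re.compile(r'(?<=.*?webdev.*?/.{2,3}/).*?(?=/)')
    lookbehind_accepted = True
except re.error:
    lookbehind_accepted = False
print(lookbehind_accepted)

# Assumed equivalent using a capture group instead of a lookbehind:
# "webdev", the rest of the host, a 2-3 character language code, then the segment.
by_group = re.search(r'webdev[^/]*/[^/]{2,3}/([^/]*)/', url)
print(by_group.group(1))
```

Engines that do allow variable-width lookbehind, such as .NET, should accept the second pattern unchanged; elsewhere the capture-group form is the more portable choice.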


2010-04-22 22:01:54 BenV

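Reading from the left, I am just looking for any text between the 4th and 5th slash (/). – FarazShuja 2010-04-23 05:24:44

You should probably use some kind of URL parsing library rather than resorting to a regular expression.

In Python: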

from urllib.parse import urlparse  # Python 3; in Python 2 this was "from urlparse import urlparse"
url = urlparse('http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/another-one/')
print(url.path)

This yields:

/en/the-game/another-one/another-one/another-one/

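From there you can do simple things, such as stripping /en/ from the start of the path. Otherwise you are bound to get something wrong with a regex. Don't reinvent the wheel!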
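One way to finish the job, sketched with Python 3's urllib.parse: split the parsed path on slashes and pick the segment by position rather than stripping /en/ by hand. The helper name path_segments is hypothetical, and the sketch assumes the language code is always the first path segment.

```python
from urllib.parse import urlparse


def path_segments(url):
    """Return the non-empty path segments of a URL, in order."""
    return [part for part in urlparse(url).path.split('/') if part]


url = 'http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/another-one/'
segments = path_segments(url)

# segments[0] is the language code, segments[1] is the part the question asks for.
print(segments[1])
```

Because the host name is parsed separately, this keeps working if the number of dots in the domain or the URL scheme changes.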


2010-04-22 22:27:54 dlamotte
