Retrieving a string with a regular expression

Regular expression to retrieve the last part of a string

https://play.google.com/store/apps/details?id=com.lima.doodlejump

I am looking to retrieve the string that follows id=.

The following regular expression does not seem to work in Python:

import re
sampleURL = "https://play.google.com/store/apps/details?id=com.lima.doodlejump"
re.search("id=(.*?)", sampleURL).group(1)

The above should give me the output:

com.lima.doodlejump

Is my search group right?

Source

2013-11-27 Siddharthan Asokan

I'm surprised that nobody has mentioned urlparse yet...

>>> s = "https://play.google.com/store/apps/details?id=com.lima.doodlejump" 
>>> urlparse.urlparse(s) 
ParseResult(scheme='https', netloc='play.google.com', path='/store/apps/details', params='', query='id=com.lima.doodlejump', fragment='') 
>>> urlparse.parse_qs(urlparse.urlparse(s).query) 
{'id': ['com.lima.doodlejump']} 
>>> urlparse.parse_qs(urlparse.urlparse(s).query)['id'] 
['com.lima.doodlejump'] 
>>> urlparse.parse_qs(urlparse.urlparse(s).query)['id'][0] 
'com.lima.doodlejump'

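The huge advantage here is that if the URL query string gains more components, the other solutions, which rely on a simple str.split, can easily break. It won't confuse urlparse, though :).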
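The session above uses the Python 2 urlparse module. In Python 3 the same functions live in urllib.parse. Here is a minimal sketch of that variant; the helper name get_query_param and the extra hl and gl parameters in the URL are made up for illustration and are not part of the question.

```python
from urllib.parse import urlparse, parse_qs


def get_query_param(url, name):
    """Return the first value of query parameter `name`, or None if absent."""
    values = parse_qs(urlparse(url).query).get(name)
    return values[0] if values else None


# Hypothetical URL with extra parameters before and after id (an assumption).
url = "https://play.google.com/store/apps/details?hl=en&id=com.lima.doodlejump&gl=US"

print(get_query_param(url, "id"))   # com.lima.doodlejump
print(url.split("id=", 1)[-1])      # com.lima.doodlejump&gl=US
```

With the extra parameters present, the plain split drags along the trailing &gl=US, while the parsed query string still returns only the id value.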

Source

2013-11-27 01:11:31 mgilson

+1 This is the right way to do this; it has to be accepted. – thefourtheye

@thefourtheye - I thought so :). – mgilson

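Your regular expression

(.*?)

will not work, because it matches between zero and unlimited times, as few times as possible (because of the ?). With nothing after it in the pattern, the lazy group is satisfied by the empty string. So you have the following choices of regular expression:

(.*)  # Matches the rest of the string
(.*?)$ # Matches till the end of the string

But you don't need a regular expression at all here; simply split the string like this: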

data = "https://play.google.com/store/apps/details?id=com.lima.doodlejump"
print(data.split("id=", 1)[-1])

Output

com.lima.doodlejump


If you really have to use a regular expression, you can do it like this:

import re
data = "https://play.google.com/store/apps/details?id=com.lima.doodlejump"
print(re.search("id=(.*)", data).group(1))

Output

com.lima.doodlejump
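One caveat with a greedy capture is that it takes everything to the end of the string, and re.search returns None when id= is missing, so calling .group(1) on it raises AttributeError. A possible variation, not something this answer proposes, is to stop the capture at the next & or # and to check the match first. The function name extract_id and the second and third URLs are assumptions for illustration.

```python
import re


def extract_id(url):
    """Return the text after 'id=' up to the next '&' or '#', or None."""
    # [?&] requires id to start a query parameter, so 'uid=' is not matched.
    match = re.search(r"[?&]id=([^&#]+)", url)
    return match.group(1) if match else None


print(extract_id("https://play.google.com/store/apps/details?id=com.lima.doodlejump"))
# com.lima.doodlejump
print(extract_id("https://play.google.com/store/apps/details?id=com.lima.doodlejump&hl=en"))
# com.lima.doodlejump
print(extract_id("https://play.google.com/store/apps"))
# None
```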

Source

2013-11-27 00:59:09 thefourtheye

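I'd use 'data.rsplit("id=", 1)[-1]' unless I had already confirmed that 'id=' is in the string. Well, that or 'split', depending on the behavior I want when 'id=' occurs more than once. But the OP should think about the desired behavior; what matters is using [-1] to avoid an IndexError if it is not present. –

@PeterDeGlopper Yeah. That's better. Updated it :) Thanks :) – thefourtheye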

Just split it at the place you want:

id = url.split('id=')[1]

If you print id, you get:

com.lima.doodlejump

Regex isn't needed here :)

However, if there are multiple id= in your string and you only want the last one:

id = url.split('id=')[-1]
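To make the difference between the two indexes concrete, here is a small sketch of how split behaves when id= appears once, twice, or not at all. The no_id and two_ids strings are made-up inputs, not URLs from the question.

```python
url = "https://play.google.com/store/apps/details?id=com.lima.doodlejump"
no_id = "https://play.google.com/store/apps"          # hypothetical: no id=
two_ids = "https://example.com/?id=first&id=second"   # hypothetical: two id=

print(url.split("id=")[1])        # com.lima.doodlejump
print(two_ids.split("id=")[1])    # first&
print(two_ids.split("id=")[-1])   # second

# Without id= the list has a single element: [-1] returns the whole
# string unchanged, while [1] raises IndexError.
print(no_id.split("id=")[-1])     # https://play.google.com/store/apps
try:
    no_id.split("id=")[1]
except IndexError:
    print("id= not found")
```

So [-1] never raises, but it also silently returns the full URL when id= is absent, which is worth checking for if that case can occur.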

Hope this helps!

Source

2013-11-27 00:59:21 aIKid

This works:

>>> import re 
>>> sampleURL = "https://play.google.com/store/apps/details?id=com.lima.doodlejump" 
>>> re.search("id=(.+)", sampleURL).group(1) 
'com.lima.doodlejump' 
>>>

Instead of capturing zero or more characters non-greedily, this code greedily captures one or more.
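A quick way to see the difference is to run the patterns discussed in this thread against the same URL. This is only an illustrative sketch; the variable names and the empty_value URL are assumptions.

```python
import re

sample_url = "https://play.google.com/store/apps/details?id=com.lima.doodlejump"

# The question's pattern first, then the fixes suggested in the answers.
patterns = ["id=(.*?)", "id=(.*)", "id=(.*?)$", "id=(.+)"]
results = {p: re.search(p, sample_url).group(1) for p in patterns}

for pattern, captured in results.items():
    print(f"{pattern!r:14} -> {captured!r}")

# .* and .+ differ only when nothing follows id= (hypothetical URL).
empty_value = "https://example.com/?id="
print(re.search("id=(.*)", empty_value).group(1) == "")  # True
print(re.search("id=(.+)", empty_value) is None)         # True
```

Only the lazy pattern with nothing after it captures the empty string; the other three capture com.lima.doodlejump.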

Source

2013-11-27 01:02:25 iCodez
