Get a list of links from a string

I need help with a regular expression in Python. I have a string like this:

17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;127.0.0.1 2013-10-19 
17:22:32;http://example2.com;example2.com;127.0.0.1 2013-10-19 
20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;127.0.0.1 2013-10-19

How can I get this list?

['http://example1.com/viewtopic.php?f=8&t=189', 'http://example2.com', 'http://example3.com/threads/example-text-in-url.27304/']

Source

2013-10-19 valera5505

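I'm going to give a regex solution, since that is what you asked for. Basically, all you need to do is capture the text between http:// and the next ;. Here is a demonstration: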

from re import findall

mystr = """
17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;127.0.0.1 2013-10-19
17:22:32;http://example2.com;example2.com;127.0.0.1 2013-10-19
20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;127.0.0.1 2013-10-19
"""

# Non-greedy capture of everything from "http://" up to the next ";"
print(findall(r"(http://.+?);", mystr))

Output:

['http://example1.com/viewtopic.php?f=8&t=189', 'http://example2.com', 'http://example3.com/threads/example-text-in-url.27304/']
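If some links might use https, or a URL could sit at the end of a line with no trailing semicolon, one possible variant of the same idea is to match up to the next semicolon or whitespace instead. This is only a sketch: the https case and the `extract_links` name are assumptions for illustration, not part of the question.

```python
import re

# Sample lines mirroring the ones in the question.
log = """
17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;127.0.0.1 2013-10-19
17:22:32;http://example2.com;example2.com;127.0.0.1 2013-10-19
20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;127.0.0.1 2013-10-19
"""

def extract_links(text):
    # https? accepts both schemes; [^;\s]+ stops at the next semicolon or whitespace.
    return re.findall(r"https?://[^;\s]+", text)

links = extract_links(log)
print(links)
```

Because the character class excludes the delimiter itself, no capture group or non-greedy quantifier is needed.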

Source

2013-10-19 18:41:00 iCodez

You don't need a regular expression here; you can use a csv parser.

Assuming your data is in a file named data.csv:

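import csv
with open("data.csv", newline="") as f:  # newline="" as the csv docs recommend
    reader = csv.reader(f, delimiter=";")
    # the URL is the second semicolon-separated field
    referers = [line[1] for line in reader]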
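To try the csv approach without creating a file first, the same reader can be fed from an in-memory string. This is a minimal sketch, assuming the three sample lines from the question; `sample` is just an illustrative name.

```python
import csv
import io

# The three sample lines from the question, held in memory instead of data.csv.
sample = (
    "17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;127.0.0.1 2013-10-19\n"
    "17:22:32;http://example2.com;example2.com;127.0.0.1 2013-10-19\n"
    "20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;127.0.0.1 2013-10-19\n"
)

# csv.reader accepts any iterable of lines, so a StringIO behaves like an open file.
reader = csv.reader(io.StringIO(sample), delimiter=";")
# Skip empty rows and take the second field, which holds the URL.
referers = [row[1] for row in reader if row]
print(referers)
```

The `if row` guard is there only so that a stray blank line does not raise an IndexError.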

Source

2013-10-19 18:31:04

Just try this. Maybe it fits your needs :)

Regex

/^(.*;)/gm

String

17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;127.0.0.1 2013-10-19 
17:22:32;http://example2.com;example2.com;127.0.0.1 2013-10-19 
20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;127.0.0.1 2013-10-19

Matches

1. [0-66] `17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;` 
2. [87-129] `17:22:32;http://example2.com;example2.com;` 
3. [151-228] `20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;`
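For reference, the `/gm` flags have a rough Python counterpart in `re.findall` with `re.MULTILINE`. The sketch below, which assumes the same three sample lines, suggests that the pattern is greedy and captures everything up to the last semicolon on each line, so the URL still has to be split out of each match.

```python
import re

text = (
    "17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;127.0.0.1 2013-10-19\n"
    "17:22:32;http://example2.com;example2.com;127.0.0.1 2013-10-19\n"
    "20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;127.0.0.1 2013-10-19\n"
)

# re.MULTILINE makes ^ match at the start of every line, like the m flag.
matches = re.findall(r"^(.*;)", text, flags=re.MULTILINE)
# Each match has the form "time;url;host;", so the URL is the second field.
urls = [m.split(";")[1] for m in matches]
print(urls)
```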

Source

2013-10-19 18:34:43

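While a link to the test is handy, it's a good idea to put your regex in the answer too. Links expire, for example, and if the link breaks, your answer will no longer be useful. – DSM

You're absolutely right! Thanks..! –

The “{1}” is completely redundant. – tripleee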
