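Python: replace URLs in a string with page titles
I want to remove the URLs from a string and replace each one with the title of the page it points to.
For example: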
mystring = "Ah I like this site: http://www.stackoverflow.com. Also I must say I like http://www.digg.com"
sanitize(mystring) # it becomes "Ah I like this site: Stack Overflow. Also I must say I like Digg - The Latest News Headlines, Videos and Images"
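To look up the title for a URL, I wrote this snippet:
# Python 2 with BeautifulSoup 3
import urllib
import BeautifulSoup

# get_title: string -> string
def get_title(url):
    """Returns the title of the page at the input URL"""
    output = BeautifulSoup.BeautifulSoup(urllib.urlopen(url))
    return output.title.string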
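Somehow I need to apply this function to every URL found in the string, so that each one is replaced by the title that get_title returns for it.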
url = re.compile("http:\/\/(.*?)/")
text = url.sub(get_title, text)
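Two things get in the way of the attempt above: `re.sub` calls its replacement function with a match object rather than a string, and the pattern only ends a match at the next `/`, so it does not stop at the end of a URL like the ones in the example. The following is a minimal sketch of one possible approach, not a definitive solution. It is written for Python 3, and the names `URL_RE` and `_replace`, the URL pattern, and the choice to pass the title lookup in as a parameter are all assumptions made for illustration.

```python
import re

# Assumed pattern: an http/https URL runs until the next whitespace.
URL_RE = re.compile(r"https?://\S+")

def sanitize(text, get_title):
    """Replace each URL in text with the string returned by get_title(url)."""
    def _replace(match):
        raw = match.group(0)  # re.sub passes a match object, not a string
        # Treat trailing sentence punctuation as text, not as part of the URL.
        url = raw.rstrip(".,;:!?)")
        trailing = raw[len(url):]
        try:
            return get_title(url) + trailing
        except Exception:
            # If the lookup fails for any reason, leave the URL unchanged.
            return raw
    return URL_RE.sub(_replace, text)
```

Passing the lookup as an argument means the substitution logic can be tried without network access, for example with a dictionary of made-up titles: `sanitize(mystring, {"http://www.digg.com": "Digg"}.__getitem__)` replaces the Digg URL and leaves the other one as it was.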
And what is your question? – msw 2010-05-08 17:28:25
I've updated the question, sorry :) – Hellnar 2010-05-08 17:30:06