import re

reg = re.compile(r'\s*(\D+?)\s*(\d+)'   # first team name, then its score
                 r'[,;:.#@\s]*'         # separator: punctuation and/or whitespace
                 r'(\D+?)\s*(\d+)'      # second team name, then its score
                 r'\s*')

for s in ('Miami 0, New England 28',
          'Miami0,New England28 ',
          ' Miami 0 . New England28',
          'Miami 0 ; New England 28',
          'Miami0#New England28 ',
          ' Miami 0 @ New England28'):
    print(reg.search(s).groups())
Result
('Miami', '0', 'New England', '28')
('Miami', '0', 'New England', '28')
('Miami', '0', 'New England', '28')
('Miami', '0', 'New England', '28')
('Miami', '0', 'New England', '28')
('Miami', '0', 'New England', '28')
'\D' means 'non-digit': it matches any character that is not a decimal digit.
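If you want the scores as integers rather than strings, the same pattern can be wrapped in a small function. This is only a sketch: the names `SCORE_RE` and `parse_score` and the group names are made up for illustration, and the separator class is an assumption that covers just the sample strings above.

```python
import re

# Hypothetical helper, not part of the original answer.
# The separator class below is an assumption based on the sample inputs.
SCORE_RE = re.compile(
    r'\s*(?P<team1>\D+?)\s*(?P<score1>\d+)'  # lazy \D+? stops before trailing spaces
    r'[,;:.#@\s]*'                           # punctuation and/or whitespace between the halves
    r'(?P<team2>\D+?)\s*(?P<score2>\d+)'
)

def parse_score(line):
    """Return (team1, score1, team2, score2), or None if the line does not match."""
    m = SCORE_RE.search(line)
    if m is None:
        return None
    return (m.group('team1'), int(m.group('score1')),
            m.group('team2'), int(m.group('score2')))

print(parse_score('Miami0#New England28 '))
```

Returning None on a failed match avoids the AttributeError that `reg.search(s).groups()` would raise for a line containing no score.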
What piece of code? You haven't shown us any code there. Can you post the code you are using to parse the text? – mgilson 2013-02-14 22:38:30
Yes, regular expressions are very fast. – eyquem 2013-02-14 23:25:18