2016-11-30

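Hello, I have records like the ones below and want to extract a fixed number of characters after a given string using a Python regular expression.

For example: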

Health Insurance PortabilityNEG Ratio 
Health Insurance PortabilityNEGRatio 
Health Insurance PortabilityNEG NEGRatio 

Here I need to extract PortabilityNEG. I used this regular expression:

Insurance(.{25}).*? 

But I don't want to extract "Insurance". Please let me know how I should write the regular expression.


import re; re.sub(r"(\w+)\s(\w+)\s(\w{0,14})([\w]+)", "\\3", "Health Insurance PortabilityNEGRatio")? – Abdou

Answers

0

This is how you can extract every PortabilityNEG term from the lines you provided.

import re

a = """
Health Insurance PortabilityNEG Ratio
Health Insurance PortabilityNEGRatio
Health Insurance PortabilityNEG NEGRatio
"""
# findall returns only the captured group, so "Insurance" is not in the result
print(re.findall(r'Insurance\s+(PortabilityNEG)', a))

Output:

['PortabilityNEG', 'PortabilityNEG', 'PortabilityNEG'] 
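If the term itself is not known in advance but its length is, a fixed-width capture group may be enough. The following is a minimal sketch, assuming the wanted token is always 14 word characters and follows "Insurance" and some whitespace; the helper name `extract_after` and its parameters are hypothetical and not part of the original answer.

```python
import re

records = """
Health Insurance PortabilityNEG Ratio
Health Insurance PortabilityNEGRatio
Health Insurance PortabilityNEG NEGRatio
"""

def extract_after(text, keyword="Insurance", width=14):
    # Hypothetical helper: capture exactly `width` word characters that
    # follow `keyword` and at least one whitespace character.
    pattern = re.compile(re.escape(keyword) + r"\s+(\w{%d})" % width)
    # With a single capture group, findall returns only the captured text.
    return pattern.findall(text)

result = extract_after(records)
print(result)
```

Because only the parenthesised group is returned, the keyword is used for positioning but never appears in the output.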
0

Since you don't want to mention "Insurance", you can try the following:

import re

# Set up your test string
test_string = """Health Insurance PortabilityNEG Ratio
Health Insurance PortabilityNEGRatio
Health Insurance PortabilityNEG NEGRatio"""

# Set your pattern using regular expression groups:
# two words, then up to 14 word characters, then the rest of the line
pattern = re.compile(r"(\w+)\s(\w+)\s(\w{0,14})([\w ]+)")

# Use sub to replace each whole line with only the third group
print([pattern.sub(r'\3', x) for x in test_string.split("\n")])

# ['PortabilityNEG', 'PortabilityNEG', 'PortabilityNEG']
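Another option, not taken from the answer above, is a lookbehind assertion, which requires "Insurance" to precede the match without making it part of the match. This is a sketch under the assumptions that exactly one whitespace character separates the two words and that the wanted text is 14 word characters long; Python's `re` module only accepts fixed-width lookbehinds, which is why `\s` is used here rather than `\s+`.

```python
import re

records = """Health Insurance PortabilityNEG Ratio
Health Insurance PortabilityNEGRatio
Health Insurance PortabilityNEG NEGRatio"""

# (?<=Insurance\s) asserts that "Insurance" plus one whitespace character
# comes immediately before the match, but is not included in it.
matches = re.findall(r"(?<=Insurance\s)\w{14}", records)
print(matches)
```

Since the pattern has no capture group, `findall` returns the full matches, which here are just the 14 characters after "Insurance ".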

I hope this helps.