2016-03-25 168 views
3

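Python regex: converting an array of strings to an array of floats

I am a beginner with Python regular expressions. I achieved what I needed, but because of my lack of experience the result is really ugly. My goal is to convert an array of strings of the form: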

notes = ["10.0% higher", "5.0% lower", "Same as", "21.2% lower"] 

into an array of floats, so that the array above yields:

changes = [10.0,-5.0,0,-21.2] 

The code below achieves this, but it is really repetitive and bad style. How can I improve it?

import re

changes = []
for note in notes:
    m = re.search(r"(?:(\d+\.\d+\%\shigher)|(\d+\.\d+\%\slower)|(Same\sas))", note)
    if m:
        if m.groups(0):
            if m.groups(0)[0]:
                changes += [float(re.match(r"(\d+\.\d+)", m.groups(0)[0]).groups(0)[0])]
            elif m.groups(0)[1]:
                changes += [-float(re.match(r"(\d+\.\d+)", m.groups(0)[1]).groups(0)[0])]
            else:
                changes += [0.0]
print(changes)
+1

You should really post this on CodeReview.SE... Also, you can merge the two if statements, `if m:` and `if m.groups(0):`, into one: `if m and m.groups(0):` – Druzion

Answers

1

Using findall you can do this with a single regular expression:

import re

notes = ["10.0% higher", "5.0% lower", "Same as", "21.2% lower"]

changes = []
for note in notes:
    # \s matches the space between "%" and the direction word
    m = re.findall(r"(?:(\d+\.\d+)%\s)?(higher|lower|Same as)", note)
    if m:
        if m[0][1] == 'higher':
            changes += [float(m[0][0])]
        elif m[0][1] == 'lower':
            changes += [-float(m[0][0])]
        else:
            changes += [0.0]

print(changes)
+1

This is the solution I find easiest to understand – niklas
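A possible variation on the single-regex idea, offered only as a sketch: the if/elif chain can be replaced by a lookup table that maps the matched word to a sign. The names `PATTERN`, `SIGN` and `to_change` are hypothetical, and the pattern assumes exactly one whitespace character between `%` and the direction word.

```python
import re

# Hypothetical helper names; the pattern assumes "<number>% <word>" or "Same as".
PATTERN = re.compile(r"(?:(\d+\.\d+)%\s)?(higher|lower|Same as)")
SIGN = {"higher": 1.0, "lower": -1.0, "Same as": 0.0}


def to_change(note):
    """Return the signed percentage encoded in a note string."""
    m = PATTERN.search(note)
    if m is None:
        raise ValueError("unrecognised note: %r" % note)
    number, word = m.groups()
    # number is None when the optional numeric group did not take part in the match
    return SIGN[word] * float(number or 0)


notes = ["10.0% higher", "5.0% lower", "Same as", "21.2% lower"]
changes = [to_change(note) for note in notes]
print(changes)
```

Raising on an unrecognised note is a design choice of this sketch; the loop in the answer silently skips such notes instead.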

1
import re

def get_val(s):
    # Strip everything except digits and the decimal point
    if "higher" in s:
        return float(re.sub(r"[^\d.]", "", s))
    if "lower" in s:
        return -float(re.sub(r"[^\d.]", "", s))
    return 0

notes = ["10.0% higher", "5.0% lower", "Same as", "21.2% lower"]
changes = [get_val(s) for s in notes]
print(changes)

Prints:

[10.0, -5.0, 0, -21.2]

Much faster than a regex (relevant for large inputs, not so much for small ones) is str.translate, shown here with Python 2's string.maketrans:

import string

# Python 2 only: string.maketrans and the two-argument str.translate
all_chars = string.maketrans('', '')
# Every character except digits and the decimal point
non_numeric = all_chars.translate(all_chars, string.digits + '.')

def get_val(s):
    if "higher" in s:
        return float(s.translate(all_chars, non_numeric))
    if "lower" in s:
        return -float(s.translate(all_chars, non_numeric))
    return 0

notes = ["10.0% higher", "5.0% lower", "Same as", "21.2% lower"]
changes = [get_val(s) for s in notes]
print(changes)
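The snippet above depends on Python 2's `string.maketrans` and the two-argument form of `str.translate`. A rough Python 3 counterpart, given as a sketch rather than a drop-in replacement, builds a deletion table with the three-argument `str.maketrans`. The name `_DELETE_TABLE` is hypothetical, and the table covers only ASCII characters, which is an assumption about the input. Unlike the snippet above, it keeps the decimal point so that "10.0" parses as 10.0.

```python
import string

# Assumption: notes contain only ASCII characters.
_KEEP = set(string.digits + ".")
# Third argument of str.maketrans lists the characters to delete.
_DELETE_TABLE = str.maketrans(
    "", "", "".join(chr(i) for i in range(128) if chr(i) not in _KEEP)
)


def get_val(s):
    """Return the signed percentage in s, or 0.0 when there is no direction word."""
    if "higher" in s:
        return float(s.translate(_DELETE_TABLE))
    if "lower" in s:
        return -float(s.translate(_DELETE_TABLE))
    return 0.0


notes = ["10.0% higher", "5.0% lower", "Same as", "21.2% lower"]
changes = [get_val(s) for s in notes]
print(changes)
```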
1
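  • You can put the pattern in a variable and split the groups visually
  • You can match the float string pattern itself and convert it directly
  • You can use `or` to select the group that matched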

Example:

import re 


notes = ["10.0% higher", "5.0% lower", "Same as", "21.2% lower"] 

pattern = (
    r'(?:'
    r'((\d+\.\d+)\%\shigher)|'
    r'((\d+\.\d+)\%\slower)|'
    r'(Same\sas)'
    r')'
)

changes = [] 

for note in notes: 
    gr = re.search(pattern, note).groups() 
    num = float(gr[1] or gr[3] or 0) * (-1 if gr[3] else 1) 
    changes.append(num) 

print(changes) # [10.0, -5.0, 0.0, -21.2] 
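The idea of splitting the pattern visually can also be sketched with `re.VERBOSE`, which lets the regular expression span several lines and carry its own comments. The group names `up` and `down` and the function `to_change` are hypothetical names chosen for this illustration.

```python
import re

# In verbose mode whitespace in the pattern is ignored, so \s is used for the
# literal space, and everything after "#" on a line is a comment.
PATTERN = re.compile(
    r"""
      (?P<up>\d+\.\d+)%\shigher     # a percentage followed by "higher"
    | (?P<down>\d+\.\d+)%\slower    # a percentage followed by "lower"
    | Same\sas                      # no change
    """,
    re.VERBOSE,
)


def to_change(note):
    """Convert one note string into a signed float."""
    m = PATTERN.search(note)
    if m is None:
        raise ValueError("unrecognised note: %r" % note)
    if m.group("up"):
        return float(m.group("up"))
    if m.group("down"):
        return -float(m.group("down"))
    return 0.0


notes = ["10.0% higher", "5.0% lower", "Same as", "21.2% lower"]
changes = [to_change(note) for note in notes]
print(changes)
```

Named groups avoid counting positional indices such as `gr[1]` and `gr[3]`, at the cost of a slightly longer pattern.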
0
#! python3 

notes = ["10.0% higher", "5.0% lower", "Same as", "21.2% lower"] 

def adjustments(notes): 
    for n in notes:
        direction = -1.0 if n.endswith('lower') else 1.0
        offset = 0.0 if n.lower() == 'same as' else float(n.split('%')[0])
        yield offset * direction

changes = list(adjustments(notes))
print(changes) 