2013-09-24 111 views

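Matching incompletely qualified paths

Say I have a series of incompletely qualified paths, each missing some of its parts, but guaranteed to have two properties:

  1. The last part of the incompletely qualified path and of the fully qualified path will be exactly equal, and
  2. The order of the parts in the incompletely qualified path will match the actual order of those parts in the fully qualified path.

For example,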

p1 = '/foo/baz/myfile.txt' 
p2 = '/bar/foo/myfile.txt' 
actual = '/foo/bar/baz/myfile.txt' 

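In this case, p1 will match but p2 will not, because in the actual path bar occurs after foo. That is easy enough to check: [actual.split('/').index(part) for part in p1.split('/')] will be an ordered list, but the same comprehension for p2 will not be.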
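
The ordering check described above can be sketched as follows. The helper names index_list and is_strictly_increasing are hypothetical and only serve the illustration. Note that list.index always returns the first occurrence and raises ValueError when a part is absent, so this sketch assumes every part is present and does not handle repeated parts.

```python
def index_list(path, actual):
    """Position in `actual` of the first occurrence of each part of `path`."""
    actual_parts = actual.split('/')
    return [actual_parts.index(part) for part in path.split('/')]


def is_strictly_increasing(indexes):
    """True if every index is larger than the one before it."""
    return all(a < b for a, b in zip(indexes, indexes[1:]))


actual = '/foo/bar/baz/myfile.txt'
print(is_strictly_increasing(index_list('/foo/baz/myfile.txt', actual)))
# True
print(is_strictly_increasing(index_list('/bar/foo/myfile.txt', actual)))
# False
```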

But what happens if parts of the path are repeated?

p1 = '/foo/bar/bar/myfile.txt' 
p2 = '/bar/bar/baz/myfile.txt' 
actual = '/foo/bar/baz/bar/myfile.txt' 

How can I make sure that p1 matches but p2 does not (because, although baz occurs after the first bar, it does not occur after the second)?

Answers

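def match(path, actual):
    path = path.strip('/').split('/')
    # A single iterator over the actual parts: each search resumes where the
    # previous one stopped, so the parts must be found in order.
    actual = iter(actual.strip('/').split('/'))
    for pathitem in path:
        for item in actual:
            if pathitem == item:
                break
        else:
            # The inner for-loop never hit break, so pathitem was not found
            return False
    return True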

q1 = '/foo/baz/myfile.txt' 
q2 = '/bar/foo/myfile.txt' 
p1 = '/foo/bar/bar/myfile.txt' 
p2 = '/bar/bar/baz/myfile.txt' 
actual = '/foo/bar/baz/bar/myfile.txt' 

print(match(q1, actual)) 
# True 

print(match(q2, actual)) 
# False 

print(match(p1, actual)) 
# True 

print(match(p2, actual)) 
# False 

I found this approach simple. For my purposes I had to re-use it (iterating over a large number of candidate 'actual' paths), but it works. – 2rs2ts
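
One possible way to re-use the same idea across many candidate paths is sketched below. The names is_subsequence and find_matches and the candidate list are hypothetical, made up for the illustration. The sketch relies on the fact that an `in` test against an iterator consumes it up to the first match, which gives the same in-order behaviour as the nested loops.

```python
def is_subsequence(path, actual):
    """True if the parts of `path` appear in `actual` in the same order."""
    remaining = iter(actual.strip('/').split('/'))
    # `part in remaining` consumes the iterator up to and including the first
    # match, so each later part is only searched for after the previous match.
    return all(part in remaining for part in path.strip('/').split('/'))


def find_matches(path, candidates):
    """Return the candidate paths that `path` could be an abbreviation of."""
    return [c for c in candidates if is_subsequence(path, c)]


# Hypothetical candidate list, for illustration only.
candidates = [
    '/foo/bar/baz/myfile.txt',
    '/foo/bar/baz/bar/myfile.txt',
    '/bar/foo/other.txt',
]
print(find_matches('/foo/bar/bar/myfile.txt', candidates))
# ['/foo/bar/baz/bar/myfile.txt']
```

A fresh iterator is created for every candidate, so the function can be called repeatedly without any state leaking between calls.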


Method 1: Using list.index

def match(strs, actual):
    seen = {}  # last matched index for each part seen so far
    act = actual.split('/')
    for x in strs.split('/'):
        # If the part was already seen, start the search just after its
        # previous matched index; otherwise search from the beginning.
        # list.index raises ValueError if the part is not found.
        ind = act.index(x, seen.get(x, -1) + 1)
        yield ind
        seen[x] = ind

>>> p1 = '/foo/baz/myfile.txt' 
>>> p2 = '/bar/foo/myfile.txt' 
>>> actual = '/foo/bar/baz/myfile.txt' 
>>> list(match(p1, actual))  #ordered list, so matched 
[0, 1, 3, 4] 
>>> list(match(p2, actual))  #unordered list, not matched 
[0, 2, 1, 4] 

>>> p1 = '/foo/bar/bar/myfile.txt' 
>>> p2 = '/bar/bar/baz/myfile.txt' 
>>> actual = '/foo/bar/baz/bar/myfile.txt' 
>>> list(match(p1, actual))  #ordered list, so matched 
[0, 1, 2, 4, 5] 
>>> list(match(p2, actual))  #unordered list, not matched 
[0, 2, 4, 3, 5] 
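
Method 1 leaves it to the caller to check that the yielded indexes are ordered, and it only offsets the search for parts it has seen before. A possible variant, sketched here under the assumption that a plain boolean result is wanted, passes the position after the previous match as the start argument of list.index for every part. The name match_in_order is hypothetical.

```python
def match_in_order(path, actual):
    """True if the parts of `path` occur in `actual` in the same order."""
    act = actual.split('/')
    start = 0  # first position in `act` that is still available for matching
    for part in path.split('/'):
        try:
            # Search only from `start` onwards, then move past the match.
            start = act.index(part, start) + 1
        except ValueError:
            # The part does not occur after the previous match.
            return False
    return True


actual = '/foo/bar/baz/bar/myfile.txt'
print(match_in_order('/foo/bar/bar/myfile.txt', actual))
# True
print(match_in_order('/bar/bar/baz/myfile.txt', actual))
# False
```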

Method 2: Using defaultdict and deque

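from collections import defaultdict, deque

def match(strs, actual):
    # Map each part of the actual path to a queue of the positions where it occurs.
    indexes_act = defaultdict(deque)
    for i, k in enumerate(actual.split('/')):
        indexes_act[k].append(i)
    prev = float('-inf')
    for item in strs.split('/'):
        positions = indexes_act[item]
        # Occurrences at or before the previous match can no longer be used.
        while positions and positions[0] <= prev:
            positions.popleft()
        if not positions:
            # The part is missing or only occurs before the previous match.
            raise ValueError("Invalid string")
        prev = positions.popleft()
        yield prev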

Demo:

>>> p1 = '/foo/baz/myfile.txt' 
>>> p2 = '/bar/foo/myfile.txt' 
>>> actual = '/foo/bar/baz/myfile.txt' 
>>> list(match(p1, actual)) 
[0, 1, 3, 4] 
>>> list(match(p2, actual)) 
    ... 
    raise ValueError("Invalid string") 
ValueError: Invalid string 

>>> p1 = '/foo/bar/bar/myfile.txt' 
>>> p2 = '/bar/bar/baz/myfile.txt' 
>>> actual = '/foo/bar/baz/bar/myfile.txt' 
>>> list(match(p1, actual)) 
[0, 1, 2, 4, 5] 
>>> list(match(p2, actual)) 
    ... 
    raise ValueError("Invalid string") 
ValueError: Invalid string 

Thanks: I didn't know I could pass an offset to '.index()'. – 2rs2ts