Finding a subsequence of a string within a string

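I want to write a function that checks whether one string occurs inside another.
However, the string being searched for may be interrupted by other letters in the main string.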

For example:

a = 'abcde' 
b = 'ace' 
c = 'acb'

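The function in question should report that b is in a, but not c. I have tried set(a).intersection(set(b)), and my problem is that it also reports c as being in a.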
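To illustrate why the set-based attempt cannot distinguish b from c, here is a minimal sketch using the variables from the question (the helper names `shared_b`, `shared_c` and the two boolean flags are only for illustration). Sets discard both order and repetition, so the intersection only says which letters are shared:

```python
a = 'abcde'
b = 'ace'
c = 'acb'

# A set keeps no ordering, so the order of letters in b and c is lost.
shared_b = set(a).intersection(set(b))
shared_c = set(a).intersection(set(c))

# Both checks pass: every letter of b and of c occurs somewhere in a.
b_letters_in_a = shared_b == set(b)
c_letters_in_a = shared_c == set(c)
print(b_letters_in_a, c_letters_in_a)  # True True
```

Any correct solution therefore has to look at positions, not just membership.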

Source

2010-09-09 Anon Not4Chan

Strings of this kind are called subsequences (http://en.wikipedia.org/wiki/Subsequence) of the longer string. – Lazer 2010-09-11 12:59:13

This question is a special case of http://stackoverflow.com/questions/6877249/find-the-number-of-occurrences-of-a-subsequence-in-a-string and the solutions there handle this case as well, and more efficiently. – Amoss 2014-04-06 08:42:43

You can turn your expected sequence into a regular expression:

import re 

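def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
    # Escape each character so regex metacharacters in s1 match literally;
    # re.DOTALL lets ".*" skip over newlines as well.
    pat = ".*".join(re.escape(ch) for ch in s1)
    if re.search(pat, s2, re.DOTALL):
        return True
    return False

# or, more compactly:
def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
    pat = ".*".join(re.escape(ch) for ch in s1)
    return bool(re.search(pat, s2, re.DOTALL))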

a = 'abcde' 
b = 'ace' 
c = 'acb' 

assert sequence_in(b, a) 
assert not sequence_in(c, a)

"ace" gets turned into the regex "a.*c.*e", which finds those three characters in sequence, with possible intervening characters.
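As a quick illustration of how the pattern is built and why it rejects 'acb', here is a small sketch (the names `pat`, `match_b` and `match_c` are just for this example):

```python
import re

# str.join treats the string as a sequence of characters,
# placing ".*" between each adjacent pair.
pat = ".*".join("ace")
print(pat)  # a.*c.*e

# The pattern matches 'abcde': the 'b' and 'd' are absorbed by ".*".
match_b = re.search(pat, "abcde")

# "a.*c.*b" needs a 'b' somewhere after the 'c', which 'abcde' lacks.
match_c = re.search(".*".join("acb"), "abcde")

print(match_b is not None, match_c is not None)  # True False
```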

Source

2010-09-09 02:56:03

Thanks for the prompt reply! – 2010-09-09 03:05:49

How about something like this...

def issubstr(substr, mystr, start_index=0):
    try:
        for letter in substr:
            # str.index raises ValueError when the letter is not found.
            start_index = mystr.index(letter, start_index) + 1
        return True
    except ValueError:
        return False

Or...

def issubstr(substr, mystr, start_index=0):
    for letter in substr:
        # str.find returns -1 when the letter is missing, so the +1 yields 0.
        start_index = mystr.find(letter, start_index) + 1
        if start_index == 0:
            return False
    return True

Source

2010-09-09 04:13:29

I would expect this to run faster than the regex-based answer. Have you timed it? – 2010-09-09 04:19:03

No, I haven't timed it; I just wrote it as an alternative. – 2010-09-09 04:24:53
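For anyone who wants to measure it, a rough timing harness might look like the sketch below. The function names `regex_version` and `find_version` are made up here to keep the two approaches apart, and the inputs are just the strings from the question. Results will vary with the machine, the Python version and the input lengths, so no numbers are claimed.

```python
import re
import timeit

def regex_version(s1, s2):
    # Regex approach: join the characters with ".*" and search.
    return bool(re.search(".*".join(s1), s2))

def find_version(substr, mystr, start_index=0):
    # str.find approach: resume each search just past the previous hit.
    for letter in substr:
        start_index = mystr.find(letter, start_index) + 1
        if start_index == 0:
            return False
    return True

a, b, c = 'abcde', 'ace', 'acb'

for func in (regex_version, find_version):
    # Time one matching and one non-matching input per iteration.
    elapsed = timeit.timeit(lambda: (func(b, a), func(c, a)), number=10_000)
    print(f"{func.__name__}: {elapsed:.4f}s for 10,000 iterations")
```

Short strings like these mostly measure call overhead; longer inputs would give a more meaningful comparison.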

def issubstr(s1, s2): 
    return "".join(x for x in s2 if x in s1) == s1 

>>> issubstr('ace', 'abcde') 
True 

>>> issubstr('acb', 'abcde') 
False
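One caveat: the filter-and-compare approach appears to work only when no character of `s1` also occurs in `s2` outside the subsequence. Below is a minimal sketch of a case where it gives a false negative, next to an iterator-based check (the name `is_subsequence` is mine, not from this thread):

```python
def issubstr(s1, s2):
    # Filter-based version: keep the characters of s2 that occur in s1.
    return "".join(x for x in s2 if x in s1) == s1

def is_subsequence(s1, s2):
    # `ch in it` advances the iterator until it finds ch, so each
    # search resumes where the previous one stopped.
    it = iter(s2)
    return all(ch in it for ch in s1)

# The trailing 'a' survives the filter, giving 'acea' != 'ace'.
print(issubstr('ace', 'abcdea'))        # False
print(is_subsequence('ace', 'abcdea'))  # True
print(is_subsequence('acb', 'abcde'))   # False
```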

Source

2010-09-09 09:52:31 killown

Please explain the blank comments. – 2010-09-09 15:47:41

The question is about finding a subsequence, not a substring. – gizmo 2012-11-09 07:00:25
