2016-04-26
2

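Time complexity of Python's str.index

To find the position of a substring in a string, a naive algorithm takes O(n^2) time in the worst case (more precisely O(n*m) for a string of length n and a pattern of length m). With an efficient algorithm such as KMP, this can be done in O(n) time: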

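s = 'saurabh'
w = 'au'

def get_table():
    # KMP partial-match table for w: for k >= 1, t[k] is the length of the
    # longest proper prefix of w[:k] that is also a suffix of w[:k].
    # t[0] = -1 marks "no fallback, advance the search position".
    t = [-1, 0]
    pos = 2
    cnd = 0
    while pos < len(w):
        if w[pos - 1] == w[cnd]:
            # The current prefix can be extended by one character.
            cnd += 1
            t.append(cnd)
            pos += 1
        elif cnd > 0:
            # Fall back to the next shorter candidate prefix.
            cnd = t[cnd]
        else:
            t.append(0)
            pos += 1
    return t

def get_word():
    # Return the index of the first occurrence of w in s, or -1.
    if not w:
        return 0
    t = get_table()
    i = j = 0
    while i + j < len(s):
        if w[j] == s[i + j]:
            if j == len(w) - 1:
                return i
            j += 1
        elif t[j] > -1:
            # Reuse the part of the match that is still valid.
            i = i + j - t[j]
            j = t[j]
        else:
            # Mismatch on the first pattern character: move one step forward.
            i += 1
    return -1

if __name__ == '__main__':
    print(get_word())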

But if we write 'saurabh'.index('ra'), does it internally use an efficient O(n) algorithm, or does it use the naive O(n^2) algorithm?
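For reference, the naive algorithm the question contrasts KMP with can be sketched as follows. This is only an illustration: `naive_find` is a hypothetical helper name, not anything from Python's implementation, and it also counts character comparisons so the worst case is visible.

```python
def naive_find(text, pattern):
    """Return (index, comparisons); index is -1 if pattern does not occur.

    Hypothetical helper for illustration only: it tries every alignment
    of pattern against text and compares character by character.
    """
    n, m = len(text), len(pattern)
    comparisons = 0
    for i in range(n - m + 1):
        j = 0
        while j < m:
            comparisons += 1
            if text[i + j] != pattern[j]:
                break
            j += 1
        if j == m:
            # Every character of the pattern matched at alignment i.
            return i, comparisons
    return -1, comparisons
```

On 'saurabh' and 'ra' this finds index 3 after only 5 comparisons, but on an input such as 'a' * 10 searched for 'aaaab' every one of the 6 alignments costs 5 comparisons, which is the O(n*m) behavior.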

+0

You could profile it and see whether the time grows linearly or quadratically ;) –

Answers

2

In this article [1], the author walks through the algorithm and explains it. From the article:

The function “fastsearch” is called. It is a mix between 
Boyer-Moore and Horspool algorithms plus couple of neat tricks. 

And from the Wikipedia page on the Boyer–Moore–Horspool algorithm [2]:

The algorithm trades space for time in order to obtain an 
average-case complexity of O(N) on random text, although 
it has O(MN) in the worst case, where the length of the 
pattern is M and the length of the search string is N. 
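To make the quoted description concrete, here is a minimal sketch of a plain Horspool search. It is not CPython's actual "fastsearch" routine, which adds further tricks; `horspool_find` and its shift table are illustrative names chosen here.

```python
def horspool_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1.

    Illustrative Horspool search, not CPython's implementation.
    """
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    # For each character of pattern[:-1], store the distance from its last
    # occurrence to the end of the pattern. Later occurrences overwrite
    # earlier ones, so the smallest (safe) shift is kept.
    shift = {ch: m - 1 - k for k, ch in enumerate(pattern[:-1])}
    i = 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            return i
        # Shift by the table entry for the text character aligned with the
        # last pattern position; unknown characters allow a full shift of m.
        i += shift.get(text[i + m - 1], m)
    return -1
```

The precomputed table is the "space for time" trade mentioned in the quote: on typical text most shifts are close to M, but an input like 'a' * N searched for 'b' + 'a' * (M - 1) forces a shift of 1 at every alignment, which gives the O(MN) worst case.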

Hope that helps!

[1] http://www.laurentluce.com/posts/python-string-objects-implementation

[2] https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore%E2%80%93Horspool_algorithm

+0

But KMP's worst-case time is still linear. Does that mean we should implement KMP ourselves instead of using Python's built-in index() in time-critical code? –

+0

I think this thread has a good answer on that topic: http://programmers.stackexchange.com/questions/183725/which-string-search-algorithm-is-actually-the-fastest – alpert

1

Sometimes you can get a quick answer just by trying it:

>>> import timeit
>>> timeit.timeit('x.index("ra")', setup='x="a"*100+"ra"')
0.4658635418727499 
>>> timeit.timeit('x.index("ra")', setup='x="a"*200+"ra"') 
0.7199222409243475 
>>> timeit.timeit('x.index("ra")', setup='x="a"*300+"ra"') 
0.9555441829046458 
>>> timeit.timeit('x.index("ra")', setup='x="a"*400+"ra"') 
1.1994099491303132 
>>> timeit.timeit('x.index("ra")', setup='x="a"*500+"ra"') 
1.4850994427915793
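The timings above grow by roughly 0.25 s for every extra 100 characters, which is consistent with linear growth on this particular input. That input is not adversarial, though, so it says little about the worst case. The same measurement can be wrapped in a small helper to try other sizes; `time_find` is a hypothetical name, and the default of 1000 repetitions is an arbitrary choice made here.

```python
import timeit

def time_find(sizes, number=1000):
    """Time str.find on inputs shaped like the ones above: 'a' * n + 'ra'.

    Returns a list of (n, seconds) pairs, where seconds is the total time
    for `number` calls. Hypothetical helper for illustration only.
    """
    results = []
    for n in sizes:
        haystack = "a" * n + "ra"
        seconds = timeit.timeit(lambda: haystack.find("ra"), number=number)
        results.append((n, seconds))
    return results
```

Calling time_find([100, 200, 400, 800]) and printing the pairs shows how the time scales as the size doubles; absolute numbers depend on the machine and the Python version.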