Python: finding a substring in a string

I want to count the occurrences of a substring within a string in Python, but I need the search to be very specific. Before searching the string I remove all punctuation:

myString.translate(None, string.punctuation)
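The two-argument form of translate shown above is the Python 2 str API. A minimal sketch of the equivalent on Python 3, assuming the input is an ordinary str; the sample text and the name strip_punctuation are made up for illustration:

```python
import string

# Translation table that deletes every ASCII punctuation character.
_DELETE_PUNCT = str.maketrans("", "", string.punctuation)

def strip_punctuation(text):
    """Return text with all characters in string.punctuation removed."""
    return text.translate(_DELETE_PUNCT)

print(strip_punctuation("hello, bob! how are you?"))  # hello bob how are you
```

Note that string.punctuation only covers ASCII, so typographic characters such as a curly apostrophe (U+2019) pass through unchanged.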

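Then I search for the substring. Suppose I am searching for "hello bob", and the string contains the text "hello bob-something else" or "hello bob'" along with other text. When I strip the punctuation, the two characters - and ' are not removed because they are not Unicode characters, so the two strings above should not be counted as occurrences of the phrase "hello bob".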

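I am using the regular-expression code below to try to get the correct number of occurrences. On large files (3000 lines or more) I stopped getting the correct count for my phrases: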

import re

counter = 0
searcher = re.compile("hello bob" + r'([^\w-]|$)').search
with open(myFile, 'r') as source:
    for line in source:
        # count each line that contains the phrase
        if searcher(line):
            counter += 1
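One possible reason for the shortfall on large inputs, offered as a guess rather than a diagnosis: the loop above adds at most 1 per line, so a line that contains the phrase twice is counted once. The sketch below contrasts the two counts on made-up in-memory lines; count_lines and count_occurrences are hypothetical names, and the phrase is additionally passed through re.escape.

```python
import re

phrase = "hello bob"
# Same idea as the pattern above, with the phrase escaped and a non-capturing group.
pattern = re.compile(re.escape(phrase) + r"(?:[^\w-]|$)")

def count_lines(lines):
    """Number of lines with at least one match (what the loop above counts)."""
    return sum(1 for line in lines if pattern.search(line))

def count_occurrences(lines):
    """Total number of matches across all lines."""
    return sum(len(pattern.findall(line)) for line in lines)

sample = ["hello bob said hello bob", "hello bob-something else", "nothing here"]
print(count_lines(sample))        # 1
print(count_occurrences(sample))  # 2
```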

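I also tried something else to get the correct number of occurrences.

I would like to use the re.findall function, because so far it has given me the correct count for the word I entered.

I found this on Stack Overflow: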

re.findall(r'\bword\b', read)

Is there any way I can use a variable instead of the literal word?

For example, I would like to use:

myPhrase = "hello bob" 
re.findall(r'\bmyPhrase\b', read)

which should behave the same as:

re.findall(r'\bhello bob\b', read)


2017-02-13 memoryManagers

Please give an example input and the expected output. –

Look up the information on re.findall() – TallChuck

@juanpa.arrivillaga That would be very difficult, because the code above works in most cases but fails on large text files (3000 lines or more) – memoryManagers

You can solve this with string interpolation, using the following trick.

myphrase = "hello bob" 
pattern = r'\b{var}\b'.format(var = myphrase)
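To show how the formatted pattern plugs into re.findall, here is a minimal sketch. It adds re.escape, which the snippet above does not use, so that characters such as . or ? in the phrase are matched literally. The sample text and the helper name build_pattern are invented for the example.

```python
import re

def build_pattern(phrase):
    """Wrap an arbitrary phrase in word boundaries, escaping regex metacharacters."""
    return r"\b{var}\b".format(var=re.escape(phrase))

read = "hello bob, meet hello bobby; hello bob again"
matches = re.findall(build_pattern("hello bob"), read)
print(len(matches))  # 2
```

The word boundary after the phrase is what keeps "hello bobby" out of the count.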


2017-02-13 04:42:50 Prerit

This works perfectly, thanks – memoryManagers

@memoryManagers You're welcome! :D – Prerit

You can use re.escape(myPhrase) to substitute the variable into the pattern.

import re

read = "hello bob ! how are you?"
myPhrase = "hello bob"
# Escape the phrase so any regex metacharacters in it are matched literally.
my_regex = r"\b" + re.escape(myPhrase) + r"\b"

counter = 0
if re.search(my_regex, read, re.IGNORECASE):
    counter += 1
else:
    print("not found")
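The snippet above reports whether the phrase occurs at all. For the count the question asks for, and to skip cases like "hello bob-something" or "hello bob'" (which \b alone accepts, since - and ' are non-word characters), a sketch with lookarounds might look like the following. The function name count_phrase and the set of excluded characters are assumptions drawn from the question's examples.

```python
import re

def count_phrase(phrase, text):
    """Count occurrences of phrase not adjacent to a word character, '-' or an apostrophe."""
    pattern = re.compile(
        r"(?<![\w'-])" + re.escape(phrase) + r"(?![\w'-])",
        re.IGNORECASE,
    )
    return len(pattern.findall(text))

read = "Hello Bob ! hello bob-something else, hello bob' and hello bob."
print(count_phrase("hello bob", read))  # 2
```

Only the first and last occurrences are counted here; the two followed by - and ' are rejected by the negative lookahead.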


2017-02-13 04:49:36
