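Python: How do I ignore non-letters in a string?

This function prints the frequency of each letter in a file, but it does not ignore non-letters. I want only letters to be counted when computing each letter's percentage frequency. This is what I have so far: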

from string import ascii_lowercase as lowercase

def calcFrequencies(file):
    """Enter file name in quotations. Shows the frequency of letters in a file"""
    infile = open(file)
    text = infile.read()
    text = text.lower()

    text_length = len(text)
    counts = [0]*26

    for i in range(26):
        char = lowercase[i]
        counts[i] = 100*text.count(char)/text_length
        print("{:.1f}% of the characters are '{}'".format(counts[i], char))
    infile.close()

2014-01-30 Foflo

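Before counting, you can rebuild the string so that it contains only letters, using the join method with a list comprehension (faster than a generator expression here):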

text = ''.join([char for char in text if char.isalpha()])
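Combined with the question's function, the idea might look like the sketch below. The helper name letter_frequencies is hypothetical, and it takes the text directly instead of a file name so it can be tried without a file on disk.

```python
from string import ascii_lowercase


def letter_frequencies(text):
    """Return {letter: percentage}, counting only alphabetic characters.

    Hypothetical helper for illustration; not part of the original question.
    """
    # Keep letters only, so digits, spaces, and punctuation do not
    # inflate the denominator.
    letters = ''.join([char for char in text.lower() if char.isalpha()])
    total = len(letters)
    if total == 0:
        # Avoid dividing by zero when the text contains no letters.
        return {char: 0.0 for char in ascii_lowercase}
    return {char: 100.0 * letters.count(char) / total
            for char in ascii_lowercase}


freqs = letter_frequencies("Hello, World! 123")
print("{:.1f}% of the letters are 'l'".format(freqs['l']))
```

Note that str.isalpha() also accepts non-ASCII letters such as 'é'. They would count toward the total without appearing among the 26 reported letters, so for text that is not plain English you may prefer to test `char in ascii_lowercase` instead.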

2014-01-30 02:28:40 jayelm

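Downvoter, how can I improve this answer? – jayelm

Not sure why you got a downvote, but it works for me – Foflo

@Foflo Just a suggestion: if you care about speed, I believe mhlester's solution is faster. – jayelm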

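Use filter. (The session below is Python 2, where filter applied to a string returns a string. In Python 3 it returns an iterator, so wrap the call in ''.join(...).)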

>>> text = "abcd1234efg" 
>>> filter(str.isalpha, text) 
'abcdefg'
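For Python 3, a minimal sketch of the same approach would join the filtered characters back into a string. The variable name letters_only is just for illustration.

```python
text = "abcd1234efg"

# In Python 3, filter() returns a lazy iterator rather than a string,
# so join the surviving characters back together.
letters_only = ''.join(filter(str.isalpha, text))
print(letters_only)  # abcdefg
```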

2014-01-30 02:32:00 mhlester

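'filter' is deprecated and best avoided. Use a list comprehension or a loop instead – Abhijit

I didn't know that. Can you share a link? – mhlester

@Abhijit, I haven't found any literature to that effect. Besides, in my tests for this case, filter was 43% faster than the list comprehension. – mhlester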
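The timing claims in this thread are easy to check on your own machine. Here is a rough sketch using timeit, assuming Python 3 and an arbitrary sample string. Results depend on the Python version and the input, so no numbers are shown.

```python
import timeit

# Arbitrary sample input (an assumption, not from the thread).
text = "abcd1234efg" * 1000


def with_list_comprehension():
    return ''.join([char for char in text if char.isalpha()])


def with_filter():
    return ''.join(filter(str.isalpha, text))


# Both approaches should produce the same string.
assert with_list_comprehension() == with_filter()

for func in (with_list_comprehension, with_filter):
    seconds = timeit.timeit(func, number=200)
    print("{}: {:.4f}s".format(func.__name__, seconds))
```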
