In "Adding a single character to add keys in Counter", @AshwiniChaudhary gave a great answer that creates a new Counter object with a different set() function, i.e. a custom counter object with special setting behaviour:
from collections import Counter

class CustomCounter(Counter):
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(u"\uE000"):
            key += u"\uE000"
        super(CustomCounter, self).__setitem__(key, value)
To allow a user-defined character/str to be appended to the key, I tried:
from collections import Counter, defaultdict

class AppendedStrCounter(Counter):
    def __init__(self, str_to_append):
        self._appended_str = str_to_append
        super(AppendedStrCounter, self).__init__()
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(self._appended_str):
            key += self._appended_str
        super(AppendedStrCounter, self).__setitem__(tuple(key), value)
But it returns an empty counter:
>>> class AppendedStrCounter(Counter):
...     def __init__(self, str_to_append):
...         self._appended_str = str_to_append
...         super(AppendedStrCounter, self).__init__()
...     def __setitem__(self, key, value):
...         if len(key) > 1 and not key.endswith(self._appended_str):
...             key += self._appended_str
...         super(AppendedStrCounter, self).__setitem__(tuple(key), value)
...
>>> AppendedStrCounter('foo bar bar blah'.split())
AppendedStrCounter()
That's because I was missing the iter argument in __init__():
from collections import Counter, defaultdict

class AppendedStrCounter(Counter):
    def __init__(self, iter, str_to_append):
        self._appended_str = str_to_append
        super(AppendedStrCounter, self).__init__(iter)
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(self._appended_str):
            key += self._appended_str
        super(AppendedStrCounter, self).__setitem__(tuple(key), value)
[out]:
>>> AppendedStrCounter('foo bar bar blah'.split(), u'\ue000')
AppendedStrCounter({('f', 'o', 'o', '\ue000'): 1, ('b', 'a', 'r', '\ue000'): 1, ('b', 'l', 'a', 'h', '\ue000'): 1})
But the value for 'bar' is wrong: it should be 2, not 1.
Is passing iter to __init__() the right way to initialize the Counter?
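One way to see where the second 'bar' goes, as a minimal sketch that assumes CPython's current behaviour (consistent with the output above, but not something the documentation promises): while counting, the running count is read with the original key and then written through __setitem__ with the transformed key, so the read always misses.

```python
from collections import Counter


class AppendedStrCounter(Counter):
    # Same class as in the question, repeated so the sketch is self-contained.
    def __init__(self, iter, str_to_append):
        self._appended_str = str_to_append
        super(AppendedStrCounter, self).__init__(iter)

    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(self._appended_str):
            key += self._appended_str
        super(AppendedStrCounter, self).__setitem__(tuple(key), value)


c = AppendedStrCounter('foo bar bar blah'.split(), u'\ue000')

# Counts are looked up under the plain word, which is never stored ...
plain_lookup = c.get('bar', 0)                 # 0
# ... but written under the transformed tuple key, so each write stores 0 + 1.
stored_count = c[('b', 'a', 'r', u'\ue000')]   # 1
print(plain_lookup, stored_count)
```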
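You're assuming that the superclass constructor uses `__setitem__` to add each item, but there is no guarantee that it must. https://docs.python.org/2/library/collections.html#collections.Counter only promises how its constructor will behave, not how it is implemented. – amalloy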
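A minimal sketch of one way to avoid depending on how the constructor fills the counter: transform every key first, then hand the transformed keys to a plain Counter. The helper names `append_marker` and `count_with_marker` are hypothetical and not part of the question; the key transformation is copied from the question's `__setitem__`.

```python
from collections import Counter


def append_marker(word, marker):
    """Return the word as a tuple of characters, with the marker appended
    when the word is longer than one character and lacks it."""
    if len(word) > 1 and not word.endswith(marker):
        word += marker
    return tuple(word)


def count_with_marker(words, marker):
    # Hypothetical helper: normalise the keys up front, then let Counter count.
    return Counter(append_marker(w, marker) for w in words)


counts = count_with_marker('foo bar bar blah'.split(), u'\ue000')
print(counts[('b', 'a', 'r', u'\ue000')])  # 2
```

Because Counter only ever sees keys that are already transformed, reads and writes use the same key no matter how counting is implemented internally.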
Take a close look at the referenced answer from @AshwiniChaudhary. In his answer, the count for the key "the" is also 1 instead of 2. – Felix
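Changing the way keys are stored can lead to some nasty surprises... For example, nobody can store a count for 'word\ue000' separately from 'word' in a `CustomCounter`. Also, how do they get at a specific word? Users have to remember to ask for `cc['word\ue000']` whenever they want `cc['word']`, which completely defeats the encapsulation goal of OOP. –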
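A rough sketch of how the lookup concern might be addressed, assuming it is acceptable to wrap a Counter rather than subclass it: the same normalisation is applied on reads and writes, so callers can use the plain word. `MarkedCounter` and its methods are hypothetical names, not something proposed in the thread, and the first caveat still applies: 'word' and 'word\ue000' share one count.

```python
from collections import Counter


class MarkedCounter(object):
    """Hypothetical wrapper: keys are stored with the marker appended,
    but callers read and write using plain words."""

    def __init__(self, words=(), marker=u'\ue000'):
        self._marker = marker
        self._counts = Counter(self._normalize(w) for w in words)

    def _normalize(self, word):
        # Same transformation as the question's __setitem__.
        if len(word) > 1 and not word.endswith(self._marker):
            word += self._marker
        return tuple(word)

    def __getitem__(self, word):
        # Counter returns 0 for keys it has not seen.
        return self._counts[self._normalize(word)]

    def __setitem__(self, word, value):
        self._counts[self._normalize(word)] = value

    def items(self):
        return self._counts.items()


cc = MarkedCounter('foo bar bar blah'.split())
print(cc['bar'])  # 2
```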