2017-04-04 31 views
1

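Creating a custom Counter object with a special set function

In "Adding a single character to add keys in Counter", @AshwiniChaudhary gave a great answer: create a new Counter object with a different set function (__setitem__):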

from collections import Counter 

class CustomCounter(Counter):
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(u"\uE000"):
            key += u"\uE000"
        super(CustomCounter, self).__setitem__(key, value)

To let the user choose the character/str that gets appended to the key, I tried:

from collections import Counter, defaultdict 

class AppendedStrCounter(Counter):
    def __init__(self, str_to_append):
        self._appended_str = str_to_append
        super(AppendedStrCounter, self).__init__()
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(self._appended_str):
            key += self._appended_str
        super(AppendedStrCounter, self).__setitem__(tuple(key), value)

But it returns an empty counter:

>>> class AppendedStrCounter(Counter): 
...  def __init__(self, str_to_append): 
...   self._appended_str = str_to_append 
...   super(AppendedStrCounter, self).__init__() 
...  def __setitem__(self, key, value): 
...   if len(key) > 1 and not key.endswith(self._appended_str): 
...    key += self._appended_str 
...   super(AppendedStrCounter, self).__setitem__(tuple(key), value) 
... 
>>> AppendedStrCounter('foo bar bar blah'.split()) 
AppendedStrCounter() 

That's because I was missing the iter argument in __init__():

from collections import Counter, defaultdict 

class AppendedStrCounter(Counter):
    def __init__(self, iter, str_to_append):
        self._appended_str = str_to_append
        super(AppendedStrCounter, self).__init__(iter)
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(self._appended_str):
            key += self._appended_str
        super(AppendedStrCounter, self).__setitem__(tuple(key), value)

[out]:

>>> AppendedStrCounter('foo bar bar blah'.split(), u'\ue000') 
AppendedStrCounter({('f', 'o', 'o', '\ue000'): 1, ('b', 'a', 'r', '\ue000'): 1, ('b', 'l', 'a', 'h', '\ue000'): 1}) 

The value for 'bar' is wrong: it should be 2, not 1.

Is passing iter to __init__() the correct way to initialize the Counter?

+2

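You're assuming that the superclass constructor uses '__setitem__' to add each item, but there is no guarantee that it must. https://docs.python.org/2/library/collections.html#collections.Counter only promises how its constructor will behave, not how it is implemented. – amalloy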

+2

Take a close look at the referenced answer by @AshwiniChaudhary. In his answer, the count for the 'the' key is also 1 rather than 2. – Felix

+0

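Changing how keys are stored can lead to some nasty surprises... For example, nobody can store a count for 'word\ue000' separately from 'word' in 'CustomCounter'. Also, how do they look up a specific word? The user has to remember to ask for 'cc['word\ue000']' whenever they want 'cc['word']', which completely defeats the OOP goal of encapsulation. –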

Answer

1

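As Felix's comment pointed out, collections.Counter does not document how its __init__ method adds keys or sets values, only that it does. Since it is not explicitly designed for subclassing, the wisest thing to do is not to subclass it.

The collections.abc module exists to provide easily subclassed abstract classes for Python's built-in types, including dict (a MutableMapping, in ABC terms). So, if all you need is "a Counter-like class" (as opposed to "a subclass of Counter that will satisfy built-ins like isinstance and issubclass"), you can create your own MutableMapping that has-a Counter, and then "man-in-the-middle" the initializer and the three methods that Counter adds to a typical dict.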
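A minimal sketch of why relying on an overridden __setitem__ can leave every count at 1. It assumes the counting loop reads the current count under the original key and then writes it back through __setitem__; Counter does not document that it works this way, so this only illustrates one plausible failure mode. SuffixDict and count_words are hypothetical names, not part of collections.

```python
class SuffixDict(dict):
    """Hypothetical stand-in for the question's subclass: stores keys with a suffix."""

    def __setitem__(self, key, value):
        super().__setitem__(key + "*", value)


def count_words(mapping, words):
    # Assumed read-then-write counting loop (not a documented Counter detail).
    for word in words:
        # The read uses the original key, which is never stored,
        # so it always falls back to 0.
        mapping[word] = mapping.get(word, 0) + 1
    return mapping


counts = count_words(SuffixDict(), ["foo", "bar", "bar"])
# "bar" was seen twice, yet the stored count never gets past 1.
assert counts == {"foo*": 1, "bar*": 1}
```

The write lands under the suffixed key while the read looks for the plain one, so the two never meet. Overriding only __setitem__ is not enough even when the constructor happens to call it.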

import collections 
import collections.abc 


def _identity(s): 
    ''' 
    Default mutator function. 
    ''' 
    return s 


class CustomCounter(collections.abc.MutableMapping):
    '''
    Implements the 5 abstract methods of a MutableMapping:
    __getitem__, __setitem__, __delitem__, __iter__, __len__

    ...and the 3 non-Mapping methods of Counter:
    elements, most_common, subtract
    '''

    def __init__(self, values=None, *, mutator=_identity):
        self._mutator = mutator
        if values is None:
            self._counter = collections.Counter()
        else:
            # Mutate every key before the wrapped Counter sees it.
            # Note: a mapping passed here is iterated by key only,
            # so its counts are not carried over.
            values = (self._mutator(v) for v in values)
            self._counter = collections.Counter(values)

    def __getitem__(self, item):
        return self._counter[self._mutator(item)]

    def __setitem__(self, item, value):
        self._counter[self._mutator(item)] = value

    def __delitem__(self, item):
        del self._counter[self._mutator(item)]

    def __iter__(self):
        # Iteration yields the stored (mutated) keys.
        return iter(self._counter)

    def __len__(self):
        return len(self._counter)

    def __repr__(self):
        return ''.join([
            self.__class__.__name__,
            '(',
            repr(dict(self._counter)),
            ')'
            ])

    def elements(self):
        return self._counter.elements()

    def most_common(self, n=None):
        # n=None mirrors the default of collections.Counter.most_common.
        return self._counter.most_common(n)

    def subtract(self, values):
        # Mutate the keys of a mapping, or each element of a plain iterable.
        if isinstance(values, collections.abc.Mapping):
            values = {self._mutator(k): v for k, v in values.items()}
        else:
            values = (self._mutator(v) for v in values)
        self._counter.subtract(values)


def main():
    def mutator(s):
        # Asterisks are easier to print than '\ue000'.
        return '*' + s + '*'

    words = 'the lazy fox jumps over the brown dog'.split()

    # Test None (allowed by collections.Counter).
    ctr_none = CustomCounter(None)
    assert 0 == len(ctr_none)

    # Test typical dict and collections.Counter methods.
    ctr = CustomCounter(words, mutator=mutator)
    print(ctr)
    assert 1 == ctr['dog']
    assert 2 == ctr['the']
    assert 7 == len(ctr)
    del ctr['lazy']
    assert 6 == len(ctr)
    ctr.subtract(['jumps', 'dog'])
    assert 0 == ctr['dog']
    assert 6 == len(ctr)
    ctr.subtract({'the': 5, 'bogus': 100})
    assert -3 == ctr['the']
    assert -100 == ctr['bogus']
    assert 7 == len(ctr)


if "__main__" == __name__: 
    main() 

Output (line-wrapped for readability):

CustomCounter({ 
    '*brown*': 1, 
    '*lazy*': 1, 
    '*the*': 2, 
    '*over*': 1, 
    '*jumps*': 1, 
    '*fox*': 1, 
    '*dog*': 1 
    }) 

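I added a keyword-only argument, mutator, to the initializer. It stores the function that converts real-world inputs into the "mutated" versions that actually get counted. Note that this may mean CustomCounter no longer stores "hashable objects", but rather "hashable objects that don't make the mutator barf".

Also, if the standard library's Counter ever grows new methods, you will have to update CustomCounter to "override" them. (Perhaps you could work around that by using __getattr__ to pass any unknown attribute on to self._counter, but any keys in the arguments would then be handed to the Counter in their raw, "un-mutated" form.)

Finally, as I noted earlier, it is not actually a subclass of collections.Counter, in case other code is specifically looking for one.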
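As a rough illustration of the mutator idea, here is a sketch of a mutator that reproduces the question's rule (append a suffix to keys longer than one character). The helper name make_suffix_mutator is hypothetical, and the counting applies the mutator in a generator fed to a plain collections.Counter, which is what the initializer above does internally.

```python
from collections import Counter


def make_suffix_mutator(suffix):
    """Hypothetical helper: build a mutator that appends `suffix` to longer keys."""
    def mutator(key):
        if len(key) > 1 and not key.endswith(suffix):
            return key + suffix
        return key
    return mutator


mutator = make_suffix_mutator("\ue000")
words = "foo bar bar blah".split()

# Mutate each key first, then let a plain Counter do the counting.
counts = Counter(mutator(w) for w in words)
assert counts["bar\ue000"] == 2

# A hashable key the mutator cannot handle "barfs" before it is ever counted.
try:
    mutator(42)
    mutator_failed = False
except TypeError:
    mutator_failed = True
assert mutator_failed
```

An int is a perfectly valid Counter key, but this particular mutator calls len() on it and raises, which is the restriction described above.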
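The __getattr__ workaround and its drawback might look roughly like the sketch below. DelegatingCounter is a hypothetical, stripped-down wrapper (not the CustomCounter above, and not a MutableMapping), written only to show how forwarded methods receive raw keys.

```python
from collections import Counter


class DelegatingCounter:
    """Hypothetical wrapper: mutates keys on item access, forwards the rest."""

    def __init__(self, mutator):
        self._mutator = mutator
        self._counter = Counter()

    def __getitem__(self, key):
        return self._counter[self._mutator(key)]

    def __setitem__(self, key, value):
        self._counter[self._mutator(key)] = value

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        # Refuse private names to avoid recursing if _counter is not set yet.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._counter, name)


dc = DelegatingCounter(lambda s: "*" + s + "*")
dc["dog"] = 1          # stored under the mutated key '*dog*'
dc.update(["dog"])     # forwarded as-is: counts the raw key 'dog'

assert dc["dog"] == 1                               # the mutated count is unchanged
assert sorted(dc._counter) == ["*dog*", "dog"]      # two separate entries now exist
```

The forwarded update call never passes through the mutator, so the wrapped Counter ends up holding both mutated and raw keys.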
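A small sketch of what that last caveat means for type checks. Wrapper is a hypothetical skeleton standing in for a MutableMapping-based class such as the one above; its abstract methods are omitted, so it is only inspected here and never instantiated.

```python
import collections
import collections.abc


class Wrapper(collections.abc.MutableMapping):
    """Hypothetical skeleton of a has-a-Counter class (abstract methods omitted)."""


# Code that checks specifically for Counter will reject the wrapper...
is_counter = issubclass(Wrapper, collections.Counter)

# ...while code that checks for the MutableMapping ABC accepts both.
is_mapping = issubclass(Wrapper, collections.abc.MutableMapping)
counter_is_mapping = issubclass(collections.Counter, collections.abc.MutableMapping)

assert not is_counter
assert is_mapping and counter_is_mapping
```

If the calling code can be changed, checking against collections.abc.MutableMapping (or simply duck typing) would let both kinds of object through.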