Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #84129
| References | (1 earlier) <mailman.17884.1421734095.18130.python-list@python.org> <87ppa9944t.fsf@elektro.pacujo.net> <mailman.17906.1421827045.18130.python-list@python.org> <b37db482-1c2d-4b17-9b0b-0e37537482ef@googlegroups.com> <54bfbc1d$0$13007$c3e8da3$5496439d@news.astraweb.com> |
|---|---|
| From | Ian Kelly <ian.g.kelly@gmail.com> |
| Date | 2015-01-21 09:15 -0700 |
| Subject | Re: Trees |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.17917.1421857401.18130.python-list@python.org> (permalink) |
On Wed, Jan 21, 2015 at 7:47 AM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
>> More irksome that for the second we've to preface with
>>
>> from collections import Counter
>>
>> And still more a PITA that a straightforward standard name like bag (or
>> multiset) is called by such an ungoogleable misleading name as counter.
>
> It is not misleading. collections.Counter is a class for counting (i.e. a
> counter), not a bag. It is merely *similar* to a bag, but the API is not
> the same as the usual bag API because the motivating design is not the same
> as for bags.
To expand on this, Counter does provide a few multiset operations like
union and intersection. But there are a number of cases where it falls
short or does not perform as expected. To name a few:
* There are no subset or superset comparison operations provided.
* len(counter) returns the number of keys, not the number of elements.
* Unlike a multiset, Counters can have negative or zero counts; one
curious consequence of this is that an "empty" Counter can have a
non-zero length and evaluate as true in boolean contexts.
A while back I fiddled around with creating a true MultiSet class
using a Counter as the underlying implementation. Here's what I came
up with:
from collections import Counter
from collections.abc import MutableSet, Set
__all__ = ['MultiSet']
class MultiSet(MutableSet):
"""Multiset class containing hashable items."""
# Class invariants:
# * self._counter is a Counter s/t all values are positive ints denoting
# multiplicity.
# * set(self) == self._counter.keys()
def __init__(self, iterable_or_mapping=()):
"""Create a new, empty MultiSet. If passed an iterable argument, initialize
the MultiSet with items from the iterable. If passed a mapping argument,
initialize the MultiSet with the keys of the mapping, each repeated a number
of times equal to the associated value.
"""
self._counter = +Counter(iterable_or_mapping)
@classmethod
def _from_counter(cls, counter):
self = cls.__new__(cls)
self._counter = counter
return self
def __contains__(self, item):
return item in self._counter
def __hash__(self):
raise TypeError('unhashable type: %s' % type(self))
def __iter__(self):
return self._counter.elements()
def __len__(self):
# TODO: consider caching the length.
return sum(self._counter.values())
def __repr__(self):
if self:
return 'MultiSet(%r)' % list(self)
return 'MultiSet()'
def add(self, item):
"""Add an element to the MultiSet."""
self._counter[item] += 1
def clear(self):
"""Remove all elements from the MultiSet, leaving it empty."""
self._counter.clear()
def count(self, item):
"""Return the number of occurrences of the element."""
return self._counter[item]
def discard(self, item):
"""Remove an element from the MultiSet.
If there are multiple occurrences, remove only one such occurrence. If the
element is not a member, do nothing.
"""
if item not in self._counter:
return
if self._counter[item] == 1:
del self._counter[item]
else:
self._counter[item] -= 1
def __or__(self, other):
"""Return the union of the MultiSet and another set."""
if not isinstance(other, Set):
return NotImplemented
return self._from_counter(self._counter | _get_counter(other))
__ror__ = __or__
def __ior__(self, other):
"""Perform the in-place union of the MultiSet and another set."""
if not isinstance(other, Set):
return NotImplemented
self._counter |= _get_counter(other)
return self
def __and__(self, other):
"""Return the intersection of the MultiSet and another set."""
if not isinstance(other, Set):
return NotImplemented
return self._from_counter(self._counter & _get_counter(other))
__rand__ = __and__
def __iand__(self, other):
"""Perform the in-place intersection of the MultiSet and another set."""
if not isinstance(other, Set):
return NotImplemented
self._counter &= _get_counter(other)
return self
def __sub__(self, other):
"""Return the difference of the MultiSet and another set."""
if not isinstance(other, Set):
return NotImplemented
return self._from_counter(self._counter - _get_counter(other))
def __rsub__(self, other):
"""Return the difference of another set and the MultiSet."""
if not isinstance(other, Set):
return NotImplemented
return self._from_counter(_get_counter(other) - self._counter)
def __isub__(self, other):
"""Perform the in-place set difference of the MultiSet and another set."""
if not isinstance(other, Set):
return NotImplemented
self._counter -= _get_counter(other)
return self
def __xor__(self, other):
"""Return the symmetric difference of the MultiSet and another set."""
if not isinstance(other, Set):
return NotImplemented
# X ^ Y == (X - Y) | (Y - X)
other_counter = _get_counter(other)
counter = (self._counter - other_counter) | (other_counter - self._counter)
return self._from_counter(counter)
__rxor__ = __xor__
def __ixor__(self, other):
"Perform the in-place symmetric difference of the MultiSet and another set."
if not isinstance(other, Set):
return NotImplemented
# X ^ Y == (X - Y) | (Y - X)
other_counter = _get_counter(other)
other_diff = other_counter - self._counter
self._counter -= other_counter
self._counter |= other_diff
return self
def __eq__(self, other):
"""Return whether the MultiSet and the other set have the same contents."""
if not isinstance(other, Set):
return NotImplemented
return self._counter == _get_counter(other)
def __ne__(self, other):
"""Return whether the MultiSet and the other set have different contents."""
if not isinstance(other, Set):
return NotImplemented
return self._counter != _get_counter(other)
def __le__(self, other):
"""Return whether the MultiSet is a subset of the other set."""
if not isinstance(other, Set):
return NotImplemented
other_counter = _get_counter(other)
return all(self._counter[x] <= other_counter[x] for x in self._counter)
def __lt__(self, other):
"""Return whether the MultiSet is a proper subset of the other set."""
if not isinstance(other, Set):
return NotImplemented
return self.__le__(other) and not self.__eq__(other)
def __ge__(self, other):
"""Return whether the MultiSet is a superset of the other set."""
if not isinstance(other, Set):
return NotImplemented
other_counter = _get_counter(other)
return all(self._counter[x] >= other_counter[x] for x in other_counter)
def __gt__(self, other):
"""Return whether the MultiSet is a proper superset of the other set."""
if not isinstance(other, Set):
return NotImplemented
return self.__ge__(other) and not self.__eq__(other)
def _get_counter(maybe_multiset):
if isinstance(maybe_multiset, MultiSet):
return maybe_multiset._counter
else:
return Counter(maybe_multiset)
Back to comp.lang.python | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
Re: Trees Terry Reedy <tjreedy@udel.edu> - 2015-01-20 01:08 -0500
Re: Trees Marko Rauhamaa <marko@pacujo.net> - 2015-01-20 11:45 +0200
Re: Trees Paul Rubin <no.email@nospam.invalid> - 2015-01-20 10:14 -0800
Re: Trees Marko Rauhamaa <marko@pacujo.net> - 2015-01-20 22:26 +0200
Re: Trees Stephen Hansen <me+python@ixokai.io> - 2015-01-20 23:56 -0800
Re: Trees Marko Rauhamaa <marko@pacujo.net> - 2015-01-21 10:35 +0200
Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-21 04:09 -0800
Re: Trees Chris Angelico <rosuav@gmail.com> - 2015-01-21 23:35 +1100
Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-21 07:24 -0800
Re: Trees Tim Chase <python.list@tim.thechases.com> - 2015-01-21 06:55 -0600
Re: Trees Chris Angelico <rosuav@gmail.com> - 2015-01-22 00:01 +1100
Re: Trees Tim Chase <python.list@tim.thechases.com> - 2015-01-21 08:26 -0600
Re: Trees Chris Angelico <rosuav@gmail.com> - 2015-01-22 01:31 +1100
Re: Trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-01-22 01:47 +1100
Re: Trees Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-21 09:15 -0700
Re: Trees Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-21 10:27 -0700
Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 05:33 -0800
Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 05:51 -0800
Re: Trees Marko Rauhamaa <marko@pacujo.net> - 2015-01-20 16:15 +0200
Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 06:35 -0800
Re: Trees Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-20 10:19 -0700
Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 10:15 -0800
Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 10:35 -0800
Re: Trees Mario <marfig@gmail.com> - 2015-01-20 22:47 +0100
Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 17:23 -0800
Re: Trees Paul Rubin <no.email@nospam.invalid> - 2015-01-20 17:49 -0800
Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 18:03 -0800
Re: Trees Paul Rubin <no.email@nospam.invalid> - 2015-01-21 14:27 -0800
Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-21 21:17 -0800
Re: Trees Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-21 15:54 -0700
Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-21 21:20 -0800
Re: Trees Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-22 00:01 -0700
Re: Trees Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-21 23:56 -0700
Re: Trees Paul Rubin <no.email@nospam.invalid> - 2015-01-21 23:16 -0800
Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-22 08:54 -0800
Re: Trees Terry Reedy <tjreedy@udel.edu> - 2015-01-20 21:19 -0500
Re: Trees Mario Figueiredo <marfig@gmail.com> - 2015-01-21 14:05 +0000
csiph-web