To: python-list@python.org
From: Peter Otten <__peter__@web.de>
Subject: Re: itertools py3.4 - filter list using not equal - fails as bool
Date: Wed, 13 May 2015 09:40:05 +0200
Organization: None
References: <05defef5-74aa-4a5d-b7e7-9b521512152c@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7Bit
User-Agent: KNode/4.13.3
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.435.1431502814.12865.python-list@python.org>

Sayth Renshaw wrote:

> why can't I filter a list based on an itertools condition using dropwhile?
> 
> This is the docs and the example.
> https://docs.python.org/3/library/itertools.html#itertools.dropwhile
> 
> def less_than_10(x):
>     return x < 10
> 
> itertools.takewhile(less_than_10, itertools.count()) =>
>   0, 1, 2, 3, 4, 5, 6, 7, 8, 9

As the example demonstrates, dropwhile(), like takewhile(), takes a function 
as its first argument. To apply it to a sequence of tuples you could write:

>>> items
[(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)]
>>> def first_is_one(t):
...     return t[0] == 1
... 
>>> list(itertools.dropwhile(first_is_one, items))
[(2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)]

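However, dropwhile() only skips the leading run of items for which the 
predicate is true. Once the predicate returns False, that item and everything 
after it is kept, including later items for which the predicate is True again: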

>>> def second_is_one(t):
...     return t[1] == 1
... 
>>> list(itertools.dropwhile(second_is_one, items))
[(1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)]
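
To make that behaviour concrete, here is a rough sketch of what dropwhile() 
does internally. The name dropwhile_sketch is made up for illustration and 
this is not the actual implementation; the items list is rebuilt here on the 
assumption that it is the 3x3 list of pairs shown above.

```python
import itertools

def dropwhile_sketch(predicate, iterable):
    """Hypothetical stand-in for itertools.dropwhile(): skip leading items only."""
    iterator = iter(iterable)
    for item in iterator:
        if not predicate(item):
            # First item that fails the test: emit it and stop testing.
            yield item
            break
    # Everything after the first failure passes through untested.
    for item in iterator:
        yield item

# Assumed to match the items list used in the sessions above.
items = [(a, b) for a in (1, 2, 3) for b in (1, 2, 3)]

def second_is_one(t):
    return t[1] == 1

result = list(dropwhile_sketch(second_is_one, items))
print(result == list(itertools.dropwhile(second_is_one, items)))  # True
print((2, 1) in result)  # True: later matches are not dropped
```

Only (1, 1) is dropped; (2, 1) and (3, 1) survive because the predicate is 
never consulted again after the first False.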

If you want only the items where t[1] != 1 you need filter() or 
itertools.filterfalse() (Python 2: itertools.ifilter() or 
itertools.ifilterfalse()):

>>> list(itertools.filterfalse(second_is_one, items))
[(1, 2), (1, 3), (2, 2), (2, 3), (3, 2), (3, 3)]
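
For comparison, the same selection can be spelled with the built-in filter() 
or with a list comprehension. This is only a sketch of the alternatives; the 
variable names are illustrative and items is again assumed to be the 3x3 list 
of pairs from above.

```python
import itertools

# Assumed to match the items list used in the sessions above.
items = [(a, b) for a in (1, 2, 3) for b in (1, 2, 3)]

def second_is_one(t):
    return t[1] == 1

# Keep the items for which the predicate is false.
via_filterfalse = list(itertools.filterfalse(second_is_one, items))

# Keep the items for which the inverted predicate is true.
via_filter = list(filter(lambda t: t[1] != 1, items))

# The same test written inline.
via_comprehension = [t for t in items if t[1] != 1]

print(via_filterfalse == via_filter == via_comprehension)  # True
```

Which spelling to use is mostly a matter of taste; the comprehension avoids 
the separate predicate function.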

> so I have a list I have created (converted from itertools). pm is
> permutations
> 
> 
> print(stats)
> [(1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 2), (1, 3, 4), (1, 3, 5), (1, 4,
> 2), (1, 4, 3), (1, 4, 5), (1, 5, 2), (1, 5, 3), (1, 5, 4), (2, 1, 3), (2,
> 1, 4), (2, 1, 5), (2, 3, 1), (2, 3, 4), (2, 3, 5), (2, 4, 1), (2, 4, 3),
> (2, 4, 5), (2, 5, 1), (2, 5, 3), (2, 5, 4), (3, 1, 2), (3, 1, 4), (3, 1,
> 5), (3, 2, 1), (3, 2, 4), (3, 2, 5), (3, 4, 1), (3, 4, 2), (3, 4, 5), (3,
> 5, 1), (3, 5, 2), (3, 5, 4), (4, 1, 2), (4, 1, 3), (4, 1, 5), (4, 2, 1),
> (4, 2, 3), (4, 2, 5), (4, 3, 1), (4, 3, 2), (4, 3, 5), (4, 5, 1), (4, 5,
> 2), (4, 5, 3), (5, 1, 2), (5, 1, 3), (5, 1, 4), (5, 2, 1), (5, 2, 3), (5,
> 2, 4), (5, 3, 1), (5, 3, 2), (5, 3, 4), (5, 4, 1), (5, 4, 2), (5, 4, 3)]
> 
> 
> I simply wanted to create an easy way to create summary stats of my stats
> list(poorly named). So in this case easy to check answers. so how many
> tuples in my list have a 1 in item[0] and how many don't. Then hoping to
> build on that for example how many have item[0] == 1 && (item[1] == 2 or
> item[1] == 4) etc.
> 
> I can achieve it via an else if but that would become ugly quick.
> 
> for item in stats:
>     if item[0] == 1:
>         nums += 1
>     elif item[0] != 1:
>         not_in += 1
>     else:
>         pass

Why would this become ugly? You can reshuffle it a bit to make it more 
general. The helper below returns a (false_count, true_count) pair:

>>> def count(items):
...     t = f = 0
...     for item in items:
...         if item:
...             t += 1
...         else:
...             f += 1
...     return f, t
... 
>>> count(t[1] == 1 for t in stats)
(48, 12)
>>> count(t[0] == 1 for t in stats)
(48, 12)
>>> count(t[0] in (1, 2) for t in stats)
(36, 24)
>>> count(sum(t) == 6 for t in stats)
(54, 6)
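
The combined condition from the question, item[0] == 1 and item[1] either 2 
or 4, fits the same pattern. A minimal sketch, assuming stats is the list of 
3-permutations of 1..5 as printed in the question, and using the fact that 
True counts as 1 when summed:

```python
import itertools

# Assumption: rebuilds the list from the question (60 tuples).
stats = list(itertools.permutations(range(1, 6), 3))

# True counts as 1 and False as 0, so sum() counts the matching tuples.
first_is_one = sum(t[0] == 1 for t in stats)
combined = sum(t[0] == 1 and t[1] in (2, 4) for t in stats)

print(first_is_one, len(stats) - first_is_one)  # 12 48
print(combined, len(stats) - combined)          # 6 54
```

Each extra condition is just another term in the generator expression, so no 
if/elif chain is needed.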

Alternatively, collections.Counter() supports an arbitrary number of bins...

>>> import collections
>>> freq = collections.Counter(t[1] for t in stats)
>>> freq
Counter({1: 12, 2: 12, 3: 12, 4: 12, 5: 12})

...but you can easily reduce them:

>>> freq = collections.Counter(t[1] == 1 for t in stats)
>>> freq
Counter({False: 48, True: 12})
>>> one = freq[True]
>>> total = sum(freq.values())
>>> print("{} of {} ({}%) have t[1] == 1".format(one, total, one/total*100))
12 of 60 (20.0%) have t[1] == 1
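
Counter keys can also be tuples of booleans, which gives one bin per 
combination of conditions. A sketch under the same assumption as before, 
namely that stats holds the 3-permutations of 1..5; the two conditions are 
the ones mentioned in the question:

```python
import collections
import itertools

# Assumption: rebuilds the list from the question (60 tuples).
stats = list(itertools.permutations(range(1, 6), 3))

# Key: (is the first item 1?, is the second item 2 or 4?)
freq = collections.Counter((t[0] == 1, t[1] in (2, 4)) for t in stats)

print(freq[True, True])    # 6: first is 1 and second is 2 or 4
print(freq[True, False])   # 6: first is 1, second is something else
print(freq[False, True])   # 18
print(freq[False, False])  # 30
```

The four bins add up to the 60 tuples, so any single condition can be 
recovered by summing the relevant bins.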