
Filter versus comprehension (was Re: something about split()???)

Started by	Terry Reedy <tjreedy@udel.edu>
First post	2012-08-22 12:43 -0400
Last post	2012-08-24 07:44 -0700
Articles	15 — 9 participants


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.

  Filter versus comprehension (was Re: something about split()???) Terry Reedy <tjreedy@udel.edu> - 2012-08-22 12:43 -0400
    Re: Filter versus comprehension (was Re: something about split()???) Ramchandra Apte <maniandram01@gmail.com> - 2012-08-24 07:44 -0700
      Re: Filter versus comprehension (was Re: something about split()???) Terry Reedy <tjreedy@udel.edu> - 2012-08-24 12:04 -0400
      Re: Filter versus comprehension (was Re: something about split()???) Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-08-24 14:29 -0400
        Re: Filter versus comprehension (was Re: something about split()???) Walter Hurry <walterhurry@lavabit.com> - 2012-08-24 19:03 +0000
          Re: Filter versus comprehension (was Re: something about split()???) Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-08-24 17:56 -0400
            Re: Filter versus comprehension (was Re: something about split()???) Walter Hurry <walterhurry@lavabit.com> - 2012-08-24 22:55 +0000
          Re: Filter versus comprehension (was Re: something about split()???) Terry Reedy <tjreedy@udel.edu> - 2012-08-24 18:03 -0400
          Re: Filter versus comprehension (was Re: something about split()???) Emile van Sebille <emile@fenx.com> - 2012-08-24 15:15 -0700
          Re: Filter versus comprehension (was Re: something about split()???) Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-08-24 23:28 +0100
          Re: Filter versus comprehension (was Re: something about split()???) Ned Deily <nad@acm.org> - 2012-08-24 15:36 -0700
          Re: Filter versus comprehension (was Re: something about split()???) Ned Deily <nad@acm.org> - 2012-08-24 15:39 -0700
          Re: Filter versus comprehension (was Re: something about split()???) David Robinow <drobinow@gmail.com> - 2012-08-25 08:57 -0400
          Re: Filter versus comprehension (was Re: something about split()???) Tim Golden <mail@timgolden.me.uk> - 2012-08-25 16:31 +0100

#27654 — Filter versus comprehension (was Re: something about split()???)

From	Terry Reedy <tjreedy@udel.edu>
Date	2012-08-22 12:43 -0400
Subject	Filter versus comprehension (was Re: something about split()???)
Message-ID	<mailman.3665.1345653816.4697.python-list@python.org>

On 8/22/2012 3:30 AM, Mark Lawrence wrote:
> On 22/08/2012 06:46, Terry Reedy wrote:
>> On 8/21/2012 11:43 PM, mingqiang hu wrote:
>>> why filter is bad when use lambda ?
>>
>> Inefficient, not 'bad'. Because the equivalent comprehension or
>> generator expression does not require a function call.

for each item in the iterable.

> A case of premature optimisation? :)

No, as regards my post. I simply made a factual statement without 
advocating a particular action.

filter(lambda x: <expr>, iterable)
(x for x in iterable if <expr>)

both create iterators that produce the items in iterable such that 
bool(<expr>) is true. The following, with output rounded, shows 
something of the effect of the extra function call.

 >>> import timeit
 >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(0)")
0.91
 >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(20)")
1.28
 >>> timeit.timeit("list(filter(lambda i: False, ranger))", "ranger=range(0)")
0.83
 >>> timeit.timeit("list(filter(lambda i: False, ranger))", "ranger=range(20)")
2.60

Simply keeping true items is faster with filter -- at least on my 
particular machine with 3.3.0b2.

 >>> timeit.timeit("list(filter(None, ranger))", "ranger=range(20)")
1.03
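
For readers who have not met the None form: passing None as the function
makes filter keep exactly the items that are truthy. A minimal sketch of
that equivalence follows; the sample list is invented for illustration
and is not from the original post.

```python
# Hypothetical sample data, chosen to mix truthy and falsy values.
items = [0, 1, "", "a", None, [], [2]]

# filter(None, ...) keeps the items for which bool(item) is true.
kept_by_filter = list(filter(None, items))

# The generator expression that expresses the same test inline.
kept_by_genexp = list(x for x in items if x)

print(kept_by_filter)
print(kept_by_filter == kept_by_genexp)
```

Both forms should yield the same items; only the speed differs.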

Filter is also faster if the expression is a function call.

 >>> timeit.timeit("list(filter(f, ranger))", "ranger=range(20); f=lambda i: False")
2.5033614114454394
 >>> timeit.timeit("list(i for i in ranger if f(i))", "ranger=range(20); f=lambda i: False")
3.2394095327040304
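
The measurements above can be collected into one script for anyone who
wants to repeat them. This is only a sketch: the statements are taken
from the post, but the names SETUP, STATEMENTS and run, the reduced
number of runs, and the use of timeit.repeat with min() are assumptions
made here to keep the run short and the noise down.

```python
import timeit

# Setup shared by every statement; f is the predicate used in the post.
SETUP = "ranger = range(20); f = lambda i: False"

# Statements copied from the timings in the post.
STATEMENTS = {
    "genexp, inline test": "list(i for i in ranger if False)",
    "filter, lambda": "list(filter(lambda i: False, ranger))",
    "filter, named f": "list(filter(f, ranger))",
    "genexp, calls f": "list(i for i in ranger if f(i))",
    "filter, None": "list(filter(None, ranger))",
}


def run(number=50_000, repeat=3):
    """Return the best time in seconds for each statement."""
    results = {}
    for label, stmt in STATEMENTS.items():
        times = timeit.repeat(stmt, SETUP, number=number, repeat=repeat)
        results[label] = min(times)
    return results


if __name__ == "__main__":
    for label, seconds in run().items():
        print(f"{label:22s} {seconds:.3f}")
```

Absolute numbers will differ by machine and Python version, and the
totals are not comparable with the post, which used timeit's default of
1,000,000 runs. Only the relative ordering is of interest.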

---
Perhaps, or even yes, as regards the so-called rule 'always use a
comprehension'. If one prefers filter as more readable, if one only
wants to keep true items, if the expression is a function call, if
evaluating the expression takes much more time than the extra function
call so that the latter does not matter, or if the number of items is
small enough that the extra time does not matter, then the rule is not
needed or is even wrong.
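
As an illustration of the "expression is a function call" case, here is
a small sketch in which the predicate already exists as a function. The
token list and the choice of str.isdigit are assumptions made for the
example, not something from the thread.

```python
# Hypothetical input; str.isdigit stands in for an existing predicate.
tokens = ["12", "abc", "7", "", "3x", "40"]

# filter takes the existing function directly, with no lambda wrapper.
with_filter = list(filter(str.isdigit, tokens))

# The comprehension has to spell out the same call for each item.
with_comprehension = [t for t in tokens if t.isdigit()]

print(with_filter)
print(with_filter == with_comprehension)
```

Here neither form needs an extra lambda, so the choice comes down to
readability and, as measured above, a small speed difference.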

So I think PyLint should be changed to stop its filter FUD.

-- 
Terry Jan Reedy


#27804

From	Ramchandra Apte <maniandram01@gmail.com>
Date	2012-08-24 07:44 -0700
Message-ID	<960e4798-745b-4e9b-aedb-14aae986d086@googlegroups.com>
In reply to	#27654

On Wednesday, 22 August 2012 22:13:04 UTC+5:30, Terry Reedy  wrote:
> On 8/22/2012 3:30 AM, Mark Lawrence wrote:
> 
> > On 22/08/2012 06:46, Terry Reedy wrote:
> 
> >> On 8/21/2012 11:43 PM, mingqiang hu wrote:
> 
> >>> why filter is bad when use lambda ?
> 
> >>
> 
> >> Inefficient, not 'bad'. Because the equivalent comprehension or
> 
> >> generator expression does not require a function call.
> 
> 
> 
> for each item in the iterable.
> 
> 
> 
> > A case of premature optimisation? :)
> 
> 
> 
> No, as regards my post. I simply made a factual statement without 
> 
> advocating a particular action.
> 
> 
> 
> filter(lambda x: <expr>, iterable)
> 
> (x for x in iterable if <expr>)
> 
> 
> 
> both create iterators that produce the items in iterable such that 
> 
> bool(<expr>) is true. The following, with output rounded, shows 
> 
> something of the effect of the extra function call.
> 
> 
> 
>  >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(0)")
> 
> 0.91
> 
>  >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(20)")
> 
> 1.28
> 
>  >>> timeit.timeit("list(filter(lambda i: False, ranger))", 
> 
> "ranger=range(0)")
> 
> 0.83
> 
>  >>> timeit.timeit("list(filter(lambda i: False, ranger))", 
> 
> "ranger=range(20)")
> 
> 2.60
> 
> 
> 
> Simply keeping true items is faster with filter -- at least on my 
> 
> particular machine with 3.3.0b2.
> 
> 
> 
>  >>> timeit.timeit("list(filter(None, ranger))", "ranger=range(20)")
> 
> 1.03
> 
> 
> 
> Filter is also faster if the expression is a function call.
> 
> 
> 
>  >>> timeit.timeit("list(filter(f, ranger))", "ranger=range(20); 
> 
> f=lambda i: False")
> 
> 2.5033614114454394
> 
>  >>> timeit.timeit("list(i for i in ranger if f(i))", "ranger=range(20); 
> 
> f=lambda i: False")
> 
> 3.2394095327040304
> 
> 
> 
> ---
> 
> Perhaps or even yes as regards the so-called rule 'always use 
> 
> comprehension'. If one prefers filter as more readable, if one only 
> 
> wants to keep true items, if the expression is a function call, if 
> 
> evaluating the expression takes much more time than the extra function 
> 
> call so the latter does not matter, if the number of items is few enough 
> 
> that the extra time does not matter, then the rule is not needed or even 
> 
> wrong.
> 
> 
> 
> So I think PyLint should be changed to stop its filter fud.
> 
> 
> 
> -- 
> 
> Terry Jan Reedy

When filtering for true values, filter(None, xxx) can be used.
Your examples with lambda i: False are unrealistic: you are comparing `if False` with <lambda function>(xx), that is, a boolean check with a function call.


#27808

From	Terry Reedy <tjreedy@udel.edu>
Date	2012-08-24 12:04 -0400
Message-ID	<mailman.3759.1345824330.4697.python-list@python.org>
In reply to	#27804

On 8/24/2012 10:44 AM, Ramchandra Apte wrote:
> On Wednesday, 22 August 2012 22:13:04 UTC+5:30, Terry Reedy  wrote:

>>   >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(0)")
 >>
>> 0.91
>>
>>   >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(20)")
>>
>> 1.28
>>
>>   >>> timeit.timeit("list(filter(lambda i: False, ranger))",
>>
>> "ranger=range(0)")
>>
>> 0.83
>>
>>   >>> timeit.timeit("list(filter(lambda i: False, ranger))",
>>
>> "ranger=range(20)")
>>
>> 2.60

Your mail agent is inserting blank lines in quotes -- google?
See if you can turn that off.

> Your examples with lambda i:False are unrealistic - you are comparing
 > `if False` vs <lambda function>(xx) - function call vs boolean check

That is exactly the comparison I wanted to make. The iteration + boolean 
check takes .37 for 20 items, the iteration + call 1.77.
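
The two figures come from subtracting the range(0) timings from the
range(20) timings. A short sketch of that arithmetic follows; the
per-item conversion assumes timeit's default of 1,000,000 runs, which
the post does not state explicitly.

```python
# Timings in seconds, quoted from the measurements earlier in the thread.
genexp_0, genexp_20 = 0.91, 1.28
filter_0, filter_20 = 0.83, 2.60

items = 20
runs = 1_000_000  # assumption: timeit's default number of runs

check_cost = genexp_20 - genexp_0  # iteration + boolean check
call_cost = filter_20 - filter_0   # iteration + function call

for label, cost in [("check", check_cost), ("call", call_cost)]:
    per_item_ns = cost / runs / items * 1e9
    print(f"{label}: {cost:.2f} s total, about {per_item_ns:.1f} ns per item")
```

On these numbers the call path costs a little under five times as much
per item as the inline check.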

-- 
Terry Jan Reedy


#27817

From	Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date	2012-08-24 14:29 -0400
Message-ID	<mailman.3764.1345832961.4697.python-list@python.org>
In reply to	#27804

On Fri, 24 Aug 2012 12:04:54 -0400, Terry Reedy <tjreedy@udel.edu>
declaimed the following in gmane.comp.python.general:


> 
> Your mail agent is inserting blank lines in quotes -- google?
> See if you can turn that off.
>
	It appears to be a change Google made in the last month or two... My
hypothesis is that they are replacing hard EOL found in inbound NNTP
with an HTML <p>, and then on outgoing replacing the <p> with a pair of
NNTP line endings. In contrast, text composed on Google is coming in as
long single lines (since quoting said text in a response produces only a
">" at the start of the paragraph).
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/
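
The hypothesis in this post can be modelled in a few lines. This is
purely an illustration of the guess as stated, not Google's actual code;
the function names simulate_round_trip and quote are invented here, and
applying the quote prefix after the expansion is a further assumption.

```python
def simulate_round_trip(text):
    """Model the hypothesised transformation on a plain-text message."""
    as_html = text.replace("\n", "<p>")    # inbound: hard EOL becomes <p>
    return as_html.replace("<p>", "\n\n")  # outbound: <p> becomes two EOLs


def quote(text):
    """Prefix every line with '> ', as a reply would."""
    return "\n".join("> " + line for line in text.split("\n"))


original = "first line\nsecond line\nthird line"
print(quote(simulate_round_trip(original)))
```

Under this model every quoted line is followed by an otherwise empty
quoted line, which matches the double spacing seen in the quoted text
earlier in the thread.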


#27820

From	Walter Hurry <walterhurry@lavabit.com>
Date	2012-08-24 19:03 +0000
Message-ID	<k18j6n$k6h$1@news.albasani.net>
In reply to	#27817

On Fri, 24 Aug 2012 14:29:00 -0400, Dennis Lee Bieber wrote:

> It appears to be a change Google made in the last month or two... My
> hypothesis is that they are replacing hard EOL found in inbound NNTP
> with an HTML <p>, and then on outgoing replacing the <p> with a pair of
> NNTP line endings. In contrast, text composed on Google is coming in as
> long single lines (since quoting said text in a response produces only a
> ">" at the start of the paragraph).

Google Groups sucks. These are computer literate people here. Why don't 
they just use a proper newsreader?


#27833

From	Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date	2012-08-24 17:56 -0400
Message-ID	<mailman.3778.1345845419.4697.python-list@python.org>
In reply to	#27820

On Fri, 24 Aug 2012 19:03:51 +0000 (UTC), Walter Hurry
<walterhurry@lavabit.com> declaimed the following in
gmane.comp.python.general:

> 
> Google Groups sucks. These are computer literate people here. Why don't 
> they just use a proper newsreader?

	Probably because their ISP doesn't offer a free server <G>

-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/


#27841

From	Walter Hurry <walterhurry@lavabit.com>
Date	2012-08-24 22:55 +0000
Message-ID	<k190pd$f84$1@news.albasani.net>
In reply to	#27833

On Fri, 24 Aug 2012 17:56:47 -0400, Dennis Lee Bieber wrote:

> On Fri, 24 Aug 2012 19:03:51 +0000 (UTC), Walter Hurry
> <walterhurry@lavabit.com> declaimed the following in
> gmane.comp.python.general:
> 
> 
>> Google Groups sucks. These are computer literate people here. Why don't
>> they just use a proper newsreader?
> 
> 	Probably because their ISP doesn't offer a free server <G>

There are plenty of free Usenet providers.


#27834

From	Terry Reedy <tjreedy@udel.edu>
Date	2012-08-24 18:03 -0400
Message-ID	<mailman.3779.1345845841.4697.python-list@python.org>
In reply to	#27820

On 8/24/2012 5:56 PM, Dennis Lee Bieber wrote:
> On Fri, 24 Aug 2012 19:03:51 +0000 (UTC), Walter Hurry
> <walterhurry@lavabit.com> declaimed the following in
> gmane.comp.python.general:
>
>>
>> Google Groups sucks. These are computer literate people here. Why don't
>> they just use a proper newsreader?
>
> 	Probably because their ISP doesn't offer a free server <G>

Python lists are available on the free gmane mail-to-news server.



-- 
Terry Jan Reedy


#27835

From	Emile van Sebille <emile@fenx.com>
Date	2012-08-24 15:15 -0700
Message-ID	<mailman.3780.1345846440.4697.python-list@python.org>
In reply to	#27820

On 8/24/2012 3:03 PM Terry Reedy said...
> On 8/24/2012 5:56 PM, Dennis Lee Bieber wrote:
>> On Fri, 24 Aug 2012 19:03:51 +0000 (UTC), Walter Hurry
>> <walterhurry@lavabit.com> declaimed the following in
>> gmane.comp.python.general:
>>
>>>
>>> Google Groups sucks. These are computer literate people here. Why don't
>>> they just use a proper newsreader?
>>
>>     Probably because their ISP doesn't offer a free server <G>
>
> Python lists are available on the free gmane mail-to-news server.

I'm getting high load related denials with the gmane connections a lot 
recently so I'm open to alternatives.

Suggestions or recommendations?


Emile


#27837

From	Mark Lawrence <breamoreboy@yahoo.co.uk>
Date	2012-08-24 23:28 +0100
Message-ID	<mailman.3781.1345847279.4697.python-list@python.org>
In reply to	#27820

On 24/08/2012 23:03, Terry Reedy wrote:
> On 8/24/2012 5:56 PM, Dennis Lee Bieber wrote:
>> On Fri, 24 Aug 2012 19:03:51 +0000 (UTC), Walter Hurry
>> <walterhurry@lavabit.com> declaimed the following in
>> gmane.comp.python.general:
>>
>>>
>>> Google Groups sucks. These are computer literate people here. Why don't
>>> they just use a proper newsreader?
>>
>>     Probably because their ISP doesn't offer a free server <G>
>
> Python lists are available on the free gmane mail-to-news server.
>

I don't think the core-mentorship list is available on gmane.  Have I 
missed it, has nobody asked for it to go on there or what?


-- 
Cheers.

Mark Lawrence.


#27839

From	Ned Deily <nad@acm.org>
Date	2012-08-24 15:36 -0700
Message-ID	<mailman.3782.1345847808.4697.python-list@python.org>
In reply to	#27820

In article <k18uat$9ns$1@ger.gmane.org>,
 Emile van Sebille <emile@fenx.com> wrote:
> On 8/24/2012 3:03 PM Terry Reedy said...
> > Python lists are available on the free gmane mail-to-news server.
> I'm getting high load related denials with the gmane connections a lot 
> recently so I'm open to alternatives.

The high load denials should be a thing of the past as the gmane NNTP 
server was very recently upgraded to use SSDs instead of standard disks.

-- 
 Ned Deily,
 nad@acm.org


#27840

From	Ned Deily <nad@acm.org>
Date	2012-08-24 15:39 -0700
Message-ID	<mailman.3783.1345848307.4697.python-list@python.org>
In reply to	#27820

In article <k18v53$hgs$1@ger.gmane.org>,
 Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
> I don't think the core-mentorship list is available on gmane.  Have I 
> missed it, has nobody asked for it to go on there or what?

core-mentorship is a closed list so it would not be appropriate for it 
to be mirrored anywhere.

http://mail.python.org/mailman/listinfo/core-mentorship

-- 
 Ned Deily,
 nad@acm.org


#27872

From	David Robinow <drobinow@gmail.com>
Date	2012-08-25 08:57 -0400
Message-ID	<mailman.3802.1345899474.4697.python-list@python.org>
In reply to	#27820

On Fri, Aug 24, 2012 at 3:03 PM, Walter Hurry <walterhurry@lavabit.com> wrote:
> On Fri, 24 Aug 2012 14:29:00 -0400, Dennis Lee Bieber wrote:
>
>> It appears to be a change Google made in the last month or two... My
>> hypothesis is that they are replacing hard EOL found in inbound NNTP
>> with an HTML <p>, and then on outgoing replacing the <p> with a pair of
>> NNTP line endings. In contrast, text composed on Google is coming in as
>> long single lines (since quoting said text in a response produces only a
>> ">" at the start of the paragraph).
>
> Google Groups sucks. These are computer literate people here. Why don't
> they just use a proper newsreader?
I haven't used a newsreader in over a decade. I'm quite happy with a
mailing list. Am I missing something?


#27875

From	Tim Golden <mail@timgolden.me.uk>
Date	2012-08-25 16:31 +0100
Message-ID	<mailman.3804.1345908679.4697.python-list@python.org>
In reply to	#27820

On 25/08/2012 13:57, David Robinow wrote:
> On Fri, Aug 24, 2012 at 3:03 PM, Walter Hurry <walterhurry@lavabit.com> wrote:
>> On Fri, 24 Aug 2012 14:29:00 -0400, Dennis Lee Bieber wrote:
>>
>>> It appears to be a change Google made in the last month or two... My
>>> hypothesis is that they are replacing hard EOL found in inbound NNTP
>>> with an HTML <p>, and then on outgoing replacing the <p> with a pair of
>>> NNTP line endings. In contrast, text composed on Google is coming in as
>>> long single lines (since quoting said text in a response produces only a
>>> ">" at the start of the paragraph).
>>
>> Google Groups sucks. These are computer literate people here. Why don't
>> they just use a proper newsreader?
> I haven't used a newsreader in over a decade. I'm quite happy with a
> mailing list. Am I missing something?

Not really. I'm the same; it just means you can skip over the occasional 
ggroups-newsreader discussion threads which pop up
about 3 times a year on average.

:)

TJG


