

Groups > comp.lang.python > #55418 > unrolled thread

feature requests

Started by: macker <tester.testerus@gmail.com>
First post: 2013-10-03 09:12 -0700
Last post: 2013-10-05 17:56 -0400
Articles: 8 (5 participants)



Contents

  feature requests macker <tester.testerus@gmail.com> - 2013-10-03 09:12 -0700
    Re: feature requests Chris Angelico <rosuav@gmail.com> - 2013-10-04 02:21 +1000
    Re: feature requests Tim Chase <python.list@tim.thechases.com> - 2013-10-03 11:42 -0500
    Re: feature requests Chris Angelico <rosuav@gmail.com> - 2013-10-04 02:42 +1000
    Re: feature requests Ethan Furman <ethan@stoneleaf.us> - 2013-10-03 10:01 -0700
      Re: feature requests macker <tester.testerus@gmail.com> - 2013-10-05 05:49 -0700
        Re: feature requests Ethan Furman <ethan@stoneleaf.us> - 2013-10-05 08:58 -0700
        Re: feature requests Terry Reedy <tjreedy@udel.edu> - 2013-10-05 17:56 -0400

#55418 — feature requests

From: macker <tester.testerus@gmail.com>
Date: 2013-10-03 09:12 -0700
Subject: feature requests
Message-ID: <6782f295-1885-4114-aea8-d785480f3489@googlegroups.com>

Hi, hope this is the right group for this:

I miss two basic (IMO) features in parallel processing:

1. make `threading.Thread.start()` return `self`

I'd like to be able to `workers = [Thread(params).start() for params in whatever]`. Right now, it's 5 ugly, menial lines:

        workers = []
        for params in whatever:
            thread = threading.Thread(params)
            thread.start()
            workers.append(thread)

2. make multiprocessing pools (incl. ThreadPool) limit the size of their internal queues

As it is now, the queue will greedily consume its entire input, and if the input is large and the pool workers are slow in consuming it, this blows up RAM. I'd like to be able to `pool = Pool(4, max_qsize=1000)`. Same with the output queue (finished tasks).

Or does anyone know of a way to achieve this?



#55419

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-10-04 02:21 +1000
Message-ID: <mailman.680.1380817277.18130.python-list@python.org>
In reply to: #55418

On Fri, Oct 4, 2013 at 2:12 AM, macker <tester.testerus@gmail.com> wrote:
> I'd like to be able to `workers = [Thread(params).start() for params in whatever]`. Right now, it's 5 ugly, menial lines:
>
>         workers = []
>         for params in whatever:
>             thread = threading.Thread(params)
>             thread.start()
>             workers.append(thread)

You could shorten this by iterating twice, if that helps:

workers = [Thread(params).start() for params in whatever]
for thrd in workers: thrd.start()

ChrisA



#55420

From: Tim Chase <python.list@tim.thechases.com>
Date: 2013-10-03 11:42 -0500
Message-ID: <mailman.681.1380818429.18130.python-list@python.org>
In reply to: #55418

On 2013-10-04 02:21, Chris Angelico wrote:
> >         workers = []
> >         for params in whatever:
> >             thread = threading.Thread(params)
> >             thread.start()
> >             workers.append(thread)  
> 
> You could shorten this by iterating twice, if that helps:
> 
> workers = [Thread(params).start() for params in whatever]
> for thrd in workers: thrd.start()

Do you mean

  workers = [Thread(params) for params in whatever]
  for thrd in workers: thrd.start()

?  ("Thread(params)" vs. "Thread(params).start()" in your list comp)

-tkc




#55423

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-10-04 02:42 +1000
Message-ID: <mailman.683.1380818574.18130.python-list@python.org>
In reply to: #55418

On Fri, Oct 4, 2013 at 2:42 AM, Tim Chase <python.list@tim.thechases.com> wrote:
> Do you mean
>
>   workers = [Thread(params) for params in whatever]
>   for thrd in workers: thrd.start()
>
> ?  ("Thread(params)" vs. "Thread(params).start()" in your list comp)

Whoops, copy/paste fail. Yes, that's what I meant.

Thanks for catching!
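
For reference, a minimal runnable sketch of the corrected two-pass version. It assumes a hypothetical worker function `work` and passes the callable via `target=` and `args=`, since the first positional parameter of `threading.Thread` is `group`, not the callable:

```python
import threading

results = []
results_lock = threading.Lock()

def work(n):
    # Hypothetical worker: record the square of its argument.
    with results_lock:
        results.append(n * n)

whatever = [1, 2, 3, 4]

# Pass 1: build the threads without starting them.
workers = [threading.Thread(target=work, args=(params,)) for params in whatever]

# Pass 2: start them all, then wait for all of them to finish.
for thrd in workers:
    thrd.start()
for thrd in workers:
    thrd.join()

print(sorted(results))  # [1, 4, 9, 16]
```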

ChrisA



#55429

From: Ethan Furman <ethan@stoneleaf.us>
Date: 2013-10-03 10:01 -0700
Message-ID: <mailman.687.1380822918.18130.python-list@python.org>
In reply to: #55418

On 10/03/2013 09:12 AM, macker wrote:
> Hi, hope this is the right group for this:
>
> I miss two basic (IMO) features in parallel processing:
>
> 1. make `threading.Thread.start()` return `self`
>
> I'd like to be able to `workers = [Thread(params).start() for params in whatever]`. Right now, it's 5 ugly, menial lines:
>
>          workers = []
>          for params in whatever:
>              thread = threading.Thread(params)
>              thread.start()
>              workers.append(thread)

Ugly, menial lines are a clue that a function to hide it could be useful.


> 2. make multiprocessing pools (incl. ThreadPool) limit the size of their internal queues
>
> As it is now, the queue will greedily consume its entire input, and if the input is large and the pool workers are slow in consuming it, this blows up RAM. I'd like to be able to `pool = Pool(4, max_qsize=1000)`. Same with the output queue (finished tasks).

Have you verified that this is a problem in Python?


> Or does anyone know of a way to achieve this?

You could try subclassing.
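
One way to get a bounded backlog without touching the pool internals might look like the sketch below. It is a wrapper generator rather than a subclass, and it is not a stdlib feature: the names `bounded_imap`, `max_pending` and `slow_square` are made up for illustration. It only relies on `apply_async`, which both `Pool` and `ThreadPool` provide, and it never has more than `max_pending` tasks submitted but not yet handed back to the caller:

```python
from collections import deque
from multiprocessing.pool import ThreadPool

def bounded_imap(pool, func, iterable, max_pending=1000):
    """Yield func(item) in input order, with at most max_pending tasks outstanding."""
    pending = deque()
    for item in iterable:
        if len(pending) >= max_pending:
            # Backlog is full: wait for the oldest task and hand its result
            # to the caller before pulling more input.
            yield pending.popleft().get()
        pending.append(pool.apply_async(func, (item,)))
    # Input exhausted: drain whatever is still outstanding.
    while pending:
        yield pending.popleft().get()

def slow_square(n):
    # Hypothetical stand-in for a slow worker task.
    return n * n

with ThreadPool(4) as pool:
    squares = list(bounded_imap(pool, slow_square, range(10), max_pending=3))

print(squares)
```

Because it is a generator, the input is read only as fast as the caller consumes results, so a slow consumer throttles both the task backlog and the pile of finished results.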

--
~Ethan~



#56182

From: macker <tester.testerus@gmail.com>
Date: 2013-10-05 05:49 -0700
Message-ID: <ca517a69-edc7-468f-9e9e-94e61ea1014c@googlegroups.com>
In reply to: #55429

> 
> Ugly, menial lines are a clue that a function to hide it could be useful.

Or a clue to add a trivial change elsewhere (hint for Ethan: `return self` at the end of `Thread.start()`).

> Have you verified that this is a problem in Python?

?

> You could try subclassing.

I could try many things. What this thread is about is trying to fix it on stdlib level, so that people don't have to reinvent the wheel every time.

Thanks to Chris for his suggestion. Ethan, please stay away from this thread.

-macker






#56206

From: Ethan Furman <ethan@stoneleaf.us>
Date: 2013-10-05 08:58 -0700
Message-ID: <mailman.753.1380991655.18130.python-list@python.org>
In reply to: #56182

On 10/05/2013 05:49 AM, macker wrote:
>>
>> Ugly, menial lines are a clue that a function to hide it could be useful.
>
> Or a clue to add a trivial change elsewhere (hint for Ethan: `return self` at the end of `Thread.start()`).

I'm aware that would solve your issue.  I'm also aware that Python rarely does a 'return self' at the end of methods. 
Since that probably isn't going to change, a helper function is probably your best way forward.
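
Such a helper could be as small as the following sketch. The name `start_thread` is hypothetical; it simply forwards its arguments to `threading.Thread`, starts the thread, and returns it, which is what makes the one-line list comprehension possible:

```python
import threading

def start_thread(*args, **kwargs):
    """Create a Thread from the given arguments, start it, and return it."""
    thread = threading.Thread(*args, **kwargs)
    thread.start()
    return thread

seen = []
# The comprehension from the original request, written against the helper.
workers = [start_thread(target=seen.append, args=(n,)) for n in range(3)]
for thread in workers:
    thread.join()

print(sorted(seen))  # [0, 1, 2]
```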


>> Have you verified that this is a problem in Python?
>
> ?

You stated it "would blow up RAM" -- have you actually tested this, or are you making assumptions based on experience 
from other languages, or assumptions based on nothing at all?


>> You could try subclassing.
>
> I could try many things. What this thread is about is trying to fix it on stdlib level, so that people don't have to reinvent the wheel every time.

Did you really expect your idea to just sail through with no opposition, no counter-ideas, no reasons why it might not, 
or would not, work?


> Thanks to Chris for his suggestion. Ethan, please stay away from this thread.

Wow, you're rude.

--
~Ethan~



#56222

From: Terry Reedy <tjreedy@udel.edu>
Date: 2013-10-05 17:56 -0400
Message-ID: <mailman.761.1381010189.18130.python-list@python.org>
In reply to: #56182

On 10/5/2013 11:58 AM, Ethan Furman wrote:
> On 10/05/2013 05:49 AM, macker wrote:
>>>
>>> Ugly, menial lines are a clue that a function to hide it could be
>>> useful.
>>
>> Or a clue to add a trivial change elsewhere (hint for Ethan: `return
>> self` at the end of `Thread.start()`).
>
> I'm aware that would solve your issue.  I'm also aware that Python
> rarely does a 'return self' at the end of methods.

Not returning self is a basic design principle of Python since its 
beginning. (I am not aware of any exceptions and would regard one as 
possibly a mistake.) Guido is aware that not doing so prevents chaining 
of mutation methods. He thinks it very important that people know and 
remember the difference between a method that mutates self and one that 
does not. Otherwise, one could write 'b = a.sort()' and not know 
(remember) that b is just an alias for a. He must have seen this type of 
error, especially in beginner code, in other languages before designing 
Python.
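
A short illustration of that convention, using only built-ins: the mutating method `list.sort()` returns None, while the non-mutating `sorted()` returns a new list.

```python
a = [3, 1, 2]

b = a.sort()     # sorts a in place; the method itself returns None
c = sorted(a)    # leaves a alone and returns a new, separate list

print(a)         # [1, 2, 3]
print(b)         # None
print(c is a)    # False
```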

 > Since that probably isn't going to change,

as it would only make things worse.

Note that some mutation methods also return something useful other than 
the default None. Examples are mylist.pop() and iterator.__next__ (usually 
accessed by next(iterator))*. So it is impossible for all mutation 
methods to just 'return self'.

* iterator.__next__ is a generalized specialization of list.pop. It can 
only return the 'first' item, but can do so with any iterable, including 
those that are not ordered and those that represent virtual rather than 
concrete collections.
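
Both cases can be seen in a few lines. The variable names are arbitrary, and `itertools.count` stands in here for a "virtual" collection that is never stored in full:

```python
import itertools

items = [10, 20, 30]
last = items.pop()               # mutates items and returns the removed item
print(last, items)               # 30 [10, 20]

it = iter(items)
first = next(it)                 # advances the iterator and returns an item
print(first, list(it))           # 10 [20]

evens = itertools.count(0, 2)    # endless 0, 2, 4, ... generated on demand
print(next(evens), next(evens))  # 0 2
```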

-- 
Terry Jan Reedy


