Groups > comp.lang.python > #55418 > unrolled thread
| Started by | macker <tester.testerus@gmail.com> |
|---|---|
| First post | 2013-10-03 09:12 -0700 |
| Last post | 2013-10-05 17:56 -0400 |
| Articles | 8 — 5 participants |
feature requests macker <tester.testerus@gmail.com> - 2013-10-03 09:12 -0700
Re: feature requests Chris Angelico <rosuav@gmail.com> - 2013-10-04 02:21 +1000
Re: feature requests Tim Chase <python.list@tim.thechases.com> - 2013-10-03 11:42 -0500
Re: feature requests Chris Angelico <rosuav@gmail.com> - 2013-10-04 02:42 +1000
Re: feature requests Ethan Furman <ethan@stoneleaf.us> - 2013-10-03 10:01 -0700
Re: feature requests macker <tester.testerus@gmail.com> - 2013-10-05 05:49 -0700
Re: feature requests Ethan Furman <ethan@stoneleaf.us> - 2013-10-05 08:58 -0700
Re: feature requests Terry Reedy <tjreedy@udel.edu> - 2013-10-05 17:56 -0400
| From | macker <tester.testerus@gmail.com> |
|---|---|
| Date | 2013-10-03 09:12 -0700 |
| Subject | feature requests |
| Message-ID | <6782f295-1885-4114-aea8-d785480f3489@googlegroups.com> |
Hi, hope this is the right group for this:
I miss two basic (IMO) features in parallel processing:
1. make `threading.Thread.start()` return `self`
I'd like to be able to `workers = [Thread(params).start() for params in whatever]`. Right now, it's 5 ugly, menial lines:
    workers = []
    for params in whatever:
        thread = threading.Thread(params)
        thread.start()
        workers.append(thread)
2. make multiprocessing pools (incl. ThreadPool) limit the size of their internal queues
As it is now, the queue will greedily consume its entire input, and if the input is large and the pool workers are slow in consuming it, this blows up RAM. I'd like to be able to `pool = Pool(4, max_qsize=1000)`. Same with the output queue (finished tasks).
Or does anyone know of a way to achieve this?
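One possible way to get the bounded behaviour asked for in point 2 without changing the stdlib is sketched below. It is only a sketch: `bounded_imap`, `max_pending`, and `square` are hypothetical names, not part of `multiprocessing`, and `Pool(4, max_qsize=1000)` from the post is not an existing parameter. The idea is to submit tasks lazily with `apply_async` and wait for the oldest result before submitting more, so that at most `max_pending` tasks are submitted but not yet consumed.

```python
from collections import deque
from multiprocessing.pool import ThreadPool


def bounded_imap(pool, func, iterable, max_pending=1000):
    """Yield func(item) in input order, keeping at most max_pending
    tasks submitted but not yet consumed (hypothetical helper)."""
    pending = deque()
    for item in iterable:
        if len(pending) >= max_pending:
            # Wait for the oldest task before pulling more input.
            yield pending.popleft().get()
        pending.append(pool.apply_async(func, (item,)))
    # Input exhausted: drain whatever is still in flight.
    while pending:
        yield pending.popleft().get()


def square(n):
    # Hypothetical stand-in for a slow worker function.
    return n * n


with ThreadPool(4) as pool:
    out = list(bounded_imap(pool, square, range(20), max_pending=3))
```

Because the function is a generator, a slow consumer also slows down submission, which bounds the finished-task side as well as the input side. The trade-off is that one slow task at the head of the line delays results that finished after it.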
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-10-04 02:21 +1000 |
| Message-ID | <mailman.680.1380817277.18130.python-list@python.org> |
| In reply to | #55418 |
On Fri, Oct 4, 2013 at 2:12 AM, macker <tester.testerus@gmail.com> wrote:
> I'd like to be able to `workers = [Thread(params).start() for params in whatever]`. Right now, it's 5 ugly, menial lines:
>
> workers = []
> for params in whatever:
>     thread = threading.Thread(params)
>     thread.start()
>     workers.append(thread)
You could shorten this by iterating twice, if that helps:
workers = [Thread(params).start() for params in whatever]
for thrd in workers: thrd.start()
ChrisA
| From | Tim Chase <python.list@tim.thechases.com> |
|---|---|
| Date | 2013-10-03 11:42 -0500 |
| Message-ID | <mailman.681.1380818429.18130.python-list@python.org> |
| In reply to | #55418 |
On 2013-10-04 02:21, Chris Angelico wrote:
> > workers = []
> > for params in whatever:
> >     thread = threading.Thread(params)
> >     thread.start()
> >     workers.append(thread)
>
> You could shorten this by iterating twice, if that helps:
>
> workers = [Thread(params).start() for params in whatever]
> for thrd in workers: thrd.start()
Do you mean
workers = [Thread(params) for params in whatever]
for thrd in workers: thrd.start()
? ("Thread(params)" vs. "Thread(params).start()" in your list comp)
-tkc
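For reference, a runnable version of the two-pass form might look like the sketch below. `Thread(params)` in the thread is shorthand; the real constructor takes the callable as `target=` and its arguments as `args=`. The names `work`, `results`, and the sample contents of `whatever` are assumptions made up for illustration.

```python
import threading

results = []
results_lock = threading.Lock()


def work(n):
    # Hypothetical worker: records the square of its argument.
    with results_lock:
        results.append(n * n)


whatever = [1, 2, 3, 4]

# Pass 1: build the Thread objects. Pass 2: start them.
workers = [threading.Thread(target=work, args=(n,)) for n in whatever]
for thrd in workers:
    thrd.start()

# Wait for every worker to finish before reading results.
for thrd in workers:
    thrd.join()
```

Keeping `.start()` out of the comprehension matters because `start()` returns `None`, so the buggy form would build a list of `None` values instead of threads.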
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-10-04 02:42 +1000 |
| Message-ID | <mailman.683.1380818574.18130.python-list@python.org> |
| In reply to | #55418 |
On Fri, Oct 4, 2013 at 2:42 AM, Tim Chase <python.list@tim.thechases.com> wrote:
> Do you mean
>
> workers = [Thread(params) for params in whatever]
> for thrd in workers: thrd.start()
>
> ? ("Thread(params)" vs. "Thread(params).start()" in your list comp)
Whoops, copy/paste fail. Yes, that's what I meant.
Thanks for catching!
ChrisA
| From | Ethan Furman <ethan@stoneleaf.us> |
|---|---|
| Date | 2013-10-03 10:01 -0700 |
| Message-ID | <mailman.687.1380822918.18130.python-list@python.org> |
| In reply to | #55418 |
On 10/03/2013 09:12 AM, macker wrote:
> Hi, hope this is the right group for this:
>
> I miss two basic (IMO) features in parallel processing:
>
> 1. make `threading.Thread.start()` return `self`
>
> I'd like to be able to `workers = [Thread(params).start() for params in whatever]`. Right now, it's 5 ugly, menial lines:
>
> workers = []
> for params in whatever:
>     thread = threading.Thread(params)
>     thread.start()
>     workers.append(thread)
Ugly, menial lines are a clue that a function to hide it could be useful.
> 2. make multiprocessing pools (incl. ThreadPool) limit the size of their internal queues
>
> As it is now, the queue will greedily consume its entire input, and if the input is large and the pool workers are slow in consuming it, this blows up RAM. I'd like to be able to `pool = Pool(4, max_qsize=1000)`. Same with the output queue (finished tasks).
Have you verified that this is a problem in Python?
> Or does anyone know of a way to achieve this?
You could try subclassing.
--
~Ethan~
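A helper of the kind suggested here could be as small as the following sketch. `start_thread` is a hypothetical name, not a stdlib function; it just wraps create, start, and return so the one-line comprehension works.

```python
import threading


def start_thread(target, *args, **kwargs):
    """Create a thread, start it, and return it (hypothetical helper)."""
    thread = threading.Thread(target=target, args=args, kwargs=kwargs)
    thread.start()
    return thread


seen = []

# One line per batch of workers, as requested in the original post.
workers = [start_thread(seen.append, n) for n in range(5)]
for thread in workers:
    thread.join()
```

The helper returns the `Thread` object, so callers can still `join()` it or check `is_alive()` afterwards.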
| From | macker <tester.testerus@gmail.com> |
|---|---|
| Date | 2013-10-05 05:49 -0700 |
| Message-ID | <ca517a69-edc7-468f-9e9e-94e61ea1014c@googlegroups.com> |
| In reply to | #55429 |
> Ugly, menial lines are a clue that a function to hide it could be useful.
Or a clue to add a trivial change elsewhere (hint for Ethan: `return self` at the end of `Thread.start()`).
> Have you verified that this is a problem in Python?
?
> You could try subclassing.
I could try many things. What this thread is about is trying to fix it on stdlib level, so that people don't have to reinvent the wheel every time.
Thanks to Chris for his suggestion. Ethan, please stay away from this thread.
-macker
> --
> ~Ethan~
| From | Ethan Furman <ethan@stoneleaf.us> |
|---|---|
| Date | 2013-10-05 08:58 -0700 |
| Message-ID | <mailman.753.1380991655.18130.python-list@python.org> |
| In reply to | #56182 |
On 10/05/2013 05:49 AM, macker wrote:
>> Ugly, menial lines are a clue that a function to hide it could be useful.
>
> Or a clue to add a trivial change elsewhere (hint for Ethan: `return self` at the end of `Thread.start()`).
I'm aware that would solve your issue. I'm also aware that Python rarely does a 'return self' at the end of methods. Since that probably isn't going to change, a helper function is probably your best way forward.
>> Have you verified that this is a problem in Python?
>
> ?
You stated it "would blow up RAM" -- have you actually tested this, or are you making assumptions based on experience from other languages, or assumptions based on nothing at all?
>> You could try subclassing.
>
> I could try many things. What this thread is about is trying to fix it on stdlib level, so that people don't have to reinvent the wheel every time.
Did you really expect your idea to just sail through with no opposition, no counter-ideas, no reasons why it might not, or would not, work?
> Thanks to Chris for his suggestion. Ethan, please stay away from this thread.
Wow, you're rude.
--
~Ethan~
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2013-10-05 17:56 -0400 |
| Message-ID | <mailman.761.1381010189.18130.python-list@python.org> |
| In reply to | #56182 |
On 10/5/2013 11:58 AM, Ethan Furman wrote:
> On 10/05/2013 05:49 AM, macker wrote:
>>> Ugly, menial lines are a clue that a function to hide it could be
>>> useful.
>>
>> Or a clue to add a trivial change elsewhere (hint for Ethan: `return
>> self` at the end of `Thread.start()`).
>
> I'm aware that would solve your issue. I'm also aware that Python
> rarely does a 'return self' at the end of methods.
Not returning self is a basic design principle of Python since its beginning. (I am not aware of any exceptions and would regard one as possibly a mistake.) Guido is aware that not doing so prevents chaining of mutation methods. He thinks it very important that people know and remember the difference between a method that mutates self and one that does not. Otherwise, one could write 'b = a.sort()' and not know (remember) that b is just an alias for a. He must have seen this type of error, especially in beginner code, in other languages before designing Python.
> Since that probably isn't going to change,
as it would only make things worse.
Note that some mutation methods also return something useful other than default None. Examples are mylist.pop() and iterator.__next__ (usually accessed by next(iterator))*. So it is impossible for all mutation methods to just 'return self'.
* iterator.__next__ is a generalized specialization of list.pop. It can only return the 'first' item, but can do so with any iterable, including those that are not ordered and those that represent virtual rather than concrete collections.
--
Terry Jan Reedy
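The convention described above can be checked with a few lines of standard Python. This is a small illustration only; the variable names are arbitrary.

```python
a = [3, 1, 2]

# list.sort() mutates the list in place and returns None, not self.
b = a.sort()

# sorted() leaves its argument alone and returns a new list.
c = sorted(a)

# Some mutating methods return something useful other than None.
last = a.pop()       # removes and returns the final item
it = iter(a)
first = next(it)     # advances the iterator and returns the item
```

Since `b` is `None`, the mistake of treating `a.sort()` as an expression tends to fail quickly instead of silently creating an alias for `a`.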