Re: threading

Started by	Ben Finney <ben+python@benfinney.id.au>
First post	2014-04-07 13:05 +1000
Last post	2014-04-08 15:19 +0000
Articles	20 on this page of 105 — 22 participants

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.

  Re: threading Ben Finney <ben+python@benfinney.id.au> - 2014-04-07 13:05 +1000
    Re: threading Roy Smith <roy@panix.com> - 2014-04-06 23:48 -0400
      Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-07 13:56 +1000
        Re: threading Roy Smith <roy@panix.com> - 2014-04-07 08:26 -0400
          Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-07 22:34 +1000
            Re: threading Roy Smith <roy@panix.com> - 2014-04-07 09:22 -0400
              Re: threading Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-04-07 14:41 +0100
              Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-07 16:49 +0300
                Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-08 00:27 +1000
                  Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-07 17:51 +0300
                    Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-08 01:12 +1000
              Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-08 00:24 +1000
        Re: threading Rick Johnson <rantingrickjohnson@gmail.com> - 2014-04-08 18:09 -0700
          Re: threading "Neil D. Cerutti" <neilc@norwich.edu> - 2014-04-09 09:50 -0400
            Re: threading Rick Johnson <rantingrickjohnson@gmail.com> - 2014-04-09 08:51 -0700
              Re: threading MRAB <python@mrabarnett.plus.com> - 2014-04-09 18:47 +0100
                Re: threading Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-10 11:35 +1200
                  Re: threading Roy Smith <roy@panix.com> - 2014-04-09 19:53 -0400
                    Re: threading Andrew Berg <robotsondrugs@gmail.com> - 2014-04-09 19:02 -0500
                    Re: threading Steven D'Aprano <steve@pearwood.info> - 2014-04-10 02:43 +0000
                      Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 13:08 +1000
                    Re: threading Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-04-10 09:23 +0100
                    Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 19:11 +1000
              Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 04:00 +1000
              Re: threading Steven D'Aprano <steve@pearwood.info> - 2014-04-10 03:44 +0000
                Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 13:54 +1000
      Re: threading Ben Finney <ben+python@benfinney.id.au> - 2014-04-07 15:22 +1000
      Re: threading Ethan Furman <ethan@stoneleaf.us> - 2014-04-08 11:09 -0700
      Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 21:41 +0200
        Re: threading Grant Edwards <invalid@invalid.invalid> - 2014-04-08 20:30 +0000
          Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-09 00:32 +0200
            Re: threading Rustom Mody <rustompmody@gmail.com> - 2014-04-08 19:17 -0700
    Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-07 08:10 +0300
      Re: threading Paul Rubin <no.email@nospam.invalid> - 2014-04-06 22:39 -0700
        Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-07 08:46 +0300
        Re: threading Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2014-04-07 19:47 -0400
          Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-08 08:19 +0300
            Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 10:47 +0000
              Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-08 15:10 +0300
                Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 16:37 +0000
                  Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-08 20:17 +0300
              Re: threading Roy Smith <roy@panix.com> - 2014-04-08 09:19 -0400
                Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 15:44 +0000
                  Re: threading Paul Rubin <no.email@nospam.invalid> - 2014-04-08 09:38 -0700
                    Re: threading Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-04-09 14:42 +0100
            Re: threading "Frank Millman" <frank@chagford.com> - 2014-04-09 15:23 +0200
              Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-09 16:55 +0300
                Re: threading "Frank Millman" <frank@chagford.com> - 2014-04-09 16:46 +0200
                  Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-09 20:31 +0300
                    Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 03:52 +1000
                      Re: threading Mark H Harris <harrismh777@gmail.com> - 2014-04-10 08:29 -0500
                    Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-09 19:20 +0000
            Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-09 23:47 +1000
              Re: threading Roy Smith <roy@panix.com> - 2014-04-09 10:44 -0400
            Re: threading "Frank Millman" <frank@chagford.com> - 2014-04-09 16:30 +0200
              Re: threading Roy Smith <roy@panix.com> - 2014-04-09 10:52 -0400
                Re: threading Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-10 11:19 +1200
              Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-09 19:48 +0300
            Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 00:44 +1000
            Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-09 15:29 +0000
            Re: threading Terry Reedy <tjreedy@udel.edu> - 2014-04-09 12:14 -0400
            Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 02:25 +1000
            Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-09 16:32 +0000
            Re: threading Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2014-04-09 19:44 -0400
            Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 11:05 +1000
            Re: threading "Frank Millman" <frank@chagford.com> - 2014-04-10 11:17 +0200
            Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 19:40 +1000
            Re: threading "Frank Millman" <frank@chagford.com> - 2014-04-10 13:10 +0200
              Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-10 14:43 +0300
                Re: threading Roy Smith <roy@panix.com> - 2014-04-10 08:56 -0400
                Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-10 15:24 +0000
                  Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-10 19:20 +0300
                Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-11 01:32 +1000
                  Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-10 19:25 +0300
                    Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-11 03:08 +1000
                      Re: threading Rustom Mody <rustompmody@gmail.com> - 2014-04-10 11:14 -0700
                        Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-10 22:44 +0300
                          Re: threading Rustom Mody <rustompmody@gmail.com> - 2014-04-10 13:21 -0700
                            Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-10 23:44 +0300
                              Re: threading Rustom Mody <rustompmody@gmail.com> - 2014-04-10 22:15 -0700
                                Re: threading Rustom Mody <rustompmody@gmail.com> - 2014-04-10 23:50 -0700
                                  Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-11 18:36 +0300
                                    Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-12 01:53 +1000
                                    Re: threading Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-04-11 16:58 +0100
                                    Re: threading Rustom Mody <rustompmody@gmail.com> - 2014-04-11 11:54 -0700
                                      Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-11 22:27 +0300
                          Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-11 01:51 +0200
                            Re: threading Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-04-11 05:35 +0000
                              Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-11 09:26 +0000
                              Re: threading Roy Smith <roy@panix.com> - 2014-04-11 08:36 -0400
                                Re: threading Grant Edwards <invalid@invalid.invalid> - 2014-04-11 16:18 +0000
                          Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-11 02:21 +0200
                          Re: threading Terry Reedy <tjreedy@udel.edu> - 2014-04-10 20:23 -0400
            Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 21:19 +1000
        Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 02:06 +0000
          Re: threading alister <alister.nospam.ware@ntlworld.com> - 2014-04-08 11:07 +0000
            Re: threading Roy Smith <roy@panix.com> - 2014-04-08 09:13 -0400
              Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-08 23:23 +1000
                Re: threading alister <alister.nospam.ware@ntlworld.com> - 2014-04-08 14:15 +0000
                  Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 16:06 +0000
              Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 15:40 +0000
                Re: threading Paul Rubin <no.email@nospam.invalid> - 2014-04-08 09:46 -0700
                  Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-09 02:46 +1000
                  Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 17:17 +0000
            Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 15:19 +0000

#69886

From	Marko Rauhamaa <marko@pacujo.net>
Date	2014-04-08 20:17 +0300
Message-ID	<8738hnwxjx.fsf@elektro.pacujo.net>
In reply to	#69882

Sturla Molden <sturla.molden@gmail.com>:

> No, 10,000 processes will not do.

I never suggested that. In fact, I'm on the record recommending about
two processes per CPU core.

There are many principles on which to allocate threads/processes:
objects, tasks, stimulus sources, CPUs. I'm advocating CPUs.

If you use nonblocking primitives, a single process is enough to keep a
CPU busy and the throughput high. With multiple processors, you need
more processes, but generally not more than one per processor. You should give the
hardware a chance to optimize the dataflows a bit so having some extra
processes is probably a good idea.


Marko

#69858

From	Roy Smith <roy@panix.com>
Date	2014-04-08 09:19 -0400
Message-ID	<roy-15C10F.09190508042014@news.panix.com>
In reply to	#69851

In article <mailman.9008.1396954078.18130.python-list@python.org>,
 Sturla Molden <sturla.molden@gmail.com> wrote:

> The problem here is the belief that "thread-safety cannot be abstracted
> out". It can. The solution is to share nothing and send messages through
> queues.

Thread 1 and Thread 2 use a pair of queues to communicate.  T1 sends 
work to T2 using Q1, and T2 sends back results using Q2.

T1 pushes Item1 onto Q1, and waits for Result1 to come back on Q2.

T2 reads Item1 from its end of Q1, and waits to read Item2, which it 
needs to compute Result1.

Sounds like a deadlock to me.
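
The scenario above can be reproduced in a few lines with the standard
queue and threading modules. This is only a sketch: the timeouts are an
addition so that the demonstration terminates instead of hanging, and the
names demo_deadlock and outcome are made up for illustration.

```python
import queue
import threading


def demo_deadlock(timeout=0.2):
    q1, q2 = queue.Queue(), queue.Queue()  # Q1: work to T2, Q2: results to T1
    outcome = {}

    def t2():
        q1.get()                           # reads Item1
        try:
            q1.get(timeout=timeout)        # waits for Item2, which T1 never sends
            outcome["t2"] = "got Item2"
        except queue.Empty:
            outcome["t2"] = "stuck waiting for Item2"

    worker = threading.Thread(target=t2)
    worker.start()
    q1.put("Item1")                        # T1 sends work...
    try:
        q2.get(timeout=timeout)            # ...and waits for Result1
        outcome["t1"] = "got Result1"
    except queue.Empty:
        outcome["t1"] = "stuck waiting for Result1"
    worker.join()
    return outcome
```

Without the timeouts both get() calls would wait forever: each thread is
blocked on a message that only the other one could send.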

#69877

From	Sturla Molden <sturla.molden@gmail.com>
Date	2014-04-08 15:44 +0000
Message-ID	<mailman.9024.1396971907.18130.python-list@python.org>
In reply to	#69858

Roy Smith <roy@panix.com> wrote:

> Thread 1 and Thread 2 use a pair of queues to communicate.  T1 sends 
> work to T2 using Q1, and T2 sends back results using Q2.
> 
> T1 pushes Item1 onto Q1, and waits for Result1 to come back on Q2.
> 
> T2 reads Item1 from its end of Q1, and waits to read Item2, which it 
> needs to compute Result1.
> 
> Sounds like a deadlock to me.

As it turns out, if you try hard enough, you can always construct a race
condition, a deadlock or a livelock. If you need to guard against that, there
are paradigms like BSP (bulk synchronous parallel), but not everything fits
into a BSP design.
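
For readers who have not met BSP: work proceeds in supersteps, and every
worker meets at a barrier between the communication and computation
phases, so no worker ever waits on one specific message from one specific
peer. A rough sketch using threading.Barrier follows; the ring exchange
and the name bsp_ring are assumptions chosen only to keep the example
small.

```python
import threading


def bsp_ring(start_values, supersteps=2):
    """Run a few BSP supersteps over a ring of worker threads."""
    n = len(start_values)
    barrier = threading.Barrier(n)
    values = list(start_values)   # local state, one slot per worker
    inbox = [0] * n               # one incoming message slot per worker

    def worker(i):
        for _ in range(supersteps):
            # Communication phase: send my value to my right-hand neighbour.
            inbox[(i + 1) % n] = values[i]
            barrier.wait()        # nobody reads until everyone has written
            # Computation phase: fold in what arrived.
            values[i] += inbox[i]
            barrier.wait()        # nobody writes until everyone has read

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return values
```

The price is the one mentioned above: a problem has to be expressible as
lock-step rounds before this shape applies.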

Sturla

#69880

From	Paul Rubin <no.email@nospam.invalid>
Date	2014-04-08 09:38 -0700
Message-ID	<7x4n23dbdy.fsf@ruckus.brouhaha.com>
In reply to	#69877

Sturla Molden <sturla.molden@gmail.com> writes:
> As it turns out, if you try hard enough, you can always construct a race
> condition, deadlock or a livelock. If you need to guard against it, there
> are paradigms like BSP, but not everything fits in a BSP design.

Software transactional memory (STM) may also be of interest, though
it's not that great a fit with Python.

#69949

From	Mark Lawrence <breamoreboy@yahoo.co.uk>
Date	2014-04-09 14:42 +0100
Message-ID	<mailman.9074.1397051105.18130.python-list@python.org>
In reply to	#69880

On 08/04/2014 17:38, Paul Rubin wrote:
> Sturla Molden <sturla.molden@gmail.com> writes:
>> As it turns out, if you try hard enough, you can always construct a race
>> condition, deadlock or a livelock. If you need to guard against it, there
>> are paradigms like BSP, but not everything fits in a BSP design.
>
> Software transactional memory (STM) may also be of interest, though
> it's not that great a fit with Python.
>

The PyPy folks have been looking at this; see 
http://pypy.readthedocs.org/en/latest/stm.html

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

#69945

From	"Frank Millman" <frank@chagford.com>
Date	2014-04-09 15:23 +0200
Message-ID	<mailman.9071.1397049831.18130.python-list@python.org>
In reply to	#69832

"Marko Rauhamaa" <marko@pacujo.net> wrote in message 
news:877g70wg8p.fsf@elektro.pacujo.net...
> Dennis Lee Bieber <wlfraed@ix.netcom.com>:
>
>> That's been my experience too... Threading works for me... My
>> attempts at so called asyncio (whatever language) have always led to
>> my having to worry about losing data if some handler takes too long to
>> return.
>>
>> To me, asyncio is closer to a polling interrupt handler, and I
>> still need a thread to handle the main processing.
>
> Yes, asynchronous processing results in complex, event-driven state
> machines that can be hard to get right. However, my experience is that
> that's the lesser evil.
>
> About a handler taking too long: you need to guard each state with a
> timer. Also, you need then to handle the belated handler after the timer
> has expired.
>

Can I ask a newbie question here?

I understand that, if one uses threading, each thread *can* block without 
affecting other threads, whereas if one uses the async approach, a request 
handler must *not* block, otherwise it will hold up the entire process and 
not allow other requests to be handled.

How does one distinguish between 'blocking' and 'non-blocking'? Is it 
either/or, or is it some arbitrary timeout - if a handler returns within 
that time it is non-blocking, but if it exceeds it it is blocking?

In my environment, most requests involve a database lookup. I endeavour to 
ensure that a response is returned quickly (however one defines quickly) but 
I cannot guarantee it if the database server is under stress. Is this a good 
candidate for async, or not?

Thanks for any insights.

Frank Millman

#69952

From	Marko Rauhamaa <marko@pacujo.net>
Date	2014-04-09 16:55 +0300
Message-ID	<87lhveobeq.fsf@elektro.pacujo.net>
In reply to	#69945

"Frank Millman" <frank@chagford.com>:

> I understand that, if one uses threading, each thread *can* block
> without affecting other threads, whereas if one uses the async
> approach, a request handler must *not* block, otherwise it will hold
> up the entire process and not allow other requests to be handled.

Yes.

> How does one distinguish between 'blocking' and 'non-blocking'? Is it 
> either/or, or is it some arbitrary timeout - if a handler returns within 
> that time it is non-blocking, but if it exceeds it it is blocking?

Old-school I/O primitives are blocking by default. Nonblocking I/O is
enabled with the setblocking() method.
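
A small sketch of what setblocking(False) changes, using a local socket
pair so that it needs no network. The helper names try_recv and demo are
invented for the example; the socket and select calls are standard
library.

```python
import select
import socket


def try_recv(sock):
    """Return the bytes available now, or None if the read would have to wait."""
    try:
        return sock.recv(1024)
    except BlockingIOError:
        return None


def demo():
    a, b = socket.socketpair()
    try:
        b.setblocking(False)             # recv() on b no longer waits for data
        before = try_recv(b)             # nothing has been sent yet
        a.sendall(b"ping")
        select.select([b], [], [], 1.0)  # wait (up to 1 s) until b is readable
        after = try_recv(b)
        return before, after
    finally:
        a.close()
        b.close()
```

In blocking mode the first recv() would simply sit there until data
arrived; in nonblocking mode it raises BlockingIOError at once and the
caller decides what to do in the meantime.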

In the new asyncio package, I/O is nonblocking by default (I'm sure, but
didn't verify).

> In my environment, most requests involve a database lookup. I
> endeavour to ensure that a response is returned quickly (however one
> defines quickly) but I cannot guarantee it if the database server is
> under stress. Is this a good candidate for async, or not?

Database libraries are notoriously bad for nonblocking I/O. It's nothing
fundamental; it's only that the library writers couldn't appreciate the
worth of async communication. For that, asyncio provides special
support:

   <URL: https://docs.python.org/3.4/library/asyncio-dev.html#handle-blocking-functions-correctly>
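
The mechanism that page describes is handing the blocking call to an
executor so the event loop itself never waits. A hedged sketch in current
Python syntax (async/await and asyncio.run, which are newer than the 3.4
release linked above); slow_lookup is a hypothetical stand-in for a
blocking database call, not any real driver's API.

```python
import asyncio
import time


def slow_lookup(emp_id):
    # Hypothetical blocking call; sleep() stands in for the database round trip.
    time.sleep(0.1)
    return "employee-%d" % emp_id


async def handle_request(emp_id):
    loop = asyncio.get_running_loop()
    # None selects the loop's default thread pool executor.
    return await loop.run_in_executor(None, slow_lookup, emp_id)


async def main():
    # The three lookups overlap in worker threads while the loop stays free.
    return await asyncio.gather(*(handle_request(i) for i in (1, 2, 3)))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Note that this is still threads underneath; the event loop merely keeps
them out of the request-handling code.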



Marko

#69957

From	"Frank Millman" <frank@chagford.com>
Date	2014-04-09 16:46 +0200
Message-ID	<mailman.9080.1397054818.18130.python-list@python.org>
In reply to	#69952

"Marko Rauhamaa" <marko@pacujo.net> wrote in message 
news:87lhveobeq.fsf@elektro.pacujo.net...
> "Frank Millman" <frank@chagford.com>:
>
>> I understand that, if one uses threading, each thread *can* block
>> without affecting other threads, whereas if one uses the async
>> approach, a request handler must *not* block, otherwise it will hold
>> up the entire process and not allow other requests to be handled.
>
> Yes.
>
>> How does one distinguish between 'blocking' and 'non-blocking'? Is it
>> either/or, or is it some arbitrary timeout - if a handler returns within
>> that time it is non-blocking, but if it exceeds it it is blocking?
>
> Old-school I/O primitives are blocking by default. Nonblocking I/O is
> enabled with the setblocking() method.
>
> In the new asyncio package, I/O is nonblocking by default (I'm sure, but
> didn't verify).
>
>> In my environment, most requests involve a database lookup. I
>> endeavour to ensure that a response is returned quickly (however one
>> defines quickly) but I cannot guarantee it if the database server is
>> under stress. Is this a good candidate for async, or not?
>
> Database libraries are notoriously bad for nonblocking I/O. It's nothing
> fundamental; it's only that the library writers couldn't appreciate the
> worth of async communication. For that, asyncio provides special
> support:
>
>   <URL: https://docs.python.org/3.4/library/asyncio-dev.html#handle-blocking-functions-correctly>
>

Thanks for the reply, Marko.

I did have a look at the link you provided, but at the moment my brain is 
'blocking' on this, so I will need to read it a few times to get back into 
'async' mode.

As I asked Chris, (so you don't have to respond if he already has), I am 
finding difficulty in understanding the benefit of going async in my case. 
If most requests require a blocking handler, it seems that I might as well 
stick with each request being handled by a thread, independent of all other 
threads.

Frank

#69968

From	Marko Rauhamaa <marko@pacujo.net>
Date	2014-04-09 20:31 +0300
Message-ID	<87k3ayl893.fsf@elektro.pacujo.net>
In reply to	#69957

"Frank Millman" <frank@chagford.com>:

> I am finding difficulty in understanding the benefit of going async in
> my case. If most requests require a blocking handler, it seems that I
> might as well stick with each request being handled by a thread,
> independent of all other threads.

When the underlying facilities only provide blocking access, you are
forced to use threads (or processes).

One area where asynchronous programming was always the method of choice
is graphical user interfaces. The GUI of an application must always be
responsive to the user and must be prepared to handle any of numerous
stimuli.

Network protocol layers are also usually implemented asynchronously. The
protocol standards read like asynchronous programs so the translation
into executable programs is most natural in the asynchronous style.
Here, too, the networking entities must be ready for different stimuli
in any state, so threads are usually not a good fit.

Kernel programming makes use of threads and processes. However, the
asynchronous style is there in a big way in the form of interrupt
handlers, hooks and system calls.

Really, the threading model is only good for a relatively small subset
of programming objectives, and over the lifetime of the solution, you
will often come to realize threading wasn't that good a fit after all.
Namely, in any given state, you will have to be prepared to handle more
than one stimulus. Also, over time you will learn to dread the race
conditions that are endemic in thread programming. Those are the kinds
of problems that make you check out the current job postings. Only
there's no escape: in your next job, they are going to make you find and
fix the race conditions in your predecessor's code.


Marko

#69971

From	Chris Angelico <rosuav@gmail.com>
Date	2014-04-10 03:52 +1000
Message-ID	<mailman.9088.1397065933.18130.python-list@python.org>
In reply to	#69968

On Thu, Apr 10, 2014 at 3:31 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Really, the threading model is only good for a relatively small subset
> of programming objectives, and over the lifetime of the solution, you
> will often come to realize threading wasn't that good a fit after all.
> Namely, in any given state, you will have to be prepared to handle more
> than one stimulus. Also, over time you will learn to dread the race
> conditions that are endemic in thread programming. Those are the kinds
> of problems that make you check out the current job postings. Only
> there's no escape: in your next job, they are going to make you find and
> fix the race conditions in your predecessor's code.

People with a fear of threaded programming almost certainly never grew
up on OS/2. :) I learned about GUI programming thus: Write your
synchronous message handler to guarantee that it will return in an
absolute maximum of 0.1s, preferably a lot less. If you have any sort
of heavy processing to do, spin off a thread. It was simply the normal
way to do things. Normal handling was done on Thread 0, and two
sequential events would be processed sequentially on that thread (so
if your handler for the Enter keypress message clears out an entry
field, the next key pressed is guaranteed to happen on an empty
field), and everything else, it's considered normal to spawn threads
and let them run to completion.
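
The pattern described above, reduced to a sketch with no real GUI: a
queue stands in for the message queue, the event names "click", "key"
and "done" are invented, and the worker reports back by posting one more
event rather than touching shared state.

```python
import queue
import threading
import time


def demo():
    events = queue.Queue()   # stands in for the GUI message queue
    log = []

    def heavy_work(n):
        time.sleep(0.05)               # pretend this is a slow computation
        events.put(("done", n * n))    # report back as just another event

    def handle(event):
        kind, payload = event
        if kind == "click":
            # The handler must return quickly, so the slow part gets a thread.
            threading.Thread(target=heavy_work, args=(payload,)).start()
            log.append("click: spawned worker")
        elif kind == "key":
            log.append("key: " + payload)
        elif kind == "done":
            log.append("done: %d" % payload)

    events.put(("click", 7))
    events.put(("key", "x"))
    for _ in range(3):                 # click, key, then the worker's result
        handle(events.get())
    return log
```

Events are still handled one at a time and in order on the main thread;
only the long-running work leaves it.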

ChrisA

#70036

From	Mark H Harris <harrismh777@gmail.com>
Date	2014-04-10 08:29 -0500
Message-ID	<mailman.9134.1397136592.18130.python-list@python.org>
In reply to	#69971

On 4/9/14 12:52 PM, Chris Angelico wrote:

> People with a fear of threaded programming almost certainly never grew
> up on OS/2. :) I learned about GUI programming thus: Write your
> synchronous message handler to guarantee that it will return in an
> absolute maximum of 0.1s, preferably a lot less. If you have any sort
> of heavy processing to do, spin off a thread.

heh  very true.

Any non-trivial OS/2 GUI app required threads.  We had a template at our 
shop that we gave to noobs for copy-n-tweak.  It had not only the basics 
for getting the canvas on the screen with a tool bar and a button, but 
also the minimal code required to set up the thread to handle the button 
event (it was a database lookup in our case).

#69973

From	Sturla Molden <sturla.molden@gmail.com>
Date	2014-04-09 19:20 +0000
Message-ID	<mailman.9090.1397071239.18130.python-list@python.org>
In reply to	#69968

Chris Angelico <rosuav@gmail.com> wrote:

> People with a fear of threaded programming almost certainly never grew
> up on OS/2. :) I learned about GUI programming thus: Write your
> synchronous message handler to guarantee that it will return in an
> absolute maximum of 0.1s, preferably a lot less. If you have any sort
> of heavy processing to do, spin off a thread. It was simply the normal
> way to do things. 

That is still the best way to do it, IMHO. 

As I recall, on BeOS the operating system would even spawn a new thread to
handle each GUI event. Pervasive multithreading is great for creating
responsive user interfaces and running multimedia.

Sturla

#69953

From	Chris Angelico <rosuav@gmail.com>
Date	2014-04-09 23:47 +1000
Message-ID	<mailman.9077.1397051723.18130.python-list@python.org>
In reply to	#69832

On Wed, Apr 9, 2014 at 11:23 PM, Frank Millman <frank@chagford.com> wrote:
> Can I ask a newbie question here?

You certainly can!

> I understand that, if one uses threading, each thread *can* block without
> affecting other threads, whereas if one uses the async approach, a request
> handler must *not* block, otherwise it will hold up the entire process and
> not allow other requests to be handled.

That would be correct.

> How does one distinguish between 'blocking' and 'non-blocking'? Is it
> either/or, or is it some arbitrary timeout - if a handler returns within
> that time it is non-blocking, but if it exceeds it it is blocking?

No; a blocking request is one that waits until it has a response, and
a non-blocking request is one that goes off and does something, and
then comes back to you when it's done. When you turn on the kettle,
you can either stay there and watch until it's ready to make your
coffee (or, in my case, hot chocolate), or you can go away and come
back when it whistles at you to say that it's boiling. A third option,
polling, is when you put a pot of water on the stove, turn it on, and
then come back periodically to see if it's boiling yet. As the old
saying tells us, blocking I/O is a bad idea with pots of water,
because it'll never return.
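
The three kettle strategies map onto three ways of consuming a
concurrent.futures.Future. This is a sketch only: boil and three_ways are
made-up names, and a thread pool plays the part of the stove.

```python
import concurrent.futures
import time


def boil():
    time.sleep(0.1)      # the water takes a while
    return "boiling"


def three_ways():
    notes = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as pool:
        # 1. Blocking: stand there and watch the kettle.
        notes.append("blocking: " + pool.submit(boil).result())

        # 2. Callback: walk away; the kettle whistles when it is ready.
        kettle = pool.submit(boil)
        kettle.add_done_callback(
            lambda f: notes.append("callback: " + f.result()))

        # 3. Polling: come back now and then to look at the pot.
        pot = pool.submit(boil)
        while not pot.done():
            time.sleep(0.02)   # free to do something else between checks
        notes.append("polling: " + pot.result())
    return notes
```

The callback and the polling loop finish at about the same moment, so the
order of the last two notes is not guaranteed.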

> In my environment, most requests involve a database lookup. I endeavour to
> ensure that a response is returned quickly (however one defines quickly) but
> I cannot guarantee it if the database server is under stress. Is this a good
> candidate for async, or not?

No, that's a bad idea, because you have blocking I/O. If you have
multiple threads, it's fine, because the thread that's waiting for the
database will be blocked, and other threads can run (you may need to
ensure that you have separate database connections for your separate
threads); but in an asynchronous system, you want to be able to go and
do something else while you're waiting. Something like this:

# `db` is a hypothetical database handle; query() blocks until the result arrives.
def blocking_database_query(id):
    print("Finding out who employee #%d is..."%id)
    res = db.query("select name from emp where id=%d"%id)
    print("Employee #%d is %s."%(id,res[0].name))

# asyncquery() is likewise hypothetical: it returns at once and calls nextstep later.
def nonblocking_query(id):
    print("Finding out who employee #%d is..."%id)
    def nextstep(res):
        print("Employee #%d is %s."%(id,res[0].name))
    db.asyncquery(nextstep, "select name from emp where id=%d"%id)

This is a common way to do asynchronous I/O. Instead of saying "Do
this and give me a result", you say "Do this, and when you have a
result, call this function". Then as soon as you've done that, you
return (to some main loop, probably). It's usually a bit more
complicated than this (eg you might need multiple callbacks or
additional arguments in case it times out or otherwise fails - there's
no way to throw an exception into a callback, the way the blocking
query could throw something instead of returning), but that's the
basic concept.

You may be able to get away with doing blocking operations in
asynchronous mode, if you're confident they'll be fairly fast. But you
have to be really REALLY confident, and it does create assumptions
that can be wrong. For instance, the above code assumes that print()
won't block. You might think "Duh, how can printing to the screen
block?!?", but if your program's output is being piped into something
else, it most certainly can :) If that were writing to a remote
socket, though, it'd be better to perform those operations
asynchronously too: attempt to write to the socket; once that's done,
start the database query; when the database result arrives, write the
response to the socket; when that's done, go back to some main loop.

ChrisA

#69955

From	Roy Smith <roy@panix.com>
Date	2014-04-09 10:44 -0400
Message-ID	<roy-128667.10443809042014@news.panix.com>
In reply to	#69953

In article <mailman.9077.1397051723.18130.python-list@python.org>,
 Chris Angelico <rosuav@gmail.com> wrote:

> For instance, the above code assumes that print() won't block. You 
> might think "Duh, how can printing to the screen block?!?", but if 
> your program's output is being piped into something else, it most 
> certainly can :)

<nostalgia-mode>

Heh.  One day, a long time ago, I had to investigate why our Vax-11/750 
had crashed.  I vaguely recollect being called at home on a weekend and 
having to schlepp into work, but my memory may just be running in 
auto-story-embellishment mode.

Turns out, the machine hadn't really crashed, but it was hung.  The 
console was an LA-120 (http://tinyurl.com/mljyegv), on which was printed 
various log messages from time to time.  It had run out of paper, which 
was detected by the little out-of-paper microswitch, so it stopped 
printing.  When its input buffer got full, it sent a control-s, which 
tells the thing on the other end of the serial line to stop sending.  
Which of course caused the kernel tty driver output buffer to fill, 
which eventually caused all the print statements by the various system 
loggers to block, and eventually the whole mess ground to a halt.

I put in new paper.  The printer proceeded to spit out several hours 
worth of buffered log messages and the system picked up where it left 
off.

</nostalgia-mode>

At Songza, we've been using gevent to do asynchronous I/O.  It's an 
interesting concept.  Basically, you write your application code as you 
normally would, using blocking I/O calls.  Gevent then monkey-patches 
the heck out of the Python library to intercept every call that could 
possibly block and splice it into an asynchronous task scheduler 
framework.  The amazing thing is that it works.  It let us reduce the 
number of gunicorn worker processes we use by a factor of 6, and handle 
the same amount of traffic.

Of course, monkey-patching is black magic, and sometimes we get hit by 
really bizarre and difficult to track down bugs.  But, to go back to my 
"technology evolution" scale, gevent is somewhere between avant-garde 
and what the kewl kids are using (version 1.0 was released in December), 
so that shouldn't be too surprising.

#69954

From	"Frank Millman" <frank@chagford.com>
Date	2014-04-09 16:30 +0200
Message-ID	<mailman.9078.1397053837.18130.python-list@python.org>
In reply to	#69832

"Chris Angelico" <rosuav@gmail.com> wrote in message 
news:CAPTjJmqwhb8O8vq84mMTv+-Rkc3Ff1AQDXe5cs8Y5gY02kHyNg@mail.gmail.com...
> On Wed, Apr 9, 2014 at 11:23 PM, Frank Millman <frank@chagford.com> wrote:
>
>> How does one distinguish between 'blocking' and 'non-blocking'? Is it
>> either/or, or is it some arbitrary timeout - if a handler returns within
>> that time it is non-blocking, but if it exceeds it it is blocking?
>
> No; a blocking request is one that waits until it has a response, and
> a non-blocking request is one that goes off and does something, and
> then comes back to you when it's done.

Does reading from disk count as blocking? Strictly speaking I would have 
thought 'yes'.

In other words, non-blocking implies that everything required to pass off 
the request to a handler and be ready to deal with the next one must already 
be in memory, and it must not rely on communicating with any outside 
resource at all. Is this correct?

>
> def blocking_database_query(id):
>    print("Finding out who employee #%d is..."%id)
>    res = db.query("select name from emp where id=12345")
>    print("Employee #%d is %s."%(id,res[0].name))
>
> def nonblocking_query(id):
>    print("Finding out who employee #%d is..."%id)
>    def nextstep(res):
>        print("Employee #%d is %s."%(id,res[0].name))
>    db.asyncquery(nextstep, "select name from emp where id=12345")
>

In this example, what is 'db.asyncquery'?

If you mean that you have a separate thread to handle database queries, and 
you use a queue or other message-passing mechanism to hand it the query and 
get the result, then I understand it. If not, can you explain in more 
detail?
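
The thread-plus-queue arrangement described above can be sketched
roughly as follows. This is only an illustration under stated
assumptions: FAKE_DB, db_worker, lookup and demo are made-up names, and
a dict stands in for a real database connection.

```python
import queue
import threading

# Hypothetical stand-in for a real database: employee id -> name.
FAKE_DB = {12345: "Alice"}

def db_worker(requests):
    """Runs in its own thread and answers (emp_id, reply_queue) requests."""
    while True:
        item = requests.get()
        if item is None:                  # sentinel: shut the worker down
            break
        emp_id, reply = item
        # A real worker would block on the database here.
        reply.put(FAKE_DB.get(emp_id))

def lookup(requests, emp_id):
    """Hand a query to the worker thread and wait for its answer."""
    reply = queue.Queue(maxsize=1)
    requests.put((emp_id, reply))
    return reply.get(timeout=5)

def demo():
    requests = queue.Queue()
    worker = threading.Thread(target=db_worker, args=(requests,), daemon=True)
    worker.start()
    name = lookup(requests, 12345)
    requests.put(None)                    # ask the worker to exit
    worker.join()
    return name
```

Note that lookup() still blocks its caller while it waits on the reply
queue, so this only helps if the caller is itself one of several
request-handling threads.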

If I have understood correctly, then is there any benefit at all in my going 
async? I might as well just stick with threads for the request handling as 
well as the database handling.

Frank


#69958

From	Roy Smith <roy@panix.com>
Date	2014-04-09 10:52 -0400
Message-ID	<roy-070ADD.10524309042014@news.panix.com>
In reply to	#69954

In article <mailman.9078.1397053837.18130.python-list@python.org>,
 "Frank Millman" <frank@chagford.com> wrote:

> "Chris Angelico" <rosuav@gmail.com> wrote in message 
> news:CAPTjJmqwhb8O8vq84mMTv+-Rkc3Ff1AQDXe5cs8Y5gY02kHyNg@mail.gmail.com...
> > On Wed, Apr 9, 2014 at 11:23 PM, Frank Millman <frank@chagford.com> wrote:
> >
> >> How does one distinguish between 'blocking' and 'non-blocking'? Is it
> >> either/or, or is it some arbitrary timeout - if a handler returns within
> >> that time it is non-blocking, but if it exceeds it, it is blocking?
> >
> > No; a blocking request is one that waits until it has a response, and
> > a non-blocking request is one that goes off and does something, and
> > then comes back to you when it's done.
> 
> Does reading from disk count as blocking? Strictly speaking I would have 
> thought 'yes'.

Of course it does.  But, the bigger question is, "What counts as reading 
from disk?"

In the old days, all Unix system calls were divided up into two groups, 
based on whether they were "fast" or "slow".  Processes executing a 
"fast" system call would block, and could not be interrupted; i.e. any 
signals delivered to them would be queued up and delivered after the 
system call had finished.  Typically, that meant, if you typed 
Control-C, your process wouldn't get killed until the system call it was 
executing completed.

Disk reads were considered fast.  You type Control-C, the read takes 
another few ms to finish, then your process gets whacked.  You never 
even notice the delay.  But, a read on a tty was slow.  It would sit 
there forever, until you hit return.  Slow system calls got interrupted.

Then, along came the network, and everything got confusing.  If I open a 
file that lives on an NFS server, and read from it, am I doing a disk 
read?  Should I be able to interrupt an NFS operation?


#69987

From	Gregory Ewing <greg.ewing@canterbury.ac.nz>
Date	2014-04-10 11:19 +1200
Message-ID	<bqm2siFjumcU1@mid.individual.net>
In reply to	#69958

Roy Smith wrote:
> In the old days, all Unix system calls were divided up into two groups, 
> based on whether they were "fast" or "slow".  Processes executing a 
> "fast" system call would block, and could not be interrupted;

That doesn't really have anything to do with blocking vs.
non-blocking, though. The system call blocks in both cases;
the only difference is whether the kernel bothers to allow
for aborting the blocked operation part way through. The
calling process doesn't see any difference.

-- 
Greg


#69966

From	Marko Rauhamaa <marko@pacujo.net>
Date	2014-04-09 19:48 +0300
Message-ID	<87ob0ala99.fsf@elektro.pacujo.net>
In reply to	#69954

"Frank Millman" <frank@chagford.com>:

> Does reading from disk count as blocking? Strictly speaking I would
> have thought 'yes'.

You have touched upon a very interesting topic there.

I can't speak for Windows, but Linux doesn't really let you control the
blocking of disk access. In fact, Linux doesn't really let you control
the blocking of RAM access, either. RAM and the disk are considered two
sides of the same coin. It is very difficult to guarantee that a process
has all of its memory "cached" in RAM short of not having a physical
disk mounted.

There is what's known as AIO, and it's supposedly supported in Linux,
but I have never seen anybody use it, and I don't know how well tested
it is. Also, I don't know how well it integrates with regular asyncio.

On the other hand, you don't know if disk access ever blocks. Quite
often you will find that the whole active part of the file system is
kept in RAM by Linux.

My rule of thumb, two processes per CPU, should alleviate disk blocking
issues. When that isn't enough, you may be forced to write a small file
server that translates disk access to socket/pipe access.

Sockets and pipes are different beasts because, unlike files, they are
allocated memory buffers in the kernel memory. Also, they are accessed
strictly sequentially while files can be sought back and forth.

I do think it would be a nice addition to Linux if they added a, say,
AF_FILE socket type that provided a buffered socket abstraction for
physical files.


Marko


#69956

From	Chris Angelico <rosuav@gmail.com>
Date	2014-04-10 00:44 +1000
Message-ID	<mailman.9079.1397054695.18130.python-list@python.org>
In reply to	#69832

On Thu, Apr 10, 2014 at 12:30 AM, Frank Millman <frank@chagford.com> wrote:
>
> "Chris Angelico" <rosuav@gmail.com> wrote in message
> news:CAPTjJmqwhb8O8vq84mMTv+-Rkc3Ff1AQDXe5cs8Y5gY02kHyNg@mail.gmail.com...
>> On Wed, Apr 9, 2014 at 11:23 PM, Frank Millman <frank@chagford.com> wrote:
>>
>>> How does one distinguish between 'blocking' and 'non-blocking'? Is it
>>> either/or, or is it some arbitrary timeout - if a handler returns within
>>> that time it is non-blocking, but if it exceeds it, it is blocking?
>>
>> No; a blocking request is one that waits until it has a response, and
>> a non-blocking request is one that goes off and does something, and
>> then comes back to you when it's done.
>
> Does reading from disk count as blocking? Strictly speaking I would have
> thought 'yes'.

What the operation actually *is* is quite immaterial. Reading from the
disk can be done as blocking or non-blocking; if you simply open a
file and call its read() method, that'll block by default (the open()
call can block, too, but you asked about reading). When you ask for
something to be done and expect a return value with the result, that
implies a blocking operation. Even simple work like floating-point
addition can be blocking or nonblocking; I learned some of the basics
of the 8087 coprocessor and how to do assembly-language floating
point, and it was, by default, a non-blocking operation - you say "Go
do this", and then you say "FWAIT" to block until the coprocessor is
done. But normally you want to be able to write this:

a = 12.3
b = 45.6
c = a + b
print("The sum is:",c)

rather than this:

add_async(a, b, lambda c: print("The sum is:",c))

> In other words, non-blocking implies that everything required to pass off
> the request to a handler and be ready to deal with the next one must already
> be in memory, and it must not rely on communicating with any outside
> resource at all. Is this correct?

Nope. It simply requires that communicating with an outside resource
be done asynchronously. You fire off the request and either check for
its completion periodically (polling) or get a notification when it's
done (callback, usually).
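
The polling variant might look something like this minimal sketch. It
assumes a non-blocking socket as the "outside resource";
socket.socketpair() stands in for a real peer, and poll_once and demo
are names invented for the illustration.

```python
import socket
import time

def poll_once(sock):
    """Return whatever data is ready, or None if nothing has arrived yet."""
    try:
        return sock.recv(1024)
    except BlockingIOError:
        return None                   # not done yet; the caller is free to move on

def demo():
    ours, theirs = socket.socketpair()
    ours.setblocking(False)           # recv() now raises instead of waiting

    first = poll_once(ours)           # the peer has sent nothing so far
    theirs.sendall(b"done")           # the "outside resource" finishes its work

    second = None
    for _ in range(100):              # check periodically, bounded for safety
        second = poll_once(ours)
        if second is not None:
            break
        time.sleep(0.01)              # a real program would do other work here

    ours.close()
    theirs.close()
    return first, second
```

The first poll comes back empty-handed straight away rather than
waiting, which is the whole point: the caller decides what to do in the
meantime.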

>> def nonblocking_query(id):
>>    print("Finding out who employee #%d is..."%id)
>>    def nextstep(res):
>>        print("Employee #%d is %s."%(id,res[0].name))
>>    db.asyncquery(nextstep, "select name from emp where id=12345")
>>
>
> In this example, what is 'db.asyncquery'?
>
> If you mean that you have a separate thread to handle database queries, and
> you use a queue or other message-passing mechanism to hand it the query and
> get the result, then I understand it. If not, can you explain in more
> detail?

It's an imaginary function that would send a request to the database,
and then call some callback function when the result arrives. If the
database connection is via a TCP/IP socket, that could be handled by
writing the query to the socket, and then when data comes back from
the socket, looking up the callback and calling it. There's no
additional thread here.
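
As a rough single-threaded sketch of that idea, using the standard
selectors module: FakeAsyncDB, its one-line-per-reply wire format, and
demo are all assumptions made up for this example, not any real
database driver's API. The server's side of the conversation is played
by hand on the other end of a socketpair.

```python
import selectors
import socket

class FakeAsyncDB:
    """Hypothetical client: one socket, one queued callback per query."""

    def __init__(self, sock, selector):
        self.sock = sock
        self.buffer = b""
        self.pending = []             # callbacks, in the order queries were sent
        sock.setblocking(False)
        selector.register(sock, selectors.EVENT_READ, self.on_readable)

    def asyncquery(self, callback, sql):
        """Send the query and return immediately; callback fires later."""
        self.pending.append(callback)
        # Assumes the short query fits in the socket's send buffer.
        self.sock.sendall(sql.encode() + b"\n")

    def on_readable(self):
        chunk = self.sock.recv(1024)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        self.buffer += chunk
        # Assumed protocol: each reply is one newline-terminated line.
        while b"\n" in self.buffer and self.pending:
            line, self.buffer = self.buffer.split(b"\n", 1)
            self.pending.pop(0)(line.decode())

def demo():
    sel = selectors.DefaultSelector()
    client_end, server_end = socket.socketpair()
    db = FakeAsyncDB(client_end, sel)

    results = []
    db.asyncquery(results.append, "select name from emp where id=12345")

    # Play the server by hand: read the query, send a one-line reply.
    query = server_end.recv(1024)
    server_end.sendall(b"Alice\n")

    # The event loop: wait for readable sockets and run their handlers.
    while db.pending:
        events = sel.select(timeout=1)
        if not events:
            break                     # nothing arrived; give up in this demo
        for key, _mask in events:
            key.data()                # calls db.on_readable

    sel.close()
    client_end.close()
    server_end.close()
    return query, results
```

Everything runs in one thread; the callback is simply looked up and
called when the selector reports that the reply has arrived.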

> If I have understood correctly, then is there any benefit at all in my going
> async? I might as well just stick with threads for the request handling as
> well as the database handling.

Threads are a convenient way to handle things, but they're an
alternative. You generally do one or the other. Occasionally you'll
use a bit of both, maybe because your database handler can't work
asynchronously, but ideally you shouldn't need to mix and match like
that.

I'm oversimplifying horribly, here, but hopefully helpfully :)

ChrisA


#69959

From	Sturla Molden <sturla.molden@gmail.com>
Date	2014-04-09 15:29 +0000
Message-ID	<mailman.9081.1397057389.18130.python-list@python.org>
In reply to	#69832

"Frank Millman" <frank@chagford.com> wrote:

> If I have understood correctly, then is there any benefit at all in my going 
> async? I might as well just stick with threads for the request handling as 
> well as the database handling.

1. There is a scalability issue with threads, particularly if you don't
have enough RAM or use a 32 bit system.

2. Earlier Linux kernels did not perform well if they had to schedule
10,000 threads. 

3. It is nice to be able to abort a read or write that hangs (for whatever
reason). Killing a thread with pthread_cancel or TerminateThread is not
recommended.
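
To illustrate point 3, here is a minimal sketch of abandoning a read
that hangs, using asyncio (new in Python 3.4) rather than killing a
thread. It assumes a reasonably recent Python 3 for asyncio.run();
recv_or_give_up and demo are names invented for the example, and a
socketpair stands in for a peer that has gone quiet.

```python
import asyncio
import socket

async def recv_or_give_up(sock, nbytes, timeout):
    """Read from sock, but cancel the pending read after timeout seconds."""
    loop = asyncio.get_running_loop()
    try:
        return await asyncio.wait_for(loop.sock_recv(sock, nbytes), timeout)
    except asyncio.TimeoutError:
        return None                   # the read was cancelled, nothing was killed

async def demo():
    ours, theirs = socket.socketpair()
    ours.setblocking(False)           # required by loop.sock_recv()

    # The peer stays silent, so this read would otherwise wait indefinitely.
    hung = await recv_or_give_up(ours, 1024, timeout=0.1)

    # The same socket is still usable after the cancelled read.
    theirs.sendall(b"hello")
    ok = await recv_or_give_up(ours, 1024, timeout=1.0)

    ours.close()
    theirs.close()
    return hung, ok
```

Running asyncio.run(demo()) exercises both the timed-out read and the
later successful one on the same socket.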

Sturla

