Benefits of asyncio

Started by: Aseem Bansal <asmbansal2@gmail.com>
First post: 2014-06-02 10:40 -0700
Last post: 2014-06-02 21:54 -0700
Articles: 12 on this page of 32, 10 participants



Contents

  Benefits of asyncio Aseem Bansal <asmbansal2@gmail.com> - 2014-06-02 10:40 -0700
    Re: Benefits of asyncio Ian Kelly <ian.g.kelly@gmail.com> - 2014-06-02 12:37 -0600
    Re: Benefits of asyncio Terry Reedy <tjreedy@udel.edu> - 2014-06-02 16:07 -0400
      Re: Benefits of asyncio Roy Smith <roy@panix.com> - 2014-06-02 16:19 -0400
      Re: Benefits of asyncio Marko Rauhamaa <marko@pacujo.net> - 2014-06-02 23:28 +0300
        Re: Benefits of asyncio Paul Rubin <no.email@nospam.invalid> - 2014-06-02 13:45 -0700
          Re: Benefits of asyncio Chris Angelico <rosuav@gmail.com> - 2014-06-03 07:49 +1000
          Re: Benefits of asyncio Terry Reedy <tjreedy@udel.edu> - 2014-06-02 21:51 -0400
          Re: Benefits of asyncio Marko Rauhamaa <marko@pacujo.net> - 2014-06-03 09:36 +0300
            Re: Benefits of asyncio Chris Angelico <rosuav@gmail.com> - 2014-06-03 18:47 +1000
              Re: Benefits of asyncio Marko Rauhamaa <marko@pacujo.net> - 2014-06-03 12:10 +0300
                Re: Benefits of asyncio Chris Angelico <rosuav@gmail.com> - 2014-06-03 19:30 +1000
                  Re: Benefits of asyncio Marko Rauhamaa <marko@pacujo.net> - 2014-06-03 13:08 +0300
                    Re: Benefits of asyncio Chris Angelico <rosuav@gmail.com> - 2014-06-03 20:23 +1000
                      Re: Benefits of asyncio Marko Rauhamaa <marko@pacujo.net> - 2014-06-03 14:12 +0300
                        Re: Benefits of asyncio Paul Rubin <no.email@nospam.invalid> - 2014-06-04 00:52 -0700
                Re: Benefits of asyncio Burak Arslan <burak.arslan@arskom.com.tr> - 2014-06-03 14:05 +0300
                Re: Benefits of asyncio Chris Angelico <rosuav@gmail.com> - 2014-06-03 21:57 +1000
                Re: Benefits of asyncio Burak Arslan <burak.arslan@arskom.com.tr> - 2014-06-04 08:10 +0300
                Re: Benefits of asyncio Chris Angelico <rosuav@gmail.com> - 2014-06-04 17:30 +1000
                Re: Benefits of asyncio Paul Rubin <no.email@nospam.invalid> - 2014-06-04 00:48 -0700
            Re: Benefits of asyncio "Frank Millman" <frank@chagford.com> - 2014-06-03 13:09 +0200
            Re: Benefits of asyncio Chris Angelico <rosuav@gmail.com> - 2014-06-03 22:01 +1000
              Re: Benefits of asyncio Marko Rauhamaa <marko@pacujo.net> - 2014-06-03 16:05 +0300
                Re: Benefits of asyncio Chris Angelico <rosuav@gmail.com> - 2014-06-03 23:31 +1000
                  Re: Benefits of asyncio Marko Rauhamaa <marko@pacujo.net> - 2014-06-03 16:42 +0300
                    Re: Benefits of asyncio Chris Angelico <rosuav@gmail.com> - 2014-06-03 23:49 +1000
                      Re: Benefits of asyncio Marko Rauhamaa <marko@pacujo.net> - 2014-06-03 19:18 +0300
                Re: Benefits of asyncio Roy Smith <roy@panix.com> - 2014-06-03 11:40 -0400
          Re: Benefits of asyncio Paul Sokolovsky <pmiscml@gmail.com> - 2014-06-03 11:31 +0300
    Re: Benefits of asyncio Burak Arslan <burak.arslan@arskom.com.tr> - 2014-06-03 00:07 +0300
    Re: Benefits of asyncio Aseem Bansal <asmbansal2@gmail.com> - 2014-06-02 21:54 -0700



#72612

From: Paul Rubin <no.email@nospam.invalid>
Date: 2014-06-04 00:48 -0700
Message-ID: <7xwqcxw1xa.fsf@ruckus.brouhaha.com>
In reply to: #72494
Marko Rauhamaa <marko@pacujo.net> writes:
> That's a good reason to avoid threads. Once you realize you would have
> been better off with an async approach, you'll have to start over.

That just hasn't happened to me yet, at least in terms of program
organization.  Python threads get too slow once there are too many
tasks, but that's just an implementation artifact of Python threads, and
goes along with Python being slow in general.  Write threaded code in
GHC or Erlang or maybe Go, and you can handle millions of connections,
as the threads are in userspace and are very lightweight and fast.

http://haskell.cs.yale.edu/wp-content/uploads/2013/08/hask035-voellmy.pdf



#72505

From: "Frank Millman" <frank@chagford.com>
Date: 2014-06-03 13:09 +0200
Message-ID: <mailman.10615.1401793780.18130.python-list@python.org>
In reply to: #72481
"Chris Angelico" <rosuav@gmail.com> wrote in message 
news:CAPTjJmqWkEStvrsrg30qjO+4TtLqfK9Q4GaByGovEw8NsdXzPg@mail.gmail.com...
>
> This works as long as your database is reasonably fast and close
> (common case for a lot of web servers: DB runs on same computer as web
> and application and etc servers). It's nice and simple, lets you use a
> single database connection (although you should probably wrap it in a
> try/finally to ensure that you roll back on any exception), and won't
> materially damage throughput as long as you don't run into problems.
> For a database driven web site, most of the I/O time will be waiting
> for clients, not waiting for your database.
>
> Getting rid of those blocking database calls means having multiple
> concurrent transactions on the database. Whether you go async or
> threaded, this is going to happen. Unless your database lets you run
> multiple simultaneous transactions on a single connection (I don't
> think the Python DB API allows that, and I can't think of any DB
> backends that support it, off hand), that means that every single
> concurrency point needs its own database connection. With threads, you
> could have a pool of (say) a dozen or so, one per thread, with each
> one working synchronously; with asyncio, you'd have to have one for
> every single incoming client request, or else faff around with
> semaphores and resource pools and such manually. The throughput you
> gain by making those asynchronous with callbacks is quite probably
> destroyed by the throughput you lose in having too many simultaneous
> connections to the database. I can't prove that, obviously, but I do
> know that PostgreSQL requires up-front RAM allocation based on the
> max_connections setting, and trying to support 5000 connections
> started to get kinda stupid.
>

I am following this with interest.  I still struggle to get my head around 
the concepts, but it is slowly becoming clearer.

Focusing on PostgreSQL, couldn't you do the following?

PostgreSQL runs client/server (they call it front-end/back-end) over TCP/IP.

psycopg2 appears to have some support for async communication with the 
back-end. I only skimmed the docs, and it looks a bit complicated, but it is 
there.

So why not keep a 'connection pool', and for every potentially blocking 
request, grab a connection, set up a callback or a 'yield from' to wait for 
the response, and unblock.

Provided the requests return quickly, I would have thought a hundred 
database connections could support thousands of users.
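
A minimal sketch of the pooling idea above, written with the later async/await syntax rather than 'yield from'. It does not use psycopg2 at all: the ConnectionPool class, the "conn-N" strings and the asyncio.sleep() call are hypothetical stand-ins for real connections and a real database round trip. The only point is that a request waits for a free connection without blocking the event loop.

```python
import asyncio


class ConnectionPool:
    """Hypothetical pool that hands out a fixed number of fake connections."""

    def __init__(self, size):
        self._free = asyncio.Queue()
        for n in range(size):
            self._free.put_nowait("conn-%d" % n)

    async def run_query(self, sql):
        conn = await self._free.get()      # waits here if every connection is busy
        try:
            await asyncio.sleep(0.01)      # stand-in for the database round trip
            return (conn, sql)
        finally:
            self._free.put_nowait(conn)    # always hand the connection back


async def serve_requests(n_requests, pool_size):
    # Create the pool inside the running loop, then issue all requests at once.
    pool = ConnectionPool(pool_size)
    queries = [pool.run_query("SELECT %d" % i) for i in range(n_requests)]
    return await asyncio.gather(*queries)
```

Calling asyncio.run(serve_requests(50, 5)) serves fifty simulated requests while never holding more than five connections at a time.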

Frank Millman




#72510

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-06-03 22:01 +1000
Message-ID: <mailman.10618.1401796902.18130.python-list@python.org>
In reply to: #72481
On Tue, Jun 3, 2014 at 9:09 PM, Frank Millman <frank@chagford.com> wrote:
> So why not keep a 'connection pool', and for every potentially blocking
> request, grab a connection, set up a callback or a 'yield from' to wait for
> the response, and unblock.

Compare against a thread pool, where each thread simply does blocking
requests. With threads, you use blocking database, blocking logging,
blocking I/O, etc, and everything *just happens*; with a connection
pool, like this, you need to do every single one of them separately.
(How many of you have ever written non-blocking error logging? Or have
you written a non-blocking system with blocking calls to write to your
error log? The latter is far FAR more common, but all files, even
stdout/stderr, can block.) I don't see how Marko's assertion that
event-driven asynchronous programming is a breath of fresh air
compared with multithreading holds up. The only way multithreading can possibly
be more complicated is that preemption can occur anywhere - and that's
exactly one of the big flaws in async work, if you don't do your job
properly.
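
For comparison, a rough sketch of the thread-pool style described above. The handle_request function and the time.sleep() call are hypothetical stand-ins for a handler and a blocking database query. Every call in the handler may block, and only the worker thread running it waits.

```python
import logging
import time
from concurrent.futures import ThreadPoolExecutor

log = logging.getLogger("worker")


def handle_request(request_id):
    # Plain blocking calls throughout; nothing here knows about the pool.
    log.debug("handling %s", request_id)   # ordinary blocking logging
    time.sleep(0.01)                       # stand-in for a blocking DB query
    return request_id * 2


def serve(request_ids, workers=12):
    # A dozen or so workers, each handling one request synchronously.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request, request_ids))
```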

ChrisA



#72513

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2014-06-03 16:05 +0300
Message-ID: <87ha42uos2.fsf@elektro.pacujo.net>
In reply to: #72510
Chris Angelico <rosuav@gmail.com>:

> I don't see how Marko's assertion that event-driven asynchronous
> programming is a breath of fresh air compared with multithreading holds
> up. The only way multithreading can possibly be more complicated is that
> preemption can occur anywhere - and that's exactly one of the big
> flaws in async work, if you don't do your job properly.

Say you have a thread blocking on socket.accept(). Another thread
receives the management command to shut the server down. How do you tell
the socket.accept() thread to abort and exit?

The classic hack is to close the socket, which causes the blocking thread
to raise an exception.

The blocking thread might also be stuck in socket.recv(). Closing the
socket from the outside is dangerous now because of race conditions. So
you will have to carefully add locking to block an unwanted closing
of the connection.

But what do you do if the blocking thread is stuck in the middle of a
black box API that doesn't expose a file you could close?

So you hope all blocking APIs have a timeout parameter. You then replace
all blocking calls with polling loops. You make the timeout value long
enough not to burden the CPU too much and short enough not to annoy the
human operator too much.
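
The polling pattern described above might look roughly like this. As an assumption for illustration, queue.Queue.get() stands in for "a blocking API that happens to have a timeout parameter"; the worker function and its arguments are hypothetical names. The same shape applies to a socket with settimeout().

```python
import queue
import threading


def worker(jobs, stop_event, done, poll_interval=0.1):
    # Wake up every poll_interval seconds just to re-check the stop flag.
    while not stop_event.is_set():
        try:
            job = jobs.get(timeout=poll_interval)   # blocks at most poll_interval
        except queue.Empty:
            continue                                 # nothing to do; poll again
        done.append(job)
```

A shorter poll_interval makes shutdown more responsive at the cost of more wakeups, which is exactly the trade-off complained about here.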

Well, ok,

   os.kill(os.getpid(), signal.SIGKILL)

is always an option.


Marko



#72515

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-06-03 23:31 +1000
Message-ID: <mailman.10622.1401802267.18130.python-list@python.org>
In reply to: #72513
On Tue, Jun 3, 2014 at 11:05 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Chris Angelico <rosuav@gmail.com>:
>
>> I don't see how Marko's assertion that event-driven asynchronous
>> programming is a breath of fresh air compared with multithreading holds
>> up. The only way multithreading can possibly be more complicated is that
>> preemption can occur anywhere - and that's exactly one of the big
>> flaws in async work, if you don't do your job properly.
>
> Say you have a thread blocking on socket.accept(). Another thread
> receives the management command to shut the server down. How do you tell
> the socket.accept() thread to abort and exit?
>
> The classic hack is to close the socket, which causes the blocking thread
> to raise an exception.

How's that a hack? If you're shutting the server down, you need to
close the listening socket anyway, because otherwise clients will
think they can get in. Yes, I would close the socket. Or just send the
process a signal like SIGINT, which will break the accept() call. (I
don't know about Python specifically here; the underlying Linux API
works this way, returning EINTR, as does OS/2 which is where I
learned. Generally I'd have the accept() loop as the process's main
loop, and spin off threads for clients.) In fact, the most likely case
I'd have would be that the receipt of that signal *is* the management
command to shut the server down; it might be SIGINT or SIGQUIT or
SIGTERM, or maybe some other signal, but one of the easiest ways to
notify a Unix process to shut down is to send it a signal. Coping with
broken proprietary platforms is an exercise for the reader, but I know
it's possible to terminate a console-based socket accept loop in
Windows with Ctrl-C, so there ought to be an equivalent API method.

> The blocking thread might also be stuck in socket.recv(). Closing the
> socket from the outside is dangerous now because of race conditions. So
> you will have to carefully add locking to block an unwanted closing
> of the connection.

Maybe. More likely, the same situation applies - you're shutting down,
so you need to close the socket anyway. I've generally found -
although this may not work on all platforms - that it's perfectly safe
for one thread to be blocked in recv() while another thread calls
send() on the same socket, and then closes that socket.  On the other
hand, if your notion of shutting down does NOT include closing the
socket, then you have to deal with things some other way - maybe
handing the connection on to some other process, or something - so a
generic approach isn't appropriate here.

> But what do you do if the blocking thread is stuck in the middle of a
> black box API that doesn't expose a file you could close?
>
> So you hope all blocking APIs have a timeout parameter.

No! I never put timeouts on blocking calls to solve shutdown problems.
That is a hack, and a bad one. Timeouts should be used only when the
timeout is itself significant (eg if you decide that your socket
connections should time out if there's no activity in X minutes, so
you put a timeout on socket reads of X*60000 and close the connection
cleanly if it times out).
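
A small sketch of a timeout that is significant in its own right, namely an idle timeout on reads. The function name and the tiny default value are assumptions for illustration. Note that Python's socket.settimeout() takes seconds, so the X*60000 above would correspond to an API that counts in milliseconds.

```python
import socket

IDLE_TIMEOUT_SECONDS = 0.05   # illustration only; a real server might allow minutes


def read_until_idle(conn, idle_timeout=IDLE_TIMEOUT_SECONDS):
    """Read until the peer closes or goes quiet, then close cleanly."""
    conn.settimeout(idle_timeout)
    chunks = []
    try:
        while True:
            data = conn.recv(4096)
            if not data:            # peer closed the connection
                break
            chunks.append(data)
    except socket.timeout:
        pass                        # no activity within the window: treat as idle
    finally:
        conn.close()
    return b"".join(chunks)
```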

> Well, ok,
>
>    os.kill(os.getpid(), signal.SIGKILL)
>
> is always an option.

Yeah, that's one way. More likely, you'll find that a lesser signal
also aborts the blocking API call. And even if you have to hope for an
alternate API to solve this problem, how is that different from hoping
that all blocking APIs have corresponding non-blocking APIs? I
reiterate the example I've used a few times already:

https://docs.python.org/3.4/library/logging.html#logging.Logger.debug

What happens if that blocks? How can you make sure it won't?
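
One possible partial answer, sketched under assumptions: the standard library's logging.handlers.QueueHandler and QueueListener (available since Python 3.2) let the calling thread do only an in-memory enqueue while a background thread performs the potentially blocking write. The make_queued_logger helper is a hypothetical name, and this only moves the blocking elsewhere; if the real handler stalls, the unbounded queue simply grows.

```python
import logging
import logging.handlers
import queue


def make_queued_logger(name, target_handler):
    log_queue = queue.Queue()            # unbounded, so enqueueing does not wait
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False             # keep records away from root handlers
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    # The listener's background thread calls target_handler, which may block.
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    return logger, listener
```

Call listener.stop() at shutdown so that queued records are flushed to the real handler.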

ChrisA



#72516

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2014-06-03 16:42 +0300
Message-ID: <87d2equn23.fsf@elektro.pacujo.net>
In reply to: #72515
Chris Angelico <rosuav@gmail.com>:

> https://docs.python.org/3.4/library/logging.html#logging.Logger.debug
>
> What happens if that blocks? How can you make sure it won't?

I haven't used that class. Generally, Python standard libraries are not
readily usable for nonblocking I/O.

For myself, I have solved that particular problem my own way.


Marko



#72519

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-06-03 23:49 +1000
Message-ID: <mailman.10624.1401803369.18130.python-list@python.org>
In reply to: #72516
On Tue, Jun 3, 2014 at 11:42 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Chris Angelico <rosuav@gmail.com>:
>
>> https://docs.python.org/3.4/library/logging.html#logging.Logger.debug
>>
>> What happens if that blocks? How can you make sure it won't?
>
> I haven't used that class. Generally, Python standard libraries are not
> readily usable for nonblocking I/O.
>
> For myself, I have solved that particular problem my own way.

Okay. How do you do basic logging? (Also - rolling your own logging
facilities, instead of using what Python provides, is the simpler
solution? This does not aid your case.)

ChrisA



#72531

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2014-06-03 19:18 +0300
Message-ID: <874n027yqs.fsf@elektro.pacujo.net>
In reply to: #72519
Chris Angelico <rosuav@gmail.com>:

> Okay. How do you do basic logging? (Also - rolling your own logging
> facilities, instead of using what Python provides, is the simpler
> solution? This does not aid your case.)

Asyncio is fresh out of the oven. It's going to take years before the
standard libraries catch up with it.


Marko



#72528

From: Roy Smith <roy@panix.com>
Date: 2014-06-03 11:40 -0400
Message-ID: <roy-7A517D.11401503062014@news.panix.com>
In reply to: #72513
In article <87ha42uos2.fsf@elektro.pacujo.net>,
 Marko Rauhamaa <marko@pacujo.net> wrote:

> Chris Angelico <rosuav@gmail.com>:
> 
> > I don't see how Marko's assertion that event-driven asynchronous
> > programming is a breath of fresh air compared with multithreading holds
> > up. The only way multithreading can possibly be more complicated is that
> > preemption can occur anywhere - and that's exactly one of the big
> > flaws in async work, if you don't do your job properly.
> 
> Say you have a thread blocking on socket.accept(). Another thread
> receives the management command to shut the server down. How do you tell
> the socket.accept() thread to abort and exit?

You do the accept() in a daemon thread?
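
A minimal sketch of that suggestion. The run_in_daemon_thread helper and its parameters are hypothetical names. A daemon thread does not keep the interpreter alive, so the process can exit while the thread is still stuck in its blocking call, though the thread gets no chance to clean up.

```python
import threading


def run_in_daemon_thread(blocking_call, on_result):
    """Call blocking_call() forever in a daemon thread, passing results on."""

    def loop():
        while True:
            on_result(blocking_call())   # may block indefinitely

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

For a server this could be used as run_in_daemon_thread(listener.accept, handle_client), where listener is a listening socket and handle_client is an assumed callback that receives the (connection, address) pair.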



#72490

From: Paul Sokolovsky <pmiscml@gmail.com>
Date: 2014-06-03 11:31 +0300
Message-ID: <mailman.10604.1401784310.18130.python-list@python.org>
In reply to: #72442
Hello,

On Mon, 02 Jun 2014 21:51:35 -0400
Terry Reedy <tjreedy@udel.edu> wrote:

> To all the great responders. If anyone thinks the async intro is 
> inadequate and has a paragraph to contribute, open a tracker issue.

Not sure about intro (where's that?), but docs
(https://docs.python.org/3/library/asyncio.html) are pretty confusing
and bugs are reported, with no response:
http://bugs.python.org/issue21365

> 
> -- 
> Terry Jan Reedy

-- 
Best regards,
 Paul                          mailto:pmiscml@gmail.com



#72443

From: Burak Arslan <burak.arslan@arskom.com.tr>
Date: 2014-06-03 00:07 +0300
Message-ID: <mailman.10574.1401743222.18130.python-list@python.org>
In reply to: #72431
On 06/02/14 20:40, Aseem Bansal wrote:
> I read in these groups that asyncio is a great addition to Python 3. I have looked around and saw the related PEP which is quite big BTW but couldn't find a simple explanation for why this is such a great addition. Any simple example where it can be used? 

AFAIR, Guido's US Pycon 2013 keynote is where he introduced asyncio (or
tulip, which is the "internal codename" of the project) so you can watch
it to get a good idea about his motivations.

So what is Asyncio? In a nutshell, Asyncio is Python's standard event
loop. Next time you're going to build an async framework, you should
build on it instead of reimplementing it using system calls available on
the platform(s) that you're targeting, like select() or epoll().

It's great because 1) creating an abstraction over the Windows and Unix ways
of event-driven programming is not trivial, and 2) it makes use of "yield
from", a feature available in Python 3.3 and up. Using "yield from" is
arguably the cleanest way of doing async as it makes async code look
like blocking code which seemingly makes it easier to reason about the
flow of your logic.
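
To illustrate, here is a small sketch of a coroutine that reads like blocking code. It uses the async/await syntax that later Python versions adopted in place of "yield from" coroutines, and the fetch function with its asyncio.sleep() call is a hypothetical stand-in for real network I/O.

```python
import asyncio


async def fetch(name, delay):
    await asyncio.sleep(delay)     # stand-in for a network read; yields to the loop
    return "%s done" % name


async def handle_request():
    # Reads top to bottom like blocking code, but every await is a point
    # where the event loop may run other tasks.
    first = await fetch("lookup", 0.01)
    second = await fetch("render", 0.01)
    return [first, second]
```

Running asyncio.run(handle_request()) drives the coroutine to completion on a fresh event loop.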

The idea is very similar to twisted's @inlineCallbacks, if you're
familiar with it.

If doing lower level programming with Python is not your cup of tea, you
don't really care about asyncio. You should instead wait until your
favourite async framework switches to it.



> It can be used to have a queue of tasks? Like threads? Maybe light weight threads? Those were my thoughts but the library reference clearly stated that this is single-threaded. So there should be some waiting time in between the tasks. Then what is good?

You can use it to implement a queue of (mostly i/o bound) tasks. You are
not supposed to use it in cases where you'd use threads or lightweight
threads (or green threads, as in gevent or stackless).

Gevent is also technically async but gevent and asyncio differ in a very
subtle way: Gevent does cooperative multitasking whereas Asyncio (and
twisted) does event driven programming.

The difference is that with asyncio, you know exactly when you're
switching to another task -- only when you use "yield from". This is not
always explicit with gevent, as a function that you're calling can
switch to another task without letting your code know.

So with gevent, you still need to take the usual precautions of
multithreaded programming. Gevent actually simulates threads by doing
task switching (or thread scheduling, if you will) in userspace. Here's
its secret sauce:
https://github.com/python-greenlet/greenlet/tree/master/platform

There's some scary platform-dependent assembly code in there! I'd think
twice before seriously relying on it.

Event driven programming does not need such dark magic. You also don't
need to be so careful in a purely event-driven setting as you know that
at any point in time only one task context can be active. It's like you
have an implicit, zero-overhead LOCK ALL for all nonlocal state.
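
A sketch of what that implicit lock means in practice, using the later async/await syntax and hypothetical names. Several tasks update shared state with a read-modify-write and no lock, and no update is lost because a task switch can only happen at the await.

```python
import asyncio

counter = {"value": 0}   # shared, nonlocal state with no lock around it


async def bump(times):
    for _ in range(times):
        current = counter["value"]      # read-modify-write with no await inside,
        counter["value"] = current + 1  # so no other task can run in between
        await asyncio.sleep(0)          # the only point where a switch can happen


async def count_with_tasks(tasks=10, times=100):
    counter["value"] = 0
    await asyncio.gather(*(bump(times) for _ in range(tasks)))
    return counter["value"]
```

The same read-modify-write run from ordinary threads would need a lock to guarantee the final count.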

Of course the tradeoff is that you should carefully avoid blocking the
event loop. It's not that hard once you get the hang of it :)
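
One common way to keep a blocking call off the event loop is to hand it to a thread pool with loop.run_in_executor(). The sketch below assumes a hypothetical slow_blocking_call and uses a ticker coroutine only to show that the loop keeps running meanwhile.

```python
import asyncio
import time


def slow_blocking_call():
    time.sleep(0.05)     # stand-in for a blocking library call
    return "blocking result"


async def ticker(ticks):
    # Keeps making progress only if the event loop is not blocked.
    for _ in range(5):
        await asyncio.sleep(0.005)
        ticks.append(time.monotonic())


async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    # None selects the loop's default thread pool executor.
    result, _ = await asyncio.gather(
        loop.run_in_executor(None, slow_blocking_call),
        ticker(ticks),
    )
    return result, len(ticks)
```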

So, I hope this answers your questions. Let me know if I missed something.

Best regards,
Burak



#72465

From: Aseem Bansal <asmbansal2@gmail.com>
Date: 2014-06-02 21:54 -0700
Message-ID: <2658db0f-ac98-406a-9cd2-fb36a3095afa@googlegroups.com>
In reply to: #72431
I haven't worked with asynchronous tasks or concurrent programming so far. Used VB2010 and have used some jQuery in a recent project but nothing low level.

As per the explanation it seems that programming using asyncio would require identifying blocks of code which are not dependent on the IO. Wouldn't that get confusing?

@Terry
When I said that there would be waiting time I meant as compared to sequential programming. I was not comparing to threads.

What I gathered from all the explanations is that asyncio is a way of doing event-driven programming, much as threads are a way of doing concurrent programming. It would have been great if the library reference had mentioned the term event-driven programming. That would have been a great starting point for understanding it.


