| Started by | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| First post | 2014-04-07 13:05 +1000 |
| Last post | 2014-04-08 15:19 +0000 |
| Articles | 20 on this page of 105 — 22 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: threading Ben Finney <ben+python@benfinney.id.au> - 2014-04-07 13:05 +1000
Re: threading Roy Smith <roy@panix.com> - 2014-04-06 23:48 -0400
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-07 13:56 +1000
Re: threading Roy Smith <roy@panix.com> - 2014-04-07 08:26 -0400
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-07 22:34 +1000
Re: threading Roy Smith <roy@panix.com> - 2014-04-07 09:22 -0400
Re: threading Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-04-07 14:41 +0100
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-07 16:49 +0300
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-08 00:27 +1000
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-07 17:51 +0300
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-08 01:12 +1000
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-08 00:24 +1000
Re: threading Rick Johnson <rantingrickjohnson@gmail.com> - 2014-04-08 18:09 -0700
Re: threading "Neil D. Cerutti" <neilc@norwich.edu> - 2014-04-09 09:50 -0400
Re: threading Rick Johnson <rantingrickjohnson@gmail.com> - 2014-04-09 08:51 -0700
Re: threading MRAB <python@mrabarnett.plus.com> - 2014-04-09 18:47 +0100
Re: threading Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-10 11:35 +1200
Re: threading Roy Smith <roy@panix.com> - 2014-04-09 19:53 -0400
Re: threading Andrew Berg <robotsondrugs@gmail.com> - 2014-04-09 19:02 -0500
Re: threading Steven D'Aprano <steve@pearwood.info> - 2014-04-10 02:43 +0000
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 13:08 +1000
Re: threading Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-04-10 09:23 +0100
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 19:11 +1000
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 04:00 +1000
Re: threading Steven D'Aprano <steve@pearwood.info> - 2014-04-10 03:44 +0000
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 13:54 +1000
Re: threading Ben Finney <ben+python@benfinney.id.au> - 2014-04-07 15:22 +1000
Re: threading Ethan Furman <ethan@stoneleaf.us> - 2014-04-08 11:09 -0700
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 21:41 +0200
Re: threading Grant Edwards <invalid@invalid.invalid> - 2014-04-08 20:30 +0000
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-09 00:32 +0200
Re: threading Rustom Mody <rustompmody@gmail.com> - 2014-04-08 19:17 -0700
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-07 08:10 +0300
Re: threading Paul Rubin <no.email@nospam.invalid> - 2014-04-06 22:39 -0700
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-07 08:46 +0300
Re: threading Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2014-04-07 19:47 -0400
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-08 08:19 +0300
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 10:47 +0000
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-08 15:10 +0300
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 16:37 +0000
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-08 20:17 +0300
Re: threading Roy Smith <roy@panix.com> - 2014-04-08 09:19 -0400
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 15:44 +0000
Re: threading Paul Rubin <no.email@nospam.invalid> - 2014-04-08 09:38 -0700
Re: threading Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-04-09 14:42 +0100
Re: threading "Frank Millman" <frank@chagford.com> - 2014-04-09 15:23 +0200
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-09 16:55 +0300
Re: threading "Frank Millman" <frank@chagford.com> - 2014-04-09 16:46 +0200
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-09 20:31 +0300
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 03:52 +1000
Re: threading Mark H Harris <harrismh777@gmail.com> - 2014-04-10 08:29 -0500
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-09 19:20 +0000
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-09 23:47 +1000
Re: threading Roy Smith <roy@panix.com> - 2014-04-09 10:44 -0400
Re: threading "Frank Millman" <frank@chagford.com> - 2014-04-09 16:30 +0200
Re: threading Roy Smith <roy@panix.com> - 2014-04-09 10:52 -0400
Re: threading Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-04-10 11:19 +1200
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-09 19:48 +0300
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 00:44 +1000
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-09 15:29 +0000
Re: threading Terry Reedy <tjreedy@udel.edu> - 2014-04-09 12:14 -0400
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 02:25 +1000
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-09 16:32 +0000
Re: threading Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2014-04-09 19:44 -0400
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 11:05 +1000
Re: threading "Frank Millman" <frank@chagford.com> - 2014-04-10 11:17 +0200
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 19:40 +1000
Re: threading "Frank Millman" <frank@chagford.com> - 2014-04-10 13:10 +0200
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-10 14:43 +0300
Re: threading Roy Smith <roy@panix.com> - 2014-04-10 08:56 -0400
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-10 15:24 +0000
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-10 19:20 +0300
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-11 01:32 +1000
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-10 19:25 +0300
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-11 03:08 +1000
Re: threading Rustom Mody <rustompmody@gmail.com> - 2014-04-10 11:14 -0700
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-10 22:44 +0300
Re: threading Rustom Mody <rustompmody@gmail.com> - 2014-04-10 13:21 -0700
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-10 23:44 +0300
Re: threading Rustom Mody <rustompmody@gmail.com> - 2014-04-10 22:15 -0700
Re: threading Rustom Mody <rustompmody@gmail.com> - 2014-04-10 23:50 -0700
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-11 18:36 +0300
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-12 01:53 +1000
Re: threading Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-04-11 16:58 +0100
Re: threading Rustom Mody <rustompmody@gmail.com> - 2014-04-11 11:54 -0700
Re: threading Marko Rauhamaa <marko@pacujo.net> - 2014-04-11 22:27 +0300
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-11 01:51 +0200
Re: threading Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-04-11 05:35 +0000
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-11 09:26 +0000
Re: threading Roy Smith <roy@panix.com> - 2014-04-11 08:36 -0400
Re: threading Grant Edwards <invalid@invalid.invalid> - 2014-04-11 16:18 +0000
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-11 02:21 +0200
Re: threading Terry Reedy <tjreedy@udel.edu> - 2014-04-10 20:23 -0400
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-10 21:19 +1000
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 02:06 +0000
Re: threading alister <alister.nospam.ware@ntlworld.com> - 2014-04-08 11:07 +0000
Re: threading Roy Smith <roy@panix.com> - 2014-04-08 09:13 -0400
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-08 23:23 +1000
Re: threading alister <alister.nospam.ware@ntlworld.com> - 2014-04-08 14:15 +0000
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 16:06 +0000
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 15:40 +0000
Re: threading Paul Rubin <no.email@nospam.invalid> - 2014-04-08 09:46 -0700
Re: threading Chris Angelico <rosuav@gmail.com> - 2014-04-09 02:46 +1000
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 17:17 +0000
Re: threading Sturla Molden <sturla.molden@gmail.com> - 2014-04-08 15:19 +0000
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2014-04-09 12:14 -0400 |
| Message-ID | <mailman.9082.1397060085.18130.python-list@python.org> |
| In reply to | #69832 |
On 4/9/2014 10:30 AM, Frank Millman wrote:

> In other words, non-blocking implies that everything required to pass off
> the request to a handler and be ready to deal with the next one must already
> be in memory, and it must not rely on communicating with any outside
> resource at all. Is this correct?

Chris said no, I would have said yes, but I think we understand the above
differently.

The important point is that there are two goals. The first is to avoid
having the cpu sitting idle when there is work to be done. Switching
processes, switching threads within a process, and switching tasks within
a thread are all aimed at this. (So are compiler code rearrangements that
aim to keep various parts of a cpu, such as integer and float arithmetic
units, active simultaneously.)

The second, usually, is to keep the system responsive by not letting any
particular work unit hog the cpu. But note that if work units are made too
small, cpu time is wasted in excessive switching overhead.

A handler should neither waste nor monopolize cpu time. If input data is
needed for a long computation, the handler should store the data where it
needs to be for the computation but leave the actual computation to a
background or idle task that runs when there is nothing else to do.

--
Terry Jan Reedy
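The "store the data, compute later" advice above can be sketched roughly as follows. This is only an illustration under assumptions: the names `pending`, `results`, `handle_request` and `idle_task` are hypothetical, and the sum of squares stands in for whatever long computation the real program needs.

```python
from collections import deque

pending = deque()   # inputs stored by handlers, waiting for computation
results = []        # outputs produced by the idle task


def handle_request(data):
    """Fast handler: store the input where the computation will find it."""
    pending.append(data)


def idle_task():
    """Run one unit of deferred work; return False when nothing is left."""
    if not pending:
        return False
    numbers = pending.popleft()
    results.append(sum(x * x for x in numbers))  # stand-in for a long job
    return True
```

A main loop would call `idle_task()` only when no I/O events are waiting, so the handlers themselves always return promptly.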
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-04-10 02:25 +1000 |
| Message-ID | <mailman.9084.1397060752.18130.python-list@python.org> |
| In reply to | #69832 |
On Thu, Apr 10, 2014 at 2:14 AM, Terry Reedy <tjreedy@udel.edu> wrote:
> On 4/9/2014 10:30 AM, Frank Millman wrote:
>
>> In other words, non-blocking implies that everything required to pass off
>> the request to a handler and be ready to deal with the next one must
>> already be in memory, and it must not rely on communicating with any
>> outside resource at all. Is this correct?
>
> Chris said no, I would have said yes, but I think we understand the above
> differently.

I said no because I see asynchronous I/O as a perfectly viable structure
for a program, which means that a non-blocking handler is allowed to
communicate with outside resources. Conversely, if you see "a
non-blocking handler" as meaning the one small piece that runs
uninterruptibly, then you might say that yes, it must not rely on any
outside resource.

Of course, it depends on where you're looking. Memory is itself an
outside resource that can potentially take a long time to give a
result - just look at what happens when you dip into swap space, and
RAM accesses become disk accesses. But generally, you go asynchronous
in order to increase your throughput; and if you're churning through
your page file, well, that's going to kill throughput whichever way
you look at it. It's generally safe enough to pretend that RAM can be
accessed in-line, and worry about the slowdowns elsewhere.

ChrisA
| From | Sturla Molden <sturla.molden@gmail.com> |
|---|---|
| Date | 2014-04-09 16:32 +0000 |
| Message-ID | <mailman.9085.1397061189.18130.python-list@python.org> |
| In reply to | #69832 |
Sturla Molden <sturla.molden@gmail.com> wrote:

> 3. It is nice to be able to abort a read or write that hangs (for whatever
> reason). Killing a thread with pthread_cancel or TerminateThread is not
> recommended.

While "graceful timeout" is easy to do on Unix, using fcntl.fcntl or
signal.alarm, on Windows it requires overlapped I/O. This means the normal
Python file objects cannot be used for this purpose on Windows.

Sturla
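For the Unix side of this, a minimal sketch of a read that gives up after a timeout might look like the following. It is an assumption-laden illustration, not the approach Sturla necessarily had in mind: `read_with_timeout` is a hypothetical helper, it uses `fcntl.fcntl` to switch the descriptor to non-blocking mode, and it relies on `select.select` (not named in the post) to do the waiting. The `fcntl` module does not exist on Windows.

```python
import fcntl
import os
import select


def read_with_timeout(fd, nbytes, seconds):
    """Read up to nbytes from fd; return None if nothing arrives in time."""
    # Put the descriptor into non-blocking mode so os.read() can never hang.
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

    # Wait for readability, but only for the requested time.
    ready, _, _ = select.select([fd], [], [], seconds)
    if not ready:
        return None
    try:
        return os.read(fd, nbytes)
    except BlockingIOError:
        return None   # readiness was only a hint; nothing to read after all
```

For example, with `r, w = os.pipe()`, calling `read_with_timeout(r, 16, 0.05)` before anything has been written returns `None` instead of blocking.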
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2014-04-09 19:44 -0400 |
| Message-ID | <mailman.9099.1397087086.18130.python-list@python.org> |
| In reply to | #69832 |
On Wed, 9 Apr 2014 23:47:04 +1000, Chris Angelico <rosuav@gmail.com>
declaimed the following:
>won't block. You might think "Duh, how can printing to the screen
>block?!?", but if your program's output is being piped into something
>else, it most certainly can :) If that were writing to a remote
Heck, even if it isn't blocking per se, it may still be enough to slow
down the whole system (over the past year I've had to characterize
throughput on some systems -- and the console logging of "exceptions"*
slowed the overall data rate significantly)
* The unit providers' idea of "exception to be logged" just happened to be
something our intended application considered normal; hence our test data
produced LOTS of "exceptions".
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-04-10 11:05 +1000 |
| Message-ID | <mailman.9103.1397091924.18130.python-list@python.org> |
| In reply to | #69832 |
On Thu, Apr 10, 2014 at 9:44 AM, Dennis Lee Bieber
<wlfraed@ix.netcom.com> wrote:
> On Wed, 9 Apr 2014 23:47:04 +1000, Chris Angelico <rosuav@gmail.com>
> declaimed the following:
>
>>won't block. You might think "Duh, how can printing to the screen
>>block?!?", but if your program's output is being piped into something
>>else, it most certainly can :) If that were writing to a remote
>
> Heck, even if it isn't blocking per se, it may still be enough to slow
> down the whole system (over the past year I've had to characterize
> throughput on some systems -- and the console logging of "exceptions"*
> slowed the overall data rate significantly)
Oh yes, definitely. Console output can be *slow*. Back in my earliest
programming days, I'd often have a program that iterated over sub-jobs
from either 0 or 1 up to some unknown top (so I can't show a
percent-done), and the obvious thing to do is (rewritten in Python):
i = 0
while stuff_to_do():
    i += 1
    print(i, end="\r")
    do_more_stuff()
print(i)
Hmm, that's really slow. I know! I'll speed this up by printing out
only once a second. That should be way faster, right? Let's see.
import time

i = time_printed = 0
while stuff_to_do():
    i += 1
    if int(time.time()) != time_printed:
        print(i, end="\r")
        time_printed = int(time.time())
    do_more_stuff()
print(i)
And that made it... waaaay slower. Turns out clock querying (at least
on those systems) is pretty slow too, even more so than console
output. Of course, what we ended up settling on was something like
this, which *does* make sense:
i = 0
while stuff_to_do():
    i += 1
    if i & 255 == 0: print(i, end="\r")
    do_more_stuff()
print(i)
replacing 255 with any number one less than a power of two, so it'd
print out every however-many-th (in this case, every 256th), using
bitwise operations rather than division.
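As a small self-contained illustration of the mask trick (the function name `progress_points` and its parameters are hypothetical, chosen only for this sketch), the following collects the counter values at which a progress line would be printed:

```python
def progress_points(total, mask=255):
    """Return the counter values at which progress would be printed.

    mask must be one less than a power of two (1, 3, 7, ..., 255, ...).
    """
    printed = []
    for i in range(1, total + 1):
        if i & mask == 0:   # true once every (mask + 1) iterations
            printed.append(i)
    return printed
```

In Python, `&` binds more tightly than `==`, so `i & mask == 0` means `(i & mask) == 0`. In C the same expression parses as `i & (mask == 0)`, so the parentheses would be required there.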
But yeah, console output isn't something you want when you're going
for maximum throughput. Heh.
ChrisA
| From | "Frank Millman" <frank@chagford.com> |
|---|---|
| Date | 2014-04-10 11:17 +0200 |
| Message-ID | <mailman.9129.1397121485.18130.python-list@python.org> |
| In reply to | #69832 |
"Chris Angelico" <rosuav@gmail.com> wrote in message
news:CAPTjJmq2xx_WG2ymCC0NNqisDO=DNnJhneGPiD3DE+xeiy5hjg@mail.gmail.com...
> On Thu, Apr 10, 2014 at 12:30 AM, Frank Millman <frank@chagford.com>
> wrote:
>>
>>>
>>>> How does one distinguish between 'blocking' and 'non-blocking'? Is it
>>>> either/or, or is it some arbitrary timeout - if a handler returns
>>>> within that time it is non-blocking, but if it exceeds it it is
>>>> blocking?
>>>
>>> No; a blocking request is one that waits until it has a response, and
>>> a non-blocking request is one that goes off and does something, and
>>> then comes back to you when it's done.
>>
Thanks for that clarification - I think I've got it now.
>>> def nonblocking_query(id):
>>>     print("Finding out who employee #%d is..."%id)
>>>     def nextstep(res):
>>>         print("Employee #%d is %s."%(id,res[0].name))
>>>     db.asyncquery(nextstep, "select name from emp where id=12345")
>>>
>>
>> In this example, what is 'db.asyncquery'?
>>
>> If you mean that you have a separate thread to handle database queries,
>> and
>> you use a queue or other message-passing mechanism to hand it the query
>> and
>> get the result, then I understand it. If not, can you explain in more
>> detail.
>
> It's an imaginary function that would send a request to the database,
> and then call some callback function when the result arrives. If the
> database connection is via a TCP/IP socket, that could be handled by
> writing the query to the socket, and then when data comes back from
> the socket, looking up the callback and calling it. There's no
> additional thread here.
>
I need some time to get my head around that, but meanwhile can you resolve
this stumbling block?
The current version of my program uses HTTP. As I understand it, a client
makes a connection and submits a request. The server processes the request
and returns a result. The connection is then closed.
In this scenario, does async apply at all? There is no open connection to
'select' or 'poll'. You have to ensure that the request handler does not
block the entire process, so that the main loop is ready to accept more
connections. But passing the request to a thread for handling seems an
effective solution.
Am I missing something?
Frank
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-04-10 19:40 +1000 |
| Message-ID | <mailman.9130.1397122866.18130.python-list@python.org> |
| In reply to | #69832 |
On Thu, Apr 10, 2014 at 7:17 PM, Frank Millman <frank@chagford.com> wrote:
> The current version of my program uses HTTP. As I understand it, a client
> makes a connection and submits a request. The server processes the request
> and returns a result. The connection is then closed.
>
> In this scenario, does async apply at all? There is no open connection to
> 'select' or 'poll'. You have to ensure that the request handler does not
> block the entire process, so that the main loop is ready to accept more
> connections. But passing the request to a thread for handling seems an
> effective solution.
Let's take this to a slightly lower level. HTTP is built on top of a
TCP/IP socket. The client connects (usually on port 80), and sends a
string like this:
"""GET /foo/bar/asdf.html HTTP/1.0
Host: www.spam.org
User-Agent: Mozilla/5.0
"""
The server then sends back something like this:
"""HTTP/1.0 200 OK
Content-type: text/html

<html>
<body>
Hello, world!
</body>
</html>
"""
These are carried on a straight-forward bidirectional stream socket,
so the write and read operations (or send and recv, either way) can
potentially block. With a small request, you can kinda assume that the
write won't block, but the read most definitely will: it'll block
until the server writes something for you.
So it follows the usual model of blocking vs non-blocking. In blocking
mode, you do something like this:
data = socket.read()
and it waits until it has something to return. In non-blocking mode,
you do something like this:
def data_available(socket, data):
    # whatever

socket.set_read_callback(data_available)
An HTTP handling library can then build a non-blocking request handler
on top of that, by having data_available parse out the appropriate
information, and return if it doesn't have enough content yet. So it
follows the same model; you send off the request (and don't wait for
it), and then get notified when the result is there.
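Python's socket objects have no `set_read_callback` method, so here is a rough sketch of how the same idea could be built on the standard `selectors` module. The helpers `set_read_callback` and `run_once` and the `received` list are hypothetical names for this illustration, and a `socket.socketpair()` stands in for a real client/server connection.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
received = []


def data_available(sock, data):
    """Callback: invoked with whatever bytes have arrived on sock."""
    received.append(data)


def set_read_callback(sock, callback):
    """Arrange for callback(sock, data) to run when sock becomes readable."""
    sock.setblocking(False)
    sel.register(sock, selectors.EVENT_READ, callback)


def run_once(timeout=1.0):
    """One pass of the main loop: dispatch callbacks for ready sockets."""
    for key, _events in sel.select(timeout):
        try:
            data = key.fileobj.recv(4096)
        except BlockingIOError:
            continue   # readiness was only a hint; try again next pass
        key.data(key.fileobj, data)


# Usage: register the callback, then let the loop deliver the data.
ours, theirs = socket.socketpair()
set_read_callback(ours, data_available)
theirs.sendall(b"HTTP/1.0 200 OK\r\n\r\n")
run_once()
```

An HTTP library built this way would have `data_available` accumulate bytes and return early until a complete response has arrived.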
When you write the server, you effectively have the same principle,
with one additional feature: a listening socket becomes readable
whenever someone connects. So you can select() on that socket, just
like you can with the others, and whenever there's a new connection,
you add it to the collection and listen for requests on all of them.
It's basically the same concept; as soon as you can accept a new
connection, you do so, and then go back to the main loop.
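A minimal sketch of that server-side loop, again using the standard `selectors` module, might look like this. The function names (`accept_ready`, `request_ready`, `run_once`) are hypothetical, the reply is hard-coded, and the sketch assumes the short reply fits in a single `send()`; a real server would buffer partial writes.

```python
import selectors
import socket

sel = selectors.DefaultSelector()


def accept_ready(listener):
    """The listening socket is readable: a client is waiting to connect."""
    try:
        conn, _addr = listener.accept()
    except BlockingIOError:
        return   # nobody there after all
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, request_ready)


def request_ready(conn):
    """A client socket is readable: answer the request, then close."""
    try:
        data = conn.recv(4096)
    except BlockingIOError:
        return
    if data:
        # Assumption: this short reply is accepted in one send() call.
        conn.send(b"HTTP/1.0 200 OK\r\n\r\nHello, world!\n")
    sel.unregister(conn)
    conn.close()


def run_once(timeout=1.0):
    """One pass of the main loop over every registered socket."""
    for key, _events in sel.select(timeout):
        key.data(key.fileobj)


listener = socket.socket()
listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
listener.listen()
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ, accept_ready)
```

The listening socket and the client sockets sit in the same selector, so one loop handles new connections and pending requests alike.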
It's pretty simple when you let a lower-level library do the work for
you :) The neat thing is, you can put all of this into a single
program; I can't demo it in Python for you, but I have a Pike kernel
that I wrote for my last job, which can handle a variety of different
asynchronous operations: TCP, UDP (which just sends single packets,
normally), a GUI (in theory), timers, the lot. It has convenience
features for creating a DNS server, an HTTP server, and a stateful
line-based server (covers lots of other protocols, like SMTP). And
(though this bit would be hard to port to Python) it can update itself
without shutting down. Yes, it can take some getting your head around,
but it's well worth it.
ChrisA
| From | "Frank Millman" <frank@chagford.com> |
|---|---|
| Date | 2014-04-10 13:10 +0200 |
| Message-ID | <mailman.9131.1397128288.18130.python-list@python.org> |
| In reply to | #69832 |
"Chris Angelico" <rosuav@gmail.com> wrote in message
news:CAPTjJmoWaHPZk=DAxbfJ=9ez2aj=4yf2C8WMbRYoF5VgN6Exsw@mail.gmail.com...
> On Thu, Apr 10, 2014 at 7:17 PM, Frank Millman <frank@chagford.com> wrote:
>> The current version of my program uses HTTP. As I understand it, a client
>> makes a connection and submits a request. The server processes the
>> request
>> and returns a result. The connection is then closed.
>>
>> In this scenario, does async apply at all? There is no open connection to
>> 'select' or 'poll'. You have to ensure that the request handler does not
>> block the entire process, so that the main loop is ready to accept more
>> connections. But passing the request to a thread for handling seems an
>> effective solution.
>
[...]
Thanks, Chris - I am learning a lot!
I have skipped the first part of your reply, as it seems to refer to the
client. I am using a web browser as a client, so I don't have to worry about
programming that.
>
> When you write the server, you effectively have the same principle,
> with one additional feature: a listening socket becomes readable
> whenever someone connects. So you can select() on that socket, just
> like you can with the others, and whenever there's a new connection,
> you add it to the collection and listen for requests on all of them.
> It's basically the same concept; as soon as you can accept a new
> connection, you do so, and then go back to the main loop.
>
This is where it gets interesting. At present I am using cherrypy as a
server, and I have not checked its internals. However, in the past I have
dabbled with writing server programs like this -
while self.running:
    try:
        conn,addr = self.s.accept()
        Session(args=(self, conn)).start()
    except KeyboardInterrupt:
        self.shutdown()
In this scenario, the loop blocks on 'accept'.
You seem to be suggesting that I set the socket to 'non-blocking', use
select() to determine when a client is trying to connect, and then call
'accept' on it to create a new connection.
If so, I understand your point. The main loop changes from 'blocking' to
'non-blocking', which frees it up to perform all kinds of other tasks as
well. It is no longer just a 'web server', but becomes an 'all-purpose
server'.
Much food for thought!
Frank
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-04-10 14:43 +0300 |
| Message-ID | <87wqexmmuc.fsf@elektro.pacujo.net> |
| In reply to | #70029 |
"Frank Millman" <frank@chagford.com>:

> You seem to be suggesting that I set the socket to 'non-blocking', use
> select() to determine when a client is trying to connect, and then
> call 'accept' on it to create a new connection.

Yes.

> If so, I understand your point. The main loop changes from 'blocking'
> to 'non-blocking', which frees it up to perform all kinds of other
> tasks as well. It is no longer just a 'web server', but becomes an
> 'all-purpose server'.

The server will do whatever you make it do.

Other points:

 * When you wake up from select() (or poll(), epoll()), you should treat
   it as a hint. The I/O call (accept()) could still raise
   socket.error(EAGAIN).

 * The connections returned from accept() have to be individually
   registered with select() (poll(), epoll()).

 * When you write() into a connection, you may be able to send only part
   of the data or get EAGAIN. You need to choose a buffering strategy --
   you should not block until all data is written out. Also take into
   account how much you are prepared to buffer.

 * There are two main modes of multiplexing: level-triggered and
   edge-triggered. Only epoll() (and kqueue()) support edge-triggered
   wakeups. Edge-triggered requires more discipline from the programmer
   but frees you from having to tell the multiplexing facility if you
   are interested in readability or writability in any given situation.

   Edge-triggered wakeups are only guaranteed after you have gotten an
   EAGAIN from an operation. Make sure you keep on reading/writing until
   you get an EAGAIN. On the other hand, watch out so one connection
   doesn't hog the process because it always has active I/O to perform.

 * You should always be ready to read to prevent deadlocks.

 * Sockets can be half-closed. Your state machines should deal with the
   different combinations gracefully. For example, you might read an EOF
   from the client socket before you have pushed the response out. You
   must not close the socket before the response has finished writing.
   On the other hand, you should not treat the half-closed socket as
   readable.

 * While a single-threaded process will not have proper race conditions,
   you must watch out for preemption. IOW, you might have Object A call
   a method of Object B, which calls some other method of Object A.
   Asyncio has a task queue facility. If you write your own main loop,
   you should also implement a similar task queue. The queue can then be
   used to make such tricky function calls in a safe context.

 * Asyncio provides timers. If you write your own main loop, you should
   also implement your own timers.

   Note that modern software has to tolerate suspension (laptop lid,
   virtual machines). Time is a tricky concept when your server wakes up
   from a coma.

 * Specify explicit states. Your connection objects should have a data
   member named "state" (or similar). Make your state transitions
   explicit and obvious in the code. In fact, log them. Resist the
   temptation of deriving the state implicitly from other object
   information.

 * Most states should be guarded with a timer. Make sure to document for
   each state, which timers are running.

 * In each state, check that you handle all possible events and
   timeouts. The state/transition matrix will be quite sizable even for
   seemingly simple tasks.

Marko
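Two of the points above, buffering partial writes and keeping an explicit state, can be sketched together. This is a simplified illustration only: the `Connection` class, its method names and the two state names are hypothetical, and timers, logging and limits on buffer size are left out.

```python
import socket


class Connection:
    """Connection wrapper with an explicit state and an outgoing buffer."""

    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        self.state = "IDLE"
        self.outbuf = bytearray()

    def set_state(self, new_state):
        # Every transition goes through here, so it is easy to log them.
        self.state = new_state

    def write(self, data):
        """Queue data and push out as much as the socket accepts now."""
        self.outbuf += data
        self.set_state("WRITING")
        self.flush()

    def flush(self):
        """Call again whenever the multiplexer reports the socket writable."""
        while self.outbuf:
            try:
                sent = self.sock.send(self.outbuf)
            except BlockingIOError:   # EAGAIN: the kernel buffer is full
                return
            del self.outbuf[:sent]    # keep only the unsent remainder
        self.set_state("IDLE")
```

The caller never blocks in `write()`; whatever could not be sent stays in `outbuf` until a later `flush()`.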
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2014-04-10 08:56 -0400 |
| Message-ID | <roy-81616C.08563310042014@news.panix.com> |
| In reply to | #70031 |
In article <87wqexmmuc.fsf@elektro.pacujo.net>,
Marko Rauhamaa <marko@pacujo.net> wrote:

> * When you wake up from select() (or poll(), epoll()), you should treat
>   it as a hint. The I/O call (accept()) could still raise
>   socket.error(EAGAIN).

People often misunderstand what select() does. The common misconception
is that a select()ed descriptor has data waiting to be read. What the man
page says is, "A file descriptor is considered ready if it is possible to
perform the corresponding I/O operation (e.g., read(2)) without blocking."
Not blocking includes failing immediately.

And, once you introduce threading, things get even more complicated.
Imagine two threads, both waiting in a select() call on the same socket.
Data comes in on that socket. Both select() calls return. If both threads
then do reads on the socket, you've got a race condition. One of them
will read the data. The other will block in the read call, because the
data has already been read by the other thread!

So, yes, as Marko says, use select() as a hint, but then also do your
reads in non-blocking mode, and be prepared for them to fail, regardless
of whether select() said the descriptor was ready.

> Note that modern software has to tolerate suspension (laptop lid,
> virtual machines). Time is a tricky concept when your server wakes up
> from a coma.

Not to mention running in a virtual machine. Time is an equally tricky
concept when your hardware clock is really some other piece of software
playing smoke and mirrors.

I once worked on a time-sensitive system which was running in a VM. The
idiots who had configured the thing were running ntpd in the VM, to keep
its clock in sync. Normally, this is a good thing, but they were ALSO
using the hypervisor's clock management gizmo (vmtools?) to adjust the VM
clock. The two mechanisms were fighting with each other, which did really
weird stuff to time.

It took me forever to figure out what was going on. How does one even
observe that time is moving around randomly? I eventually ended up
writing a trivial NTP client in Python (it's only a few lines of code)
and periodically logging the difference between the local system clock
and what my NTP reference was telling me.

Of course, figuring out what was going on was the easy part. Convincing
the IT drones to fix the problem was considerably more difficult.

> * In each state, check that you handle all possible events and
>   timeouts. The state/transition matrix will be quite sizable even for
>   seemingly simple tasks.

And, those empty boxes in the state transition matrix which are blank,
because those transitions are impossible? Guess what, they happen, and
you better have a plan for when they do :-)
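The "select() is only a hint" advice can be sketched like this. `try_read` is a hypothetical helper written for this illustration: it waits with `select.select`, then reads in non-blocking mode and treats a failed read as "nothing available" rather than as an error.

```python
import select
import socket


def try_read(sock, nbytes=4096, timeout=0.1):
    """Return bytes read, b"" on EOF, or None if nothing could be read."""
    sock.setblocking(False)
    ready, _, _ = select.select([sock], [], [], timeout)
    if not ready:
        return None
    try:
        return sock.recv(nbytes)
    except BlockingIOError:
        # select() reported "ready", but the data is gone, for example
        # because another reader consumed it first.
        return None
```

Because the socket is non-blocking, the losing thread in the two-reader race described above gets `None` back instead of hanging in the read call.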
| From | Sturla Molden <sturla.molden@gmail.com> |
|---|---|
| Date | 2014-04-10 15:24 +0000 |
| Message-ID | <mailman.9140.1397143500.18130.python-list@python.org> |
| In reply to | #70031 |
Marko Rauhamaa <marko@pacujo.net> wrote:
> Other points:
>
> * When you wake up from select() (or poll(), epoll()), you should treat
>   it as a hint. The I/O call (accept()) could still raise
>   socket.error(EAGAIN).
>
> * The connections returned from accept() have to be individually
>   registered with select() (poll(), epoll()).
>
> * When you write() into a connection, you may be able to send only part
>   of the data or get EAGAIN. You need to choose a buffering strategy --
>   you should not block until all data is written out. Also take into
>   account how much you are prepared to buffer.
>
> * There are two main modes of multiplexing: level-triggered and
>   edge-triggered. Only epoll() (and kqueue()) support edge-triggered
>   wakeups. Edge-triggered requires more discipline from the programmer
>   but frees you from having to tell the multiplexing facility if you
>   are interested in readability or writability in any given situation.
>
>   Edge-triggered wakeups are only guaranteed after you have gotten an
>   EAGAIN from an operation. Make sure you keep on reading/writing until
>   you get an EAGAIN. On the other hand, watch out so one connection
>   doesn't hog the process because it always has active I/O to perform.
>
> * You should always be ready to read to prevent deadlocks.
>
> * Sockets can be half-closed. Your state machines should deal with the
>   different combinations gracefully. For example, you might read an EOF
>   from the client socket before you have pushed the response out. You
>   must not close the socket before the response has finished writing.
>   On the other hand, you should not treat the half-closed socket as
>   readable.
>
> * While a single-threaded process will not have proper race conditions,
>   you must watch out for preemption. IOW, you might have Object A call
>   a method of Object B, which calls some other method of Object A.
>   Asyncio has a task queue facility. If you write your own main loop,
>   you should also implement a similar task queue. The queue can then be
>   used to make such tricky function calls in a safe context.
>
> * Asyncio provides timers. If you write your own main loop, you should
>   also implement your own timers.
>
>   Note that modern software has to tolerate suspension (laptop lid,
>   virtual machines). Time is a tricky concept when your server wakes up
>   from a coma.
>
> * Specify explicit states. Your connection objects should have a data
>   member named "state" (or similar). Make your state transitions
>   explicit and obvious in the code. In fact, log them. Resist the
>   temptation of deriving the state implicitly from other object
>   information.
>
> * Most states should be guarded with a timer. Make sure to document for
>   each state, which timers are running.
>
> * In each state, check that you handle all possible events and
>   timeouts. The state/transition matrix will be quite sizable even for
>   seemingly simple tasks.
And exactly how is getting all of this correct any easier than just using
threads and blocking i/o?
I'd like to see the programmer who can get all of this correct, but has no
idea how to use a queue or mutex without deadlocking.
Sturla
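For concreteness, the "specify explicit states" and "handle all possible events" points quoted above might look roughly like the sketch below. The state names, event names and the Connection class are all made up for illustration; nothing here is asyncio API or code from the thread.

```python
import enum
import logging

log = logging.getLogger("conn")


class State(enum.Enum):
    # Hypothetical states for a simple request/response connection.
    READING_REQUEST = "reading_request"
    WRITING_RESPONSE = "writing_response"
    CLOSED = "closed"


# The state/transition matrix, spelled out. Any (state, event) pair that
# is missing here is treated as a bug rather than silently ignored.
TRANSITIONS = {
    (State.READING_REQUEST, "request_complete"): State.WRITING_RESPONSE,
    (State.READING_REQUEST, "peer_eof"): State.CLOSED,
    (State.READING_REQUEST, "timeout"): State.CLOSED,
    # Half-closed peer: keep pushing the response out, just stop reading.
    (State.WRITING_RESPONSE, "peer_eof"): State.WRITING_RESPONSE,
    (State.WRITING_RESPONSE, "response_flushed"): State.CLOSED,
    (State.WRITING_RESPONSE, "timeout"): State.CLOSED,
}


class Connection:
    def __init__(self, name):
        self.name = name
        # The state is an explicit data member, not derived from other fields.
        self.state = State.READING_REQUEST

    def handle(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(
                "%s: unhandled event %r in state %s"
                % (self.name, event, self.state.name)
            )
        new_state = TRANSITIONS[key]
        # Log every transition, as suggested in the quoted list.
        log.info("%s: %s --%s--> %s",
                 self.name, self.state.name, event, new_state.name)
        self.state = new_state
        return new_state
```

Even this toy table has six entries for three states, which gives some feel for the "quite sizable" matrix mentioned above. The per-state timers are left out of the sketch.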
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-04-10 19:20 +0300 |
| Message-ID | <87ppkpma0d.fsf@elektro.pacujo.net> |
| In reply to | #70043 |
Sturla Molden <sturla.molden@gmail.com>:
> And exactly how is getting all of this correct any easier than just
> using threads and blocking i/o?
>
> I'd like to see the programmer who can get all of this correct, but
> has no idea how to use a queue or mutex without deadlocking.
My personal experience is that it is easier to get "all of this correct"
than threads. I've done it both ways.
Marko
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-04-11 01:32 +1000 |
| Message-ID | <mailman.9141.1397143980.18130.python-list@python.org> |
| In reply to | #70031 |
On Fri, Apr 11, 2014 at 1:24 AM, Sturla Molden <sturla.molden@gmail.com> wrote:
> And exactly how is getting all of this correct any easier than just using
> threads and blocking i/o?
For a start, nearly everything Marko just posted should be dealt with
by your library. I don't know Python's asyncio as it's very new and I
haven't yet found an excuse to use it, but with Pike, I just engage
backend mode, set callbacks on the appropriate socket/file/port
objects, and let things happen perfectly. All I need to do is check a
few return values (eg if I ask a non-blocking socket to write a whole
pile of data, it might return that it wrote only some of it, in which
case I have to buffer the rest - not hard but has to be done), and
make sure I always return promptly from my callbacks so as to avoid
lagging out other operations.
None of the details of C-level APIs matter to my high level code.
ChrisA
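The "it wrote only some of it, so buffer the rest" check could be sketched in Python along these lines. BufferedWriter is a hypothetical helper, not a standard-library class, and the sketch assumes some event loop calls on_writable() whenever the socket becomes writable. There is no cap on how much gets buffered, which a real server would want.

```python
import socket


class BufferedWriter:
    """Hypothetical helper: queue bytes a non-blocking socket won't take yet."""

    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        self.pending = bytearray()

    def write(self, data):
        # If something is already queued, keep ordering by queueing behind it.
        if self.pending:
            self.pending += data
            return
        try:
            sent = self.sock.send(data)
        except BlockingIOError:  # EAGAIN / EWOULDBLOCK: nothing was taken
            sent = 0
        # Whatever the kernel did not accept waits for the next wakeup.
        self.pending += data[sent:]

    def on_writable(self):
        """Call this when the event loop reports the socket writable."""
        if not self.pending:
            return
        try:
            sent = self.sock.send(self.pending)
        except BlockingIOError:
            return
        del self.pending[:sent]
```

With a pair from socket.socketpair(), writing a payload larger than the kernel buffer leaves the remainder in pending until the other end reads and on_writable() is called again.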
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-04-10 19:25 +0300 |
| Message-ID | <87lhvdm9sw.fsf@elektro.pacujo.net> |
| In reply to | #70044 |
Chris Angelico <rosuav@gmail.com>:
> For a start, nearly everything Marko just posted should be dealt with
> by your library.
Let's not kid ourselves: it is hard to get any reactive system right.
> I don't know Python's asyncio as it's very new and I haven't yet found
> an excuse to use it, but with Pike, I just engage backend mode, set
> callbacks on the appropriate socket/file/port objects, and let things
> happen perfectly.
That "set callbacks" and "let things happen" is the hard part. The
framework part is trivial.
Marko
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-04-11 03:08 +1000 |
| Message-ID | <mailman.9143.1397149733.18130.python-list@python.org> |
| In reply to | #70047 |
On Fri, Apr 11, 2014 at 2:25 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
>> I don't know Python's asyncio as it's very new and I haven't yet found
>> an excuse to use it, but with Pike, I just engage backend mode, set
>> callbacks on the appropriate socket/file/port objects, and let things
>> happen perfectly.
>
> That "set callbacks" and "let things happen" is the hard part. The
> framework part is trivial.
Maybe. Here's a simple self-contained Pike program that makes a simple
echo server - whatever comes in goes out again:
//Create the port (listening connection).
object mainsock=Stdio.Port(12345,accept_callback);
void accept_callback()
{
    //Get the newly-connected socket
    object sock=mainsock->accept();
    //Set up its callbacks
    sock->set_nonblocking(read_callback, write_callback, close_callback);
    //Keep track of metadata (here that'll just be the write buffer)
    sock->set_id((["sock":sock]));
}
//Attempt to write some text, buffering any that can't be written
void write(mapping info, string text)
{
    if (!text || text=="") return;
    if (info->write_me)
    {
        //There's already buffered text. Queue this text too.
        info->write_me += text;
        return;
    }
    int written = info->sock->write(text);
    if (written < 0)
    {
        //Deal with write errors brutally by closing the socket.
        info->sock->close();
        return;
    }
    info->write_me = text[written..];
}
//When more can be written, write it.
void write_callback(mapping info) {write(info, m_delete(info,"write_me"));}
void read_callback(mapping info, string data)
{
    //Simple handling: Echo the text back with a prefix.
    //Note that this isn't line-buffered or anything.
    write(info, ">> " + data);
}
//Not strictly necessary, but if you need to do something when a client
//disconnects, this is where you'd do it.
void close_callback(mapping info)
{
    info->sock = "(disconnected)";
}
//Engage backend mode.
int main() {return -1;}
Setting callbacks? One line. There's a little complexity to the "write
what you can, buffer the rest", but if you're doing anything even a
little bit serious, you'll just bury that away in a mid-level library
function. The interesting part is in the read callback, which does the
actual work (in this case, it just writes back whatever it gets). And
here's how easy it is to make it into a chat server: just replace the
read and close callbacks with these:
multiset(mapping) sockets=(<>);
void read_callback(mapping info, string data)
{
    //Simple handling: Echo the text back with a prefix.
    //Note that this isn't line-buffered or anything.
    sockets[info] = 1;
    write(indices(sockets)[*], ">> " + data);
}
//Not strictly necessary, but if you need to do something when a client
//disconnects, this is where you'd do it.
void close_callback(mapping info)
{
    info->sock = "(disconnected)";
    sockets[info] = 0;
}
If you want to handle more information (maybe get users to log in?),
you just stuff more stuff into the info mapping (it's just like a
Python dict). Handling of TELNET negotiation, line buffering, etc,
etc, can all be added between this and the user-level code - that's
what I did with the framework I wrote for work. Effectively, you just
write one function (I had it double as the read and close callbacks
for simplicity), put a declaration down the bottom to say what port
number you want (hard coded to 12345 in the above code), and
everything just happens. It really isn't hard to get callback-based
code to work nicely if you think about what you're doing.
I expect it'll be similarly simple with asyncio; does someone who's
worked with it feel like implementing similar functionality?
ChrisA
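As one possible answer to the question above, here is a minimal sketch of the same echo server using asyncio's streams API. It assumes a later Python (3.7+) with async/await and asyncio.run(); the asyncio of early 2014 would have spelled this with @asyncio.coroutine and "yield from". The names handle_client and main are illustrative, and port 12345 just mirrors the Pike example.

```python
import asyncio


async def handle_client(reader, writer):
    """Echo whatever arrives back to the same client with a ">> " prefix."""
    try:
        while True:
            data = await reader.read(4096)
            if not data:  # EOF: the peer closed its side
                break
            writer.write(b">> " + data)
            # drain() waits while the transport's buffer is over its limit,
            # so the partial-write bookkeeping is left to the library.
            await writer.drain()
    finally:
        writer.close()
        await writer.wait_closed()


async def main(host="127.0.0.1", port=12345):
    server = await asyncio.start_server(handle_client, host, port)
    async with server:
        await server.serve_forever()


# To run it as a standalone server:
#     asyncio.run(main())
```

Like the Pike version, this is not line-buffered. The write buffering that the Pike code does by hand in write() and write_callback() is handled here by the stream writer.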
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2014-04-10 11:14 -0700 |
| Message-ID | <70d69403-456a-43bc-84a6-c546983c90e5@googlegroups.com> |
| In reply to | #70048 |
On Thursday, April 10, 2014 10:38:49 PM UTC+5:30, Chris Angelico wrote:
> On Fri, Apr 11, 2014 at 2:25 AM, Marko Rauhamaa wrote:
> >> I don't know Python's asyncio as it's very new and I haven't yet found
> >> an excuse to use it, but with Pike, I just engage backend mode, set
> >> callbacks on the appropriate socket/file/port objects, and let things
> >> happen perfectly.
> >
> > That "set callbacks" and "let things happen" is the hard part. The
> > framework part is trivial.
>
> Maybe. Here's a simple self-contained Pike program that makes a simple
> echo server - whatever comes in goes out again:
>
For analogy let me take a 'thought-discussion' between a C programmer
and a python programmer regarding data structures.
-----------------------------------------------------
PP: Is it not tedious and error prone, C's use of data structures?
    How/Why do you stick to that?
CP: Oh! Is it? And what do you propose I use?
PP: Why python of course! Or any modern language with first class data
    and garbage collection! Why spend a lifetime tracking malloc errors?!
CP: Oh! is it? And what is python implemented in?
PP: But that's the whole point! Once Guido-n-gang have done their thing we
    are unscathed by the bugs that prick and poke and torment you day in
    day out.
CP: Let's look at this in more detail shall we?
PP: Very well
CP: You give me any python data structure (so-called) and I'll give it to
    you in C. And note: It's very easy. I just open up the python
    implementation (it's in C in case you forgot) and clean up all the
    mess that has been added for the support of lazy python programmers.
    In addition, I'll give you a couple of more data-structures/algorithms
    that we have easy access to but for you, your only choice is to drop
    into C to use (HeHe!)
PP: You are setting the rules of the game... and winning.
    I did not say I want fancy algorithms and data structures.
    I said I want (primarily) the safety of garbage collection.
    It's also neat to have an explicit syntax for basic data types like
    lists rather than scrummaging around with struct and malloc and
    pointers (hoo boy!)
CP: Yeah.. Like I said you like to be mollycoddled; we like our power and
    freedom
-----------------------------------------------
If I may use somewhat heavy brush-strokes:
Marko (and evidently Chris) are in the CP camp whereas Sturla is in the
PP camp. It's just that 'data-structures (and algorithms)' is now replaced
by 'concurrency'.
Both these viewpoints assume that the status quo of current (mainstream)
language support for concurrency is a given and not negotiable.
Erlang/Go etc disprove this.
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-04-10 22:44 +0300 |
| Message-ID | <87wqexj7ge.fsf@elektro.pacujo.net> |
| In reply to | #70052 |
Rustom Mody <rustompmody@gmail.com>:
> Marko (and evidently Chris) are in the CP camp whereas Sturla is in
> the PP camp. It's just that 'data-structures (and algorithms)' is now
> replaced by 'concurrency'
>
> Both these viewpoints assume that the status quo of current
> (mainstream) language support for concurrency is a given and not
> negotiable.
I think you misread me (us?). I'm not trying to make life hard on
myself. Nor am I disparaging fitting abstractions and high-level
utilities.
Threads are an essential tool when used appropriately. However, I do
believe the 90's fad of treating them like a silver bullet of
concurrency was a big mistake. The industry is noticing it, as is
evident in NIO and asyncio.
Threads are enticing in that they make it quick to put together working
prototypes. The difficulties only appear when it's too late to go back.
They definitely are not the high-level abstraction you're looking for.
> Erlang/Go etc disprove this.
<URL: http://en.wikipedia.org/wiki/Leonhard_Euler#Personal_philosophy_and_religious_beliefs>:
              n
         a + b
   Sir, -------- = x, hence God exists--reply!
            n
Seriously, Erlang (and Go) have nice tools for managing state machines
and concurrency. However, Python (and C) are perfectly suitable for
clear asynchronous programming idioms. I'm happy that asyncio is
happening after all these long years. It would be nice if it supported
edge-triggered wakeups, but I suppose that isn't supported in all
operating systems.
Marko
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2014-04-10 13:21 -0700 |
| Message-ID | <685d594b-c31b-4629-b81d-4aa64d9e3394@googlegroups.com> |
| In reply to | #70058 |
On Friday, April 11, 2014 1:14:01 AM UTC+5:30, Marko Rauhamaa wrote:
>
>
> Seriously, Erlang (and Go) have nice tools for managing state machines
> and concurrency. However, Python (and C) are perfectly suitable for
> clear asynchronous programming idioms. I'm happy that asyncio is
> happening after all these long years. It would be nice if it supported
> edge-triggered wakeups, but I suppose that isn't supported in all
> operating systems.
>
Yes... Let me restate what (I hear you as) saying
Let's start with pure uniprocessor machines for ease of discussion (also of history)
An OS sits on top of the uni-hardware and provides multi{processing,users,threads,etc}.
How does it do it? By the mechanisms of process-switching, interleaving etc
In short all the good-stuff... that constitutes asyncio (and relations)
What you are saying is that what the OS is doing, you can do better.
Analogous to said C programmer saying that what (data structures) the python
programmer can make he can do better.
Note I don't exactly agree with Sturla either.
To see that, time-shift the C/Python argument 30 years back when it was imperative
languages vs poorly implemented, buggy, interpreted Lisp/Prolog.
In that world, your 'I'd rather do it by hand/work out my state machine'
would make considerable sense.
Analogously, if the only choice were mainstream (concurrency-wise) languages --
C/C++/Java/Python -- + native threads + overheads + ensuing errors/headaches, then
the: "Please let me work out my state machine and manage my affairs" would be sound.
But it's not the only choice!!
> http://en.wikipedia.org/wiki/Leonhard_Euler#Personal_philosophy_and_religious_beliefs
>
>               n
>          a + b
>    Sir, -------- = x, hence God exists--reply!
>             n
I always thought that God exists because e^(i*pi) + 1 = 0 :D
Evidently (s)he has better reasons for existing!
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-04-10 23:44 +0300 |
| Message-ID | <87sipkkj7p.fsf@elektro.pacujo.net> |
| In reply to | #70060 |
Rustom Mody <rustompmody@gmail.com>:
> What you are saying is that what the OS is doing, you can do better.
> Analogous to said C programmer saying that what (data structures) the
> python programmer can make he can do better.
I'm sorry, but I don't quite follow you there.
I see the regular multithreaded approach as
 (1) oversimplification which makes it difficult to extend the design
     and handle all of the real-world contingencies
 (2) inviting race conditions carelessly--no mortal is immune.
Marko
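To make point (2) concrete, here is a small sketch of the kind of shared read-modify-write that invites a race, shown with the lock that prevents it. Counter and run are illustrative names, not code from anyone in the thread.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "self.value += 1" is a read, an add and a store. Without the lock,
        # two threads can read the same old value and one update is lost.
        with self._lock:
            self.value += 1


def run(n_threads=8, n_increments=10000):
    counter = Counter()

    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, run() always returns n_threads * n_increments. Forgetting the lock produces no error at all and may even pass most test runs, which is roughly the sense in which the race is invited "carelessly".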
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2014-04-10 22:15 -0700 |
| Message-ID | <b86c99de-f780-4c4a-904a-f5b054f4e891@googlegroups.com> |
| In reply to | #70061 |
On Friday, April 11, 2014 2:14:42 AM UTC+5:30, Marko Rauhamaa wrote:
> Rustom Mody:
>
> > What you are saying is that what the OS is doing, you can do better.
> > Analogous to said C programmer saying that what (data structures) the
> > python programmer can make he can do better.
>
> I'm sorry, but I don't quite follow you there.
Ok let me try again (Please note I am speaking more analogically than logically)
There was a time -- say 1990 -- when there was this choice
- use C -- a production language with half-assed data structures support
- use Lisp -- strong support for data structures but otherwise unrealistic
From this world and its world view it's natural to conclude that to choose a
strong data structure supporting language is to choose an unrealistic language
I was in the thick of this debate then
http://www.the-magus.in/Publications/chor.pdf
This argument is seen to be fallacious once we have languages like python
(and Ruby and Java and Perl and Haskell and ...)
Today we are in the same position vis-a-vis concurrency as we were with
data structures in 1990.
We have mainstream languages -- Java,C,C++,Python -- with half-assed
concurrency support. And we have languages like Erlang, Go, Cloud Haskell
which make concurrency center-stage but are otherwise lacking and unrealistic.
I disagree with you in saying
"We can't do better (than stay within the options offered by mainstream languages)"
As an individual you are probably right.
From a larger systemic pov (hopefully!) not!
I disagree with Sturla in what is considered invariant and what is under
one's control. He (seems?) to take hardware as under control, programming
paradigm as not. I believe that the mileage that can be achieved by working
on both is more than can be achieved by either alone.
> I see the regular multithreaded approach as
> (2) inviting race conditions carelessly--no mortal is immune.
This I understand and concur with
> (1) oversimplification which makes it difficult to extend the design
>     and handle all of the real-world contingencies
This I don't...