Re: asyncio and blocking - an update

From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Subject Re: asyncio and blocking - an update
Date Thu, 11 Feb 2016 18:07:55 +1100
Message-ID <mailman.36.1455174479.22075.python-list@python.org>
References <n9c4p3$gmp$1@ger.gmane.org> <n9h75i$ag1$1@ger.gmane.org> <CAPTjJmrVCkKAEevc9TW8FYYTnZgRUMPHectz+bD=DQRphXYTpw@mail.gmail.com> <n9ha5v$pb9$1@ger.gmane.org>
In-Reply-To <n9ha5v$pb9$1@ger.gmane.org>

On Thu, Feb 11, 2016 at 5:36 PM, Frank Millman <frank@chagford.com> wrote:
> "Chris Angelico"  wrote in message
> news:CAPTjJmrVCkKAEevc9TW8FYYTnZgRUMPHectz+bD=DQRphXYTpw@mail.gmail.com...
>>
>>
>> Something worth checking would be real-world database performance metrics
>
>
> [snip lots of valid questions]
>
> My approach is guided by something I read a long time ago, and I don't know
> how true it is, but it feels plausible. This is a rough paraphrase.
>
> Modern databases are highly optimised to execute a query and return the
> result as quickly as possible. A properly written database adaptor will work
> in conjunction with the database to optimise the retrieval of the result.
> Therefore the quickest way to get the result is to let the adaptor iterate
> over the cursor and let it figure out how best to achieve it.
>
> Obviously you still have to tune your query to make sure it is
> efficient, using indexes etc. But there is no point in trying to
> second-guess the database adaptor in figuring out the quickest way to get
> the result.

As far as that goes, it's sound. (It's pretty obvious that collecting
all the rows into a list is going to take (at least) as long to give
the first row as iteration would take to give the last row, simply
because you could always implement one on top of the other, and
iteration has flexibility that fetchall doesn't.) The only question
is, what price are you paying for that?
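To make the "implement one on top of the other" point concrete, here is a minimal sketch. It assumes a DB-API style cursor and uses sqlite3 purely as a stand-in; the two helper names and the table are made up for illustration.

```python
import sqlite3

def fetchall_via_iteration(cursor):
    # fetchall built on iteration: the list exists only once the last row has arrived.
    return list(cursor)

def iterate_via_fetchall(cursor):
    # Iteration built on fetchall: the first row appears only after every row is fetched.
    yield from cursor.fetchall()

# Hypothetical demo data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])

as_list = fetchall_via_iteration(conn.execute("SELECT n FROM t ORDER BY n"))
as_iter = list(iterate_via_fetchall(conn.execute("SELECT n FROM t ORDER BY n")))
```

Both give the same rows; the difference is only in when the first row becomes available to the caller.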

> 1.
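>    # run_query: hypothetical blocking helper that runs the SELECT and returns fetchall()
>    future = loop.run_in_executor(None, run_query, 'SELECT ...')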
>    await future
>    rows = future.result()
>    for row in rows:
>        process row
>
>    The SELECT will not block, because it is run in a separate thread. But it
> will return all the rows in a single list, and the calling function will
> block while it processes the rows, unless it takes the extra step of turning
> the list into an Asynchronous Iterator.

This is beautifully simple.
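For reference, a runnable sketch of roughly what option 1 could look like. This is an assumption-laden illustration, not Frank's actual code: fetch_all is a hypothetical blocking helper, sqlite3 stands in for the real database, and passing None selects the loop's default executor.

```python
import asyncio
import sqlite3

def fetch_all(sql):
    # Hypothetical blocking helper. It opens its own connection so that
    # everything sqlite3-related happens inside the worker thread.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (n INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])
        return conn.execute(sql).fetchall()
    finally:
        conn.close()

async def main():
    loop = asyncio.get_running_loop()
    # The SELECT runs in a thread, so the event loop stays free while waiting.
    rows = await loop.run_in_executor(None, fetch_all, "SELECT n FROM t ORDER BY n")
    # From here on the rows are an ordinary list, processed synchronously.
    return [n for (n,) in rows]

result = asyncio.run(main())
```

Note that the processing loop at the end still runs without yielding to the event loop, which is the blocking Frank mentions.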

> 2.
>        rows = AsyncCursor('SELECT ...')
>        async for row in rows:
>            process row

Also beautifully simple. But this one comes with much more complexity
cost in your second thread and your AsyncCursor.
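To give a feel for that complexity, here is one way an AsyncCursor might be sketched. Everything here is an assumption on my part: the class body, the batch_size parameter, the sentinel, and the use of sqlite3 as a stand-in. A worker thread fetches rows and hands them to the event loop through an asyncio.Queue. Error propagation and cancellation are deliberately left out.

```python
import asyncio
import sqlite3
import threading

_DONE = object()  # sentinel marking the end of the result set

class AsyncCursor:
    def __init__(self, sql, batch_size=2):
        self.sql = sql
        self.batch_size = batch_size

    def __aiter__(self):
        # Called from inside a coroutine, so a loop is running here.
        self._loop = asyncio.get_running_loop()
        self._queue = asyncio.Queue()
        threading.Thread(target=self._produce, daemon=True).start()
        return self

    def _produce(self):
        # Runs in the worker thread; sqlite3 objects never leave it.
        conn = sqlite3.connect(":memory:")
        try:
            conn.execute("CREATE TABLE t (n INTEGER)")
            conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])
            cur = conn.execute(self.sql)
            while True:
                batch = cur.fetchmany(self.batch_size)
                if not batch:
                    break
                for row in batch:
                    # asyncio.Queue is not thread-safe, so hand off via the loop.
                    self._loop.call_soon_threadsafe(self._queue.put_nowait, row)
        finally:
            conn.close()
            self._loop.call_soon_threadsafe(self._queue.put_nowait, _DONE)

    async def __anext__(self):
        row = await self._queue.get()
        if row is _DONE:
            raise StopAsyncIteration
        return row

async def main():
    seen = []
    async for (n,) in AsyncCursor("SELECT n FROM t ORDER BY n"):
        seen.append(n)
    return seen

result = asyncio.run(main())
```

Even this stripped-down version needs a thread, a queue, a sentinel, and a thread-safe hand-off, compared with the single run_in_executor call in option 1.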

So really, the question is: Is this complexity buying you enough
performance that it's worthwhile? My questions about real-world stats
are based on the flip side of your assumption - to quote it again:

> Modern databases are highly optimised to execute a query and return the
> result as quickly as possible. A properly written database adaptor will work
> in conjunction with the database to optimise the retrieval of the result.
> Therefore the quickest way to get the result is to let the adaptor iterate
> over the cursor and let it figure out how best to achieve it.

A properly-built database will optimize for two things: Time to first
row, and time to query completion. (And other things, like memory
usage, which don't directly affect this discussion.) In some cases,
they'll be very different figures, and then you'll get a lot of
benefit from iteration. In other cases, they'll be virtually the same
- imagine a query that involves a number of tables and lots of
aggregate functions, governed by a big GROUP BY that gathers them all
up into, say, three rows, sorted by one of the aggregate functions (eg
"show me these categories, sorted by the total value of sales per
category"). How long does it take for the database to get the first
row? It has to execute the entire query. How long to get the other
two? Just return 'em from memory. So for this query there's basically
no benefit to iteration over fetchall. Most queries will be
somewhere in between, hence the question about real-world
significance. If it costs you little to iterate, great! But if you're
paying a high price, it's something to consider.
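One rough way to collect such figures is to time the first row and the last row separately. The sketch below assumes sqlite3 as a stand-in, and the sales table, its contents, and time_iteration are all made up for illustration; real numbers depend entirely on your database, adaptor, and query.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (category TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(("books", "games", "music")[i % 3], i) for i in range(30000)],
)

# The kind of query described above: many input rows, three output rows,
# sorted by an aggregate.
QUERY = (
    "SELECT category, SUM(value) AS total FROM sales "
    "GROUP BY category ORDER BY total DESC"
)

def time_iteration(sql):
    # Returns (seconds to first row, seconds to last row, rows).
    start = time.perf_counter()
    rows = []
    first = None
    for row in conn.execute(sql):
        if first is None:
            first = time.perf_counter() - start
        rows.append(row)
    total = time.perf_counter() - start
    return first, total, rows

first, total, rows = time_iteration(QUERY)
categories = [category for category, _ in rows]
```

If first and total come out nearly equal for your typical queries, iteration is buying you little; if first is much smaller, the extra machinery may be paying for itself.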

ChrisA
