Path: csiph.com!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail From: "Frank Millman" Newsgroups: comp.lang.python Subject: Re: asyncio and blocking - an update Date: Thu, 11 Feb 2016 08:36:41 +0200 Lines: 77 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Trace: news.uni-berlin.de /vPwv6b2Z0Ur8bwupCC8dw+H9r7B4Y0thEEhQ5d+x+nA== Return-Path: X-Original-To: python-list@python.org Delivered-To: python-list@mail.python.org X-Spam-Status: OK 0.000 X-Spam-Evidence: '*H*': 1.00; '*S*': 0.00; 'assumed': 0.09; 'cursor': 0.09; 'indexes': 0.09; 'iterate': 0.09; 'metrics': 0.09; 'received:80.91': 0.09; 'received:80.91.229': 0.09; 'received:gmane.org': 0.09; 'received:list': 0.09; 'rows': 0.09; 'rows,': 0.09; 'sql,': 0.09; 'tune': 0.09; 'def': 0.13; 'properly': 0.15; 'result.': 0.15; 'async': 0.16; 'iterator.': 0.16; 'optimised': 0.16; 'received:80.91.229.3': 0.16; 'received:io': 0.16; 'received:plane.gmane.org': 0.16; 'received:psf.io': 0.16; 'rough': 0.16; 'row': 0.16; 'second- guess': 0.16; 'skip:n 70': 0.16; 'thread.': 0.16; 'obviously': 0.16; 'conjunction': 0.18; 'retrieval': 0.18; 'trying': 0.22; 'select': 0.23; 'wrote': 0.23; 'written': 0.24; 'header:In-Reply- To:1': 0.24; 'header:X-Complaints-To:1': 0.26; 'figure': 0.27; 'checking': 0.27; 'separate': 0.27; 'function': 0.28; 'looks': 0.29; 'block,': 0.29; 'subject:update': 0.29; 'thread,': 0.29; 'query': 0.30; 'skip:_ 10': 0.32; 'maybe': 0.33; 'run': 0.33; 'point': 0.33; 'quickly': 0.34; 'that,': 0.34; 'list': 0.34; 'something': 0.35; 'step': 0.36; 'but': 0.36; 'list,': 0.36; 'there': 0.36; 'possible.': 0.36; 'to:addr:python-list': 0.36; 'subject:: ': 0.37; 'two': 0.37; 'received:org': 0.37; 'self': 0.38; 'sure': 0.39; 'takes': 0.39; 'build': 0.40; 'to:addr:python.org': 0.40; 'still': 0.40; 'future': 0.60; 'your': 0.60; 'hope': 0.61; 'guided': 0.66; 'real-world': 0.66; 'therefore': 0.67; 'worth': 0.67; 'approaches': 0.72; 'frank': 0.72; 'await': 0.76; 'quickest': 0.84; 'awaiting': 0.91 X-Injected-Via-Gmane: http://gmane.org/ X-Gmane-NNTP-Posting-Host: 197.89.154.180 In-Reply-To: X-MSMail-Priority: Normal Importance: Normal X-Newsreader: Microsoft Windows Live Mail 15.4.3502.922 X-MimeOLE: Produced By Microsoft MimeOLE V15.4.3502.922 X-BeenThere: python-list@python.org X-Mailman-Version: 2.1.21rc2 Precedence: list List-Id: General discussion list for the Python programming language List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Xref: csiph.com comp.lang.python:102788 "Chris Angelico" wrote in message news:CAPTjJmrVCkKAEevc9TW8FYYTnZgRUMPHectz+bD=DQRphXYTpw@mail.gmail.com... > > Something worth checking would be real-world database performance metrics [snip lots of valid questions] My approach is guided by something I read a long time ago, and I don't know how true it is, but it feels plausible. This is a rough paraphrase. Modern databases are highly optimised to execute a query and return the result as quickly as possible. A properly written database adaptor will work in conjunction with the database to optimise the retrieval of the result. Therefore the quickest way to get the result is to let the adaptor iterate over the cursor and let it figure out how best to achieve it. Obviously you still have to tune your query to make make sure it is efficient, using indexes etc. But there is no point in trying to second-guess the database adaptor in figuring out the quickest way to get the result. My theory rests on an assumption which may be faulty. I have assumed that, in order to execute a query using run_in_executor(), the way to get the result is to use cur.fetchall(). Maybe there are alternatives. However, based on that assumption, my theory contrasts the following two approaches - 1. In a separate thread, perform the following - cur.execute('SELECT ...') rows = cur.fetchall() return rows The awaiting function will perform the following - future = loop.run_in_executor('SELECT ...') await future rows = future.result() for row in rows: process row The SELECT will not block, because it is run in a separate thread. But it will return all the rows in a single list, and the calling function will block while it processes the rows, unless it takes the extra step of turning the list into an Asynchronous Iterator. 2. In a separate thread, perform the following - cur.execute('SELECT ...') for row in cur: build up block of 50 rows loop.call_soon_threadsafe(return_queue.put_nowait, block) The awaiting function will call the following - rows = AsyncCursor('SELECT ...') async for row in rows: process row AsyncCursor looks like this (abridged) - def __init__(self, sql, params): loop = asyncio.get_event_loop() self.return_queue = asyncio.Queue() request_queue.put((loop, sql, params, self.return_queue)) self.rows = [] async def __aiter__(self): return self async def __anext__(self): if self.rows: return self.rows.pop(0) self.rows = await self.return_queue.get() return self.rows.pop(0) Hope this makes sense. Frank