
asyncio POLLHUP question

Date Thu, 26 Feb 2015 22:04:30 +1030
Subject asyncio POLLHUP question
From Chris Laws <clawsicus@gmail.com>
To python-list@python.org
Newsgroups comp.lang.python
Message-ID <mailman.19253.1424950477.18130.python-list@python.org>


I have a system scenario where thousands of applications are running and,
via a service discovery mechanism, they all get notified that a service they
are all interested in has come online. They all attempt to connect a TCP
socket to the service. This happens virtually instantly.

The problem that I see is that many of the applications that try to connect
to the server get themselves into a state where they are consuming a lot of
CPU.

I am using Python 3.4.2 and asyncio, and have set the server backlog to
4000 in an effort to accommodate the connection request backlog. I am
actually using an event loop from aiozmq (but no ZMQ sockets in this
scenario) but under the covers this is just using epoll so it should really
be the same as using the DefaultSelector.
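
For reference, a minimal sketch of how a backlog like this is typically
passed to an asyncio server. The names handle_client, start and somaxconn are
hypothetical, and the async def syntax assumes Python 3.5 or later (on 3.4.2
the equivalent would be @asyncio.coroutine with yield from). On Linux the
kernel silently caps the listen backlog at net.core.somaxconn, so a requested
4000 may not be the value actually in effect.

```python
import asyncio

BACKLOG = 4000  # value from the post; the kernel may cap it


async def handle_client(reader, writer):
    # Hypothetical handler: accept the connection and close it again.
    writer.close()


async def start(host="127.0.0.1", port=0):
    # The backlog keyword is forwarded to loop.create_server().
    return await asyncio.start_server(
        handle_client, host, port, backlog=BACKLOG)


def somaxconn():
    """Return the Linux listen-queue cap, or None if it cannot be read."""
    try:
        with open("/proc/sys/net/core/somaxconn") as f:
            return int(f.read())
    except (OSError, ValueError):
        return None
```

Comparing somaxconn() with the requested backlog shows whether the listen
queue is really as deep as intended.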

Using strace on the apps exhibiting issues I see that a socket is
continuously triggering a POLLERR|POLLHUP event. This is the cause of the
large CPU usage. The socket is the one that was attempting to connect to
the new service that was just brought up.
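
The flags seen in strace can be reproduced outside asyncio. The sketch below
assumes Linux and a loopback port with no listener; closed_port,
probe_failed_connect, describe and error_name are hypothetical helper names.
It requests only POLLOUT, yet poll() still reports POLLERR and POLLHUP,
because those two flags are always reported whatever was requested. They are
also level-triggered, so they stay set until the fd is closed or
unregistered, which would be consistent with the busy loop described above.

```python
import errno
import select
import socket


def closed_port():
    """Pick a loopback port that (very likely) has no listener."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))
    port = s.getsockname()[1]
    s.close()
    return port


def probe_failed_connect(port, timeout_ms=2000):
    """Return (connect_ex result, poll revents mask, SO_ERROR value)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        rc = sock.connect_ex(("127.0.0.1", port))
        poller = select.poll()
        # Only POLLOUT is requested; POLLERR/POLLHUP are reported anyway.
        poller.register(sock.fileno(), select.POLLOUT)
        ready = poller.poll(timeout_ms)
        mask = ready[0][1] if ready else 0
        # Reading SO_ERROR fetches (and clears) the pending socket error.
        so_error = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        return rc, mask, so_error
    finally:
        sock.close()


def describe(mask):
    """Render a poll mask as flag names, e.g. for logging."""
    names = ("POLLIN", "POLLOUT", "POLLERR", "POLLHUP")
    return "|".join(n for n in names if mask & getattr(select, n))


def error_name(code):
    """Map an errno value to its symbolic name where known."""
    return errno.errorcode.get(code, str(code))
```

Calling probe_failed_connect(closed_port()) and passing the mask to
describe() shows which flags the kernel set for the failed connect.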

I am guessing that the POLLHUP is caused by the server having issues
processing the volume of connect requests.

I think I need to drop/close the socket causing the POLLHUP. However, from
looking through the asyncio source code I don't see how I can do that from
within the _selector.select() or _process_events() functions with only the
knowledge of which fd is causing the issue.

How do poll errors propagate up from the select loop?
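
For the connect phase at least, the stock selector event loop appears to
surface a failed connect as an OSError raised from the connect coroutine: as
far as I can tell it reads SO_ERROR once the socket is reported ready. A
minimal sketch, assuming Python 3.7+ syntax and a stock asyncio loop (not
aiozmq); closed_port and try_connect are hypothetical names:

```python
import asyncio
import socket


def closed_port():
    """Pick a loopback port that (very likely) has no listener."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))
    port = s.getsockname()[1]
    s.close()
    return port


async def try_connect(host, port, timeout=5.0):
    """Return None on success, or the exception raised by the connect."""
    try:
        reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout)
    except (OSError, asyncio.TimeoutError) as exc:
        # Normal reconnect handling could be scheduled from here.
        return exc
    writer.close()
    return None
```

If the exception never arrives in the real system, the difference is
presumably in how the socket was handed to the loop rather than in epoll
itself.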

I can potentially unregister the fd but I don't think this will trigger the
transport/protocol getting closed (as far as I can tell) which prevents my
normal error handling scenarios from attempting to reconnect to the
service. The asyncio select functions seem to ignore events other than
EVENT_READ and EVENT_WRITE.
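
On the last point: the selectors module has no separate error event. In the
epoll and poll based selectors, as far as I can tell, error and hang-up flags
are folded into EVENT_READ / EVENT_WRITE for whichever of those was
registered, so the callback is expected to find the error itself (for example
via SO_ERROR). A small sketch under the same assumptions (Linux, a loopback
port with no listener); closed_port and wait_connect_result are hypothetical
names:

```python
import selectors
import socket


def closed_port():
    """Pick a loopback port that (very likely) has no listener."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))
    port = s.getsockname()[1]
    s.close()
    return port


def wait_connect_result(port, timeout=2.0):
    """Return (connect_ex result, ready event mask, SO_ERROR value)."""
    sel = selectors.DefaultSelector()
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        rc = sock.connect_ex(("127.0.0.1", port))
        # Register for write readiness only, as a connect would.
        sel.register(sock, selectors.EVENT_WRITE)
        ready = sel.select(timeout)
        mask = ready[0][1] if ready else 0
        # The failure is not a distinct event; it has to be read back.
        so_error = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        return rc, mask, so_error
    finally:
        sel.close()
        sock.close()
```

Here the failed connect shows up as plain EVENT_WRITE readiness, and only the
SO_ERROR value (or the connect_ex result) says that it failed.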

Any help would be appreciated.

Regards,
Chris
