Re: Tips on Speeding up Python Execution

Date Fri, 8 Apr 2011 17:25:20 +1000
Subject Re: Tips on Speeding up Python Execution
From Chris Angelico <rosuav@gmail.com>
To python-list@python.org
Newsgroups comp.lang.python
Message-ID <mailman.134.1302247524.9059.python-list@python.org>


On Fri, Apr 8, 2011 at 5:04 PM, Abhijeet Mahagaonkar
<abhijeet.manohar@gmail.com> wrote:
> I was able to isolate that a major chunk of run time is eaten up in opening
> webpages, reading from them and extracting text.
> I wanted to know if there is a way to call the functions concurrently.

So, to clarify: you have code that's loading lots of separate pages,
and the time is spent waiting for the internet? If you're saturating
your connection, then this won't help, but if they're all small pages
and they're coming over the internet, then yes, you certainly CAN
fetch them concurrently. As the Perl folks say, There's More Than One
Way To Do It; one is to spawn a thread for each request, then collect
up all the results at the end. Look up the 'threading' module for
details:

http://docs.python.org/library/threading.html
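
To make the thread-per-request idea concrete, here is a minimal sketch. It assumes Python 3, where urllib2's urlopen lives in urllib.request; the names fetch_url and fetch_all are made up for illustration and are not part of any library. Each thread stores its result under its own URL, and the join() loop is the "collect up all the results at the end" step.

```python
import threading
from urllib.request import urlopen


def fetch_url(url, timeout=10):
    """Download one page and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch=fetch_url):
    """Run fetch(url) in one thread per URL and return {url: result}.

    If a fetch raises, the exception object is stored in place of the body.
    """
    results = {}

    def worker(url):
        try:
            results[url] = fetch(url)
        except Exception as exc:  # keep one bad page from losing the rest
            results[url] = exc

    threads = [threading.Thread(target=worker, args=(url,)) for url in urls]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait for every request before returning
    return results
```

Calling fetch_all(list_of_urls) returns a dict you can loop over once everything has arrived. The fetch parameter is there so you can pass in your own per-page function (for example one that also does the text extraction). With hundreds of URLs you would probably want to cap the number of threads rather than start one per page.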

It should also be possible to directly use asynchronous I/O and
select(), but I couldn't see a way to do that with urllib/urllib2. If
you're using sockets directly, this ought to be an option.
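
For the raw-socket route, something along these lines might work. It is only a sketch under a few assumptions: Python 3, a hypothetical helper name fetch_with_select, and servers that close the connection when they have finished sending (as with an HTTP/1.0 request). Connecting and sending are still done one after another here; only the waiting for responses is multiplexed through select().

```python
import select
import socket


def fetch_with_select(targets, timeout=10.0):
    """Send each request, then read all responses in a single thread.

    targets is a list of (host, port, request_bytes) tuples.
    Returns the response bytes for each target, in the same order.
    """
    socks = {}  # socket -> index into targets
    for index, (host, port, request) in enumerate(targets):
        sock = socket.create_connection((host, port), timeout=timeout)
        sock.sendall(request)
        sock.setblocking(False)
        socks[sock] = index

    chunks = [[] for _ in targets]
    while socks:
        # Block until at least one socket has data (or has been closed).
        readable, _, _ = select.select(list(socks), [], [], timeout)
        if not readable:
            break  # nothing arrived within the timeout; give up on the rest
        for sock in readable:
            data = sock.recv(4096)
            if data:
                chunks[socks[sock]].append(data)
            else:  # empty read means the server closed the connection
                sock.close()
                del socks[sock]

    for sock in socks:  # close anything that timed out
        sock.close()
    return [b"".join(parts) for parts in chunks]
```

For a web page, the request bytes would be something like b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n" (example.com is a placeholder), and you get back the raw response including headers, so you lose the conveniences urllib gives you.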

I don't know what's the most Pythonesque option, but if you already
have specific Python code for each of your functions, it's probably
going to be easiest to spawn threads for them all.

Chris Angelico
Threading fan ever since he met OS/2 in 1993 or so
