Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #61703

Re: Optimizing list processing

Path csiph.com!usenet.pasdenom.info!weretis.net!feeder1.news.weretis.net!feeder.erje.net!eu.feeder.erje.net!newsfeed.xs4all.nl!newsfeed1.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
Return-Path <rosuav@gmail.com>
X-Original-To python-list@python.org
Delivered-To python-list@mail.python.org
X-Spam-Status OK 0.015
X-Spam-Evidence '*H*': 0.97; '*S*': 0.00; 'else:': 0.03; 'algorithm': 0.04; 'assignment': 0.07; 'suddenly': 0.07; 'cc:addr:python-list': 0.11; "wouldn't": 0.14; '1048576': 0.16; 'determining': 0.16; 'from:addr:rosuav': 0.16; 'from:name:chris angelico': 0.16; 'ideally,': 0.16; 'list"': 0.16; 'ought': 0.16; 'reasonably': 0.16; 'splits': 0.16; 'throw': 0.16; 'index': 0.16; 'wrote:': 0.18; 'trying': 0.19; 'slightly': 0.19; 'split': 0.19; 'thu,': 0.19; 'cc:addr:python.org': 0.22; 'cc:2**0': 0.24; 'gets': 0.27; 'header:In-Reply-To:1': 0.27; 'point': 0.28; '[1]': 0.29; 'room': 0.29; 'dec': 0.30; 'subject:list': 0.30; 'message- id:@mail.gmail.com': 0.30; "i'm": 0.30; '(which': 0.31; "d'aprano": 0.31; 'steven': 0.31; 'critical': 0.32; 'table': 0.34; "i'd": 0.34; 'but': 0.35; 'received:google.com': 0.35; 'there': 0.35; 'really': 0.36; 'done': 0.36; 'useful': 0.36; 'possible': 0.36; 'should': 0.36; 'half': 0.37; 'too': 0.37; 'list': 0.37; 'performance': 0.37; 'pm,': 0.38; 'does': 0.39; '12,': 0.39; 'generating': 0.39; 'even': 0.60; 'length': 0.61; 'lower': 0.61; 'matter': 0.61; "you're": 0.61; 'pick': 0.64; 'total': 0.65; 'p.s.': 0.66; 'comparable': 0.84; 'dealer': 0.84; 'high,': 0.84; 'wins...': 0.84; 'to:none': 0.92; 'rank': 0.93; '2013': 0.98
DKIM-Signature v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:cc :content-type; bh=7ZA2pwtpsW4iAUDRPUib4XhvOAtp0ZY+HBc4F/yPF8Y=; b=JVuHpM5XGiKEbVMz4xgZXCtJdC3yRmrPHAJJmkEOd2ch4wSZjA7zWEcYzP/f+jXJu1 rL5OH2DmfsoIwgfWhNxbUeBWjUqzntTdsDq5Bls6KNYl5gmSMQUd1EuPHL5Ax3LAar6D ARlcQapzUs7UKdHrJMvhiSi4NXGHnhu6AnirA8PaIL6XvmnaWMqHILsYSQNlzNIAdbHQ mvtedHfj020h+4j9uRpffwmytTLJipZf73EEM60OPEBBCdNO1qUlxz7Bt11lyTPNbJII +0Q/5lRB7B5vbfGBBgo+I5nE0wMRwlx3/3gJuBCdA5KlSmlUv8f/zvhl0xhdpF5s9M8k 4I2A==
MIME-Version 1.0
X-Received by 10.68.32.231 with SMTP id m7mr11831684pbi.22.1386851137399; Thu, 12 Dec 2013 04:25:37 -0800 (PST)
In-Reply-To <52a9a75a$0$29992$c3e8da3$5496439d@news.astraweb.com>
References <52a8fb2d$0$29992$c3e8da3$5496439d@news.astraweb.com> <mailman.3953.1386815228.18130.python-list@python.org> <52a9a75a$0$29992$c3e8da3$5496439d@news.astraweb.com>
Date Thu, 12 Dec 2013 23:25:37 +1100
Subject Re: Optimizing list processing
From Chris Angelico <rosuav@gmail.com>
Cc "python-list@python.org" <python-list@python.org>
Content-Type text/plain; charset=UTF-8
X-BeenThere python-list@python.org
X-Mailman-Version 2.1.15
Precedence list
List-Id General discussion list for the Python programming language <python-list.python.org>
List-Unsubscribe <https://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive <http://mail.python.org/pipermail/python-list/>
List-Post <mailto:python-list@python.org>
List-Help <mailto:python-list-request@python.org?subject=help>
List-Subscribe <https://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe>
Newsgroups comp.lang.python
Message-ID <mailman.3984.1386851879.18130.python-list@python.org> (permalink)
Lines 41
NNTP-Posting-Host 2001:888:2000:d::a6
X-Trace 1386851879 news.xs4all.nl 2961 [2001:888:2000:d::a6]:45553
X-Complaints-To abuse@xs4all.nl
Xref csiph.com comp.lang.python:61703

Show key headers only | View raw


On Thu, Dec 12, 2013 at 11:08 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> P.S. The algorithm I'm working on is a way of generating index and rank
> tables. Not that it really matters -- what matters is determining whether
> or not to shift from "make a copy of the list" to "modify the list in
> place".

So you're currently looking at...

if len(table) < ?????:
    table = [i for x,i in table]
else:
    for x, i in table:
        table[i] = x


Can I throw a spanner [1] in the works with other suggestions to try timing?

table[:] = [i for x,i in table]  # Does slice assignment get optimized?

SPLIT = 1048576  # Pick some useful cutoff
for part in range(0,len(table),SPLIT):
    table[part:part+SPLIT] = [i for x,i in table[part:part+SPLIT]]

If slice assignment is reasonably fast (which I suspect it is), the
one-liner should be practically identical to your small-table
one-liner. Then if the million-record splits can be done inside
memory, it ought to be possible to do this in comparable time, even if
the total table length is huge. The choice of SPLIT would then matter
a lot less than the cutoff that you're trying to find; you might have
been able to do it in half the number of sections, but that won't make
as big a difference as suddenly paging out. Ideally, what I'd like to
see is that a higher SPLIT improves performance slightly until it gets
too high, at which point you go bust and the dealer wins... but the
critical word there is "slightly", meaning that it wouldn't cost too
much for SPLIT to be lower than it needs to be. That's the criteria
for the experiment; do you have the data on which to try it?

[1] Monkey wrench, for the Yanks in the room

ChrisA

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Optimizing list processing Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-12-11 23:54 +0000
  Re: Optimizing list processing MRAB <python@mrabarnett.plus.com> - 2013-12-12 00:59 +0000
    Re: Optimizing list processing Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-12-12 01:43 +0000
      Re: Optimizing list processing MRAB <python@mrabarnett.plus.com> - 2013-12-12 02:09 +0000
  Re: Optimizing list processing duncan smith <buzzard@invalid.invalid> - 2013-12-12 01:02 +0000
  Re: Optimizing list processing Ben Finney <ben+python@benfinney.id.au> - 2013-12-12 12:18 +1100
  Re: Optimizing list processing Terry Reedy <tjreedy@udel.edu> - 2013-12-11 21:26 -0500
    Re: Optimizing list processing Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-12-12 12:08 +0000
      Re: Optimizing list processing Chris Angelico <rosuav@gmail.com> - 2013-12-12 23:25 +1100
      Re: Optimizing list processing MRAB <python@mrabarnett.plus.com> - 2013-12-12 13:32 +0000
      Re: Optimizing list processing Chris Angelico <rosuav@gmail.com> - 2013-12-13 01:06 +1100
      Re: Optimizing list processing Terry Reedy <tjreedy@udel.edu> - 2013-12-12 13:40 -0500
        Re: Optimizing list processing Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-12-13 00:14 +0000
          Re: Optimizing list processing Chris Angelico <rosuav@gmail.com> - 2013-12-13 12:01 +1100
  Re: Optimizing list processing Stefan Behnel <stefan_ml@behnel.de> - 2013-12-12 12:09 +0100
  Re: Optimizing list processing Peter Otten <__peter__@web.de> - 2013-12-12 16:08 +0100
    Re: Optimizing list processing Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-12-13 03:01 +0000
      Re: Optimizing list processing rusi <rustompmody@gmail.com> - 2013-12-12 21:35 -0800
  Re: Optimizing list processing Terry Reedy <tjreedy@udel.edu> - 2013-12-12 13:07 -0500

csiph-web