From: Oscar Benjamin
Date: Fri, 5 Jul 2013 16:26:20 +0100
Subject: Re: How to make this faster
To: Helmut Jarausch
Cc: python-list@python.org
Newsgroups: comp.lang.python

On 5 July 2013 15:48, Helmut Jarausch wrote:
> On Fri, 05 Jul 2013 12:02:21 +0000, Steven D'Aprano wrote:
>
>> On Fri, 05 Jul 2013 10:53:35 +0000, Helmut Jarausch wrote:
>>
>>> Since I don't do any numerical stuff with the arrays, Numpy doesn't seem
>>> to be a good choice. I think this is an argument to add real arrays to
>>> Python.
>>
>> Guido's time machine strikes again:
>>
>> import array
>>
>>
>> By the way, I'm not exactly sure how you go from "I don't do numerical
>> calculations on numpy arrays" to "therefore Python should have arrays".
>
> I should have been more clear. I meant multi-dimensional arrays (2D, at least)
> Numpy is fine if I do math with matrices (without loops in python).
>
> Given that I don't like to use the old FORTRAN way (when "dynamic" arrays are passed to
> functions) of indexing a 2-d array I would need a MACRO or an INLINED function in Python
> or something like a META-compiler phase transforming
>
> def access2d(v,i,j,dim1) : # doesn't work on the l.h.s.
>     return v[i*dim1+j]
>
> access2d(v,i,j,dim1) = 7 # at compile time, please
>
> to
>
> v[i*dim1+j]= 7 # this, by itself, is considered ugly (the FORTRAN way)

The list of lists approach works fine for what you're doing. I don't
think that a[r][c] is that much worse than a[r, c]. It's only when you
want to do something like a[:, c] that it breaks down. In any case,
your algorithm would work better with Python's set/dict/list types
than numpy arrays.

One of the reasons that it's faster to use lists than numpy arrays (as
you found out) is precisely because the N-dimensional array logic
complicates 1-dimensional processing. I've seen discussions in Cython
and numpy about lighter-weight 1-dimensional array types for this
reason.

The other reason that numpy arrays are slower for what you're doing is
that (just like the stdlib array type Steven referred to) they use
homogeneous types in a contiguous buffer and each element is not a
Python object in its own right until you access it with e.g. a[0].
That means that the numpy array has to create a new object every time
you index into it whereas the list can simply return a new reference
to an existing object. You can get the same effect with numpy arrays
by using dtype=object but I'd still expect it to be slower for this.


Oscar
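
P.S. A minimal sketch of the two points above, in case it helps. It
uses the stdlib array type rather than numpy so that it runs without
extra dependencies; the grid size and all variable names are made up
for illustration, and the identity check at the end reflects how
CPython behaves rather than anything the language guarantees.

```python
from array import array

# Hypothetical 3x4 grid; the size and names are illustrative only.
ROWS, COLS = 3, 4

# Build each row separately; [[0] * COLS] * ROWS would alias one row.
grid = [[0] * COLS for _ in range(ROWS)]

# a[r][c] works on both sides of an assignment, no macro needed.
grid[1][2] = 7
value = grid[1][2]

# The a[:, c] case is where it gets clumsy: the column has to be
# copied out explicitly.
column = [row[2] for row in grid]

# A list stores references, so indexing returns the stored object.
floats = [1.5, 2.5, 3.5]
same_from_list = floats[0] is floats[0]

# array.array stores raw C doubles in a contiguous buffer, so each
# index operation has to build a new float object (CPython behaviour;
# other implementations may differ).
packed = array("d", floats)
first, again = packed[0], packed[0]
same_from_array = first is again

print(value, column, same_from_list, same_from_array)
```

The values compare equal either way; only the object identity differs,
and that extra allocation on every access is the cost I was referring
to.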