
Re: looping and searching in numpy array

From Oscar Benjamin <oscar.j.benjamin@gmail.com>
Newsgroups comp.lang.python
Subject Re: looping and searching in numpy array
Date Mon, 14 Mar 2016 15:22:30 +0000
Message-ID <mailman.104.1457968972.12893.python-list@python.org> (permalink)
References <77bd470b-cc05-4117-9ed1-6309d7a5633a@googlegroups.com> <nbrr93$5ee$1@ger.gmane.org>



On 10 March 2016 at 13:02, Peter Otten <__peter__@web.de> wrote:
> Heli wrote:
>
>> I need to loop over a numpy array and then do the following search. The
>> following is taking almost 60(s) for an array (npArray1 and npArray2 in
>> the example below) with around 300K values.
>>
>>
>> for id in np.nditer(npArray1):
>>        newId=(np.where(npArray2==id))[0][0]

What are the dtypes of the arrays, and what are their typical sizes?
Both can have a big effect on what makes a good solution to the
problem.

>> Is there any way I can make the above faster? I need to run the script
>> above on much bigger arrays (50M). Please note that my two numpy arrays in
>> the lines above, npArray1 and npArray2  are not necessarily the same size,
>> but they are both 1d.
>
> You mean you are looking for the index of the first occurrence in npArray2
> for every value of npArray1?
>
> I don't know how to do this in numpy (I'm not an expert), but even basic
> Python might be acceptable:

I'm not sure that numpy has any particular function that can be of use
here. Your approach below looks good though.

> lookup = {}
> for i, v in enumerate(npArray2):
>     if v not in lookup:
>         lookup[v] = i

Looking at this I wondered if there was a way to avoid the double hash
table lookup and realised it's the first time I've ever considered a
use for setdefault:

lookup = {}
for i, v in enumerate(npArray2):
    # key is the value, stored item is the index of its first occurrence
    lookup.setdefault(v, i)
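
A minimal end-to-end sketch of the dictionary approach, for illustration
only. The sample data and the name new_ids are made up, and it assumes
that every value of npArray1 occurs somewhere in npArray2:

```python
import numpy as np

# Made-up sample data standing in for the real arrays.
npArray2 = np.array([7, 3, 7, 9, 3, 5])
npArray1 = np.array([3, 9, 7, 5])

# Map each value to the index of its first occurrence in npArray2.
# .tolist() converts to plain Python ints before the loop.
lookup = {}
for i, v in enumerate(npArray2.tolist()):
    lookup.setdefault(v, i)  # stores i only if v is not already a key

# Assumes every value of npArray1 is a key; a missing value raises KeyError.
new_ids = np.array([lookup[v] for v in npArray1.tolist()])
print(new_ids)  # [1 3 0 5]
```

Building the dictionary is a single pass over npArray2, and each lookup
afterwards is a hash table access rather than a scan of the whole array.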

Another option would be to use this same algorithm in Cython. Then you
can access the ndarray data pointer directly and loop over it in C.
This is the kind of scenario where that sort of thing can be well
worth doing.
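
Short of Cython, one possibility that stays within numpy is to swap the
hash table for sorting plus binary search. This is only a sketch under
assumptions: the sample data, the name new_ids and the use of -1 to mark
values that are missing from npArray2 are all illustrative choices.

```python
import numpy as np

# Made-up sample data standing in for the real arrays.
npArray2 = np.array([7, 3, 7, 9, 3, 5])
npArray1 = np.array([3, 9, 7, 5])

# uniq holds the sorted unique values of npArray2; first_idx[k] is the
# index in npArray2 of the first occurrence of uniq[k].
uniq, first_idx = np.unique(npArray2, return_index=True)

# Binary search for each value of npArray1 in the sorted unique values.
pos = np.searchsorted(uniq, npArray1)

# searchsorted can return len(uniq) for values larger than the maximum,
# so clip before indexing, then check that the value was really found.
pos = np.minimum(pos, len(uniq) - 1)
found = uniq[pos] == npArray1

# -1 marks values of npArray1 that do not occur in npArray2 (an assumption).
new_ids = np.where(found, first_idx[pos], -1)
print(new_ids)  # [1 3 0 5]
```

Whether this beats the dictionary loop depends on the dtypes and sizes
involved, so it is worth timing both on representative data.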

--
Oscar



