Re: looping and searching in numpy array

From Oscar Benjamin <oscar.j.benjamin@gmail.com>
Newsgroups comp.lang.python
Subject Re: looping and searching in numpy array
Date 2016-03-14 15:22 +0000
Message-ID <mailman.104.1457968972.12893.python-list@python.org>
References <77bd470b-cc05-4117-9ed1-6309d7a5633a@googlegroups.com> <nbrr93$5ee$1@ger.gmane.org>


On 10 March 2016 at 13:02, Peter Otten <__peter__@web.de> wrote:
> Heli wrote:
>
>> I need to loop over a numpy array and then do the following search. The
>> following is taking almost 60(s) for an array (npArray1 and npArray2 in
>> the example below) with around 300K values.
>>
>>
>> for id in np.nditer(npArray1):
>>        newId=(np.where(npArray2==id))[0][0]

What are the dtypes of the arrays? And what are the typical sizes of
each of them? Both can have a big effect on what makes a good solution
to the problem.

>> Is there any way I can make the above faster? I need to run the script
>> above on much bigger arrays (50M). Please note that my two numpy arrays in
>> the lines above, npArray1 and npArray2  are not necessarily the same size,
>> but they are both 1d.
>
> You mean you are looking for the index of the first occurrence in npArray2
> for every value of npArray1?
>
> I don't know how to do this in numpy (I'm not an expert), but even basic
> Python might be acceptable:

I'm not sure that numpy has any particular function that can be of use
here. Your approach below looks good though.

> lookup = {}
> for i, v in enumerate(npArray2):
>     if v not in lookup:
>         lookup[v] = i

Looking at this I wondered if there was a way to avoid the double hash
table lookup and realised it's the first time I've ever considered a
use for setdefault:

lookup = {}
for i, v in enumerate(npArray2):
    lookup.setdefault(v, i)
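To make the idea concrete, here is a minimal end-to-end sketch of the
dictionary approach. The helper name first_index_lookup and the sample
arrays are made up for illustration, and it assumes every value of
npArray1 actually occurs in npArray2 (otherwise the final lookup raises
KeyError). Note that the value is the key and the index is what gets
stored, so setdefault keeps the index of the first occurrence only.

```python
import numpy as np


def first_index_lookup(haystack):
    """Map each value to the index of its first occurrence in haystack.

    Hypothetical helper, not part of numpy.
    """
    lookup = {}
    # tolist() yields plain Python scalars, which hash faster than
    # numpy scalar objects.
    for i, v in enumerate(haystack.tolist()):
        # setdefault stores i only if v is not already a key, so later
        # duplicates do not overwrite the first index.
        lookup.setdefault(v, i)
    return lookup


# Small made-up arrays standing in for the real data.
npArray2 = np.array([30, 10, 20, 10, 30])
npArray1 = np.array([10, 30, 20])

lookup = first_index_lookup(npArray2)

# One dictionary lookup per element of npArray1 instead of one full
# scan of npArray2 per element.
new_ids = np.array([lookup[v] for v in npArray1.tolist()])
```

This does one pass over each array rather than a full np.where scan of
npArray2 for every element of npArray1, which is where the original
loop spends its time.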

Another option would be to use this same algorithm in Cython. Then you
can access the ndarray data pointer directly and loop over it in C.
This is the kind of scenario where that sort of thing can be well
worth doing.

--
Oscar

Thread

looping and searching in numpy array Heli <hemla21@gmail.com> - 2016-03-10 03:43 -0800
  Re: looping and searching in numpy array Peter Otten <__peter__@web.de> - 2016-03-10 14:02 +0100
    Re: looping and searching in numpy array Heli <hemla21@gmail.com> - 2016-03-10 08:48 -0800
      Re: looping and searching in numpy array Heli <hemla21@gmail.com> - 2016-03-10 08:50 -0800
      RE: looping and searching in numpy array Albert-Jan Roskam <sjeik_appie@hotmail.com> - 2016-03-13 13:51 +0000
      RE: looping and searching in numpy array Albert-Jan Roskam <sjeik_appie@hotmail.com> - 2016-03-13 15:43 +0000
  Re: looping and searching in numpy array Mark Lawrence <breamoreboy@yahoo.co.uk> - 2016-03-10 13:22 +0000
  Re: looping and searching in numpy array srinivas devaki <mr.eightnoteight@gmail.com> - 2016-03-14 10:19 +0530
  Re: looping and searching in numpy array Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-03-14 15:22 +0000
