Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #104799

Re: looping and searching in numpy array

Path csiph.com!fu-berlin.de!uni-berlin.de!not-for-mail
From srinivas devaki <mr.eightnoteight@gmail.com>
Newsgroups comp.lang.python
Subject Re: looping and searching in numpy array
Date Mon, 14 Mar 2016 10:19:44 +0530
Lines 70
Message-ID <mailman.85.1457930987.12893.python-list@python.org> (permalink)
References <77bd470b-cc05-4117-9ed1-6309d7a5633a@googlegroups.com> <CACs7g=DL8UExWQYJOSbFEQDvNvY3M46MRP==jFBisS4tytxFxg@mail.gmail.com>
Mime-Version 1.0
Content-Type text/plain; charset=UTF-8
X-Trace news.uni-berlin.de xE1k4GNshdJCvlIX8hHBOwCXHbj8S0GA/3bFSPbgmRxw==
Return-Path <mr.eightnoteight@gmail.com>
X-Original-To python-list@python.org
Delivered-To python-list@mail.python.org
X-Spam-Status OK 0.012
X-Spam-Evidence '*H*': 0.98; '*S*': 0.00; 'else:': 0.03; 'below)': 0.07; 'cc:addr:python-list': 0.09; 'dict': 0.09; 'anyway': 0.11; 'size,': 0.13; '+91': 0.15; '-1:': 0.16; '2016': 0.16; 'numpy': 0.16; 'optimised': 0.16; 'received:io': 0.16; 'received:psf.io': 0.16; 'subject:array': 0.16; 'subject:looping': 0.16; 'wrote:': 0.16; '&lt;': 0.18; 'element': 0.18; 'email addr:gmail.com&gt;': 0.18; 'all,': 0.20; 'student': 0.20; 'cc:2**0': 0.20; 'cc:addr:python.org': 0.20; 'algorithm': 0.20; 'arrays': 0.22; 'junior': 0.22; '8bit%:5': 0.23; 'absolute': 0.23; 'bigger': 0.23; 'header:In-Reply-To:1': 0.24; 'all.': 0.24; 'script': 0.25; "doesn't": 0.26; 'example': 0.26; 'message-id:@mail.gmail.com': 0.27; 'search.': 0.29; 'array': 0.29; 'url:mailman': 0.30; 'code': 0.30; 'related': 0.32; 'help,': 0.32; 'run': 0.33; 'problem': 0.33; 'url:python': 0.33; 'values.': 0.33; 'url:listinfo': 0.34; 'received:google.com': 0.35; 'but': 0.36; 'there': 0.36; 'url:org': 0.36; 'lines': 0.36; 'received:209.85': 0.36; 'cases': 0.36; 'indian': 0.36; 'pm,': 0.36; 'subject:: ': 0.37; 'two': 0.37; 'skip:& 10': 0.37; 'thanks': 0.37; 'received:209.85.213': 0.37; 'doing': 0.38; 'difference': 0.38; 'received:209': 0.38; 'minimum': 0.38; 'means': 0.39; 'data': 0.39; 'skip:e 20': 0.39; 'url:mail': 0.40; 'your': 0.60; 'maximum': 0.61; 'school': 0.62; 'skip:n 10': 0.62; 'above,': 0.63; 'necessarily': 0.63; 'between': 0.65; 'mar': 0.65; 'better.': 0.66; 'skip:\xc2 10': 0.67; 'dear': 0.67; '8bit%:100': 0.70; '(3rd': 0.84; '5:15': 0.84; 'complexity': 0.84; 'ph:': 0.84
DKIM-Signature v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc; bh=4rKS82e0ChjhttRw38IT9Z58Fb++2TbR7TuQ/AEkKuk=; b=Dfyfg/nvvscU3ULv8XieqAxgGAHPXp5XR0hK++XNxdJIsH+KizfpHT/gEqq0NF+E5l zZBStkGWdko2XOAuPsEcJQJUK1MMX0N5OiLGB2cgvQd7jB9rWBs05NSTTjWRioouLEM4 xl5q5PDRq6qd8qWVnGGMLJ1UoGtq0f3MYuf7kiuEptjZN3Tr6yEGOiUI5+Wt12QynmOn ZtgzHrGfe2Dy0Fng928yZbYFKR1onn/IJxQIsLxFoSsICR74PqLhOO6hefly3DFi/ixA Hz5qQWbwTy9drIel0l6Bz4Hhtsz50+8pJ083Byzgc0o9ttNPY8gt4Hy4/s5PVR8gIwnA rwZA==
X-Google-DKIM-Signature v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc; bh=4rKS82e0ChjhttRw38IT9Z58Fb++2TbR7TuQ/AEkKuk=; b=gSFt9mEAieKxRT+uVRn+Fz6JkAZ4GzAQAsTylrRCwSKb41HoC/AWJGcDS2zzWoCb49 EKaA7FtEPjKyH+P0pO3/d5LpHw7dvd3cNwnfCDPHiKLDDpPnsa7tpSrahm6BcE5PWkrY MQJvRNBYCD+cGNMqHRgUnpUgxDHkKMvR5+1XmGY7EHd23qSKr+IktKYn+bjgU5/R/xrk 5v19djC98WA3191H9Z520Fd3B4pfK1lgbr9URCFM5mJhzQGcWFsXhb03pfbikAYKXntT 4O1fbWNXJfcGchDAiZWjdEaxP9ZMFnsAsXGg+Ff11TOsy+u8cVISHu4pbAxyyqfxGbS8 w+zw==
X-Gm-Message-State AD7BkJKCHLMNcGpbJVN+EHV24YgFmMu5zbushD+iahn0Gop92N9ye/TZXAp59G5vk9fUXOCl41YITWvhTjMfAQ==
X-Received by 10.50.178.180 with SMTP id cz20mr14740039igc.44.1457930984751; Sun, 13 Mar 2016 21:49:44 -0700 (PDT)
In-Reply-To <CACs7g=DL8UExWQYJOSbFEQDvNvY3M46MRP==jFBisS4tytxFxg@mail.gmail.com>
X-Content-Filtered-By Mailman/MimeDel 2.1.21
X-BeenThere python-list@python.org
X-Mailman-Version 2.1.21
Precedence list
List-Id General discussion list for the Python programming language <python-list.python.org>
List-Unsubscribe <https://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive <http://mail.python.org/pipermail/python-list/>
List-Post <mailto:python-list@python.org>
List-Help <mailto:python-list-request@python.org?subject=help>
List-Subscribe <https://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe>
Xref csiph.com comp.lang.python:104799

Show key headers only | View raw


problem is infact not related to numpy at all. the complexity of your
algorithm is O(len(npArray1) * len(npArray2))

which means the number of computations that you are doing is in the range
of 10**10,

if the absolute difference between the maximum element and minimum element
is less than 10**6, you can improve your code by pre-computing the first
occurrence of a number by using an array of size of that difference(afore
mentioned).

if your npArray2 doesn't have such a pattern, you have to precompute it by
using a dict (I don't know if numpy has such data structure)

an optimised pseudo code would look like

mmdiff = max(npArray2) - min(npArray2)
if mmdiff < 10**6:
    precom = np.array([-1] * mmdiff)
    offset = min(npArray2)
    for i, x in enumerate(npArray2):
        precom[x - offset] = i
    for id in npArray1:
        if 0 <= id - offset < mmdiff and precom[id - offset] != -1:
            new_id = precom[id]
            # your code
else:
    precom = {}
    for i, x in enumerate(npArray1):
        if x not in precom:
            precom[x] = i
    for id in npArray1:
        if id in precom:
            new_id = precom[id]
            # your code


you can just use the else case which will work for all cases but if your
npArray2 has such a pattern then the above code will perform better.

Regards
Srinivas Devaki
Junior (3rd yr) student at Indian School of Mines,(IIT Dhanbad)
Computer Science and Engineering Department
ph: +91 9491 383 249
telegram_id: @eightnoteight
On Mar 10, 2016 5:15 PM, "Heli" <hemla21@gmail.com> wrote:

Dear all,

I need to loop over a numpy array and then do the following search. The
following is taking almost 60(s) for an array (npArray1 and npArray2 in the
example below) with around 300K values.


for id in np.nditer(npArray1):

       newId=(np.where(npArray2==id))[0][0]


Is there anyway I can make the above faster? I need to run the script above
on much bigger arrays (50M). Please note that my two numpy arrays in the
lines above, npArray1 and npArray2  are not necessarily the same size, but
they are both 1d.


Thanks a lot for your help,

--
https://mail.python.org/mailman/listinfo/python-list

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

looping and searching in numpy array Heli <hemla21@gmail.com> - 2016-03-10 03:43 -0800
  Re: looping and searching in numpy array Peter Otten <__peter__@web.de> - 2016-03-10 14:02 +0100
    Re: looping and searching in numpy array Heli <hemla21@gmail.com> - 2016-03-10 08:48 -0800
      Re: looping and searching in numpy array Heli <hemla21@gmail.com> - 2016-03-10 08:50 -0800
      RE: looping and searching in numpy array Albert-Jan Roskam <sjeik_appie@hotmail.com> - 2016-03-13 13:51 +0000
      RE: looping and searching in numpy array Albert-Jan Roskam <sjeik_appie@hotmail.com> - 2016-03-13 15:43 +0000
  Re: looping and searching in numpy array Mark Lawrence <breamoreboy@yahoo.co.uk> - 2016-03-10 13:22 +0000
  Re: looping and searching in numpy array srinivas devaki <mr.eightnoteight@gmail.com> - 2016-03-14 10:19 +0530
  Re: looping and searching in numpy array Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-03-14 15:22 +0000

csiph-web