numpy.where() and multiple comparisons

Started by	John Ladasky <john_ladasky@sbcglobal.net>
First post	2014-01-17 17:51 -0800
Last post	2014-01-18 19:12 -0500
Articles	6 — 5 participants

  numpy.where() and multiple comparisons John Ladasky <john_ladasky@sbcglobal.net> - 2014-01-17 17:51 -0800
    Re: numpy.where() and multiple comparisons duncan smith <buzzard@invalid.invalid> - 2014-01-18 02:16 +0000
      Re: numpy.where() and multiple comparisons John Ladasky <john_ladasky@sbcglobal.net> - 2014-01-17 20:00 -0800
        Re: numpy.where() and multiple comparisons Peter Otten <__peter__@web.de> - 2014-01-18 09:50 +0100
          Re: numpy.where() and multiple comparisons Tim Roberts <timr@probo.com> - 2014-01-18 13:20 -0800
        'and' is not exactly an 'operator' (was Re: numpy.where() and multiple comparisons) Terry Reedy <tjreedy@udel.edu> - 2014-01-18 19:12 -0500

#64211 — numpy.where() and multiple comparisons

From	John Ladasky <john_ladasky@sbcglobal.net>
Date	2014-01-17 17:51 -0800
Subject	numpy.where() and multiple comparisons
Message-ID	<5617a90f-3d9b-48b2-b449-9e5ef4c181e5@googlegroups.com>

Hi folks,

I am awaiting my approval to join the numpy-discussion mailing list, at scipy.org.  I realize that would be the best place to ask my question.  However, numpy is so widely used, I figure that someone here would be able to help.

I like to use numpy.where() to select parts of arrays.  I have encountered what I would consider to be a bug when you try to use where() in conjunction with the multiple comparison syntax of Python.  Here's a minimal example:

Python 3.3.2+ (default, Oct  9 2013, 14:50:09) 
[GCC 4.8.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> b = np.where(a < 5)
>>> b
(array([0, 1, 2, 3, 4]),)
>>> c = np.where(2 < a < 7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Defining b works as I want and expect.  The array contains the indices (not the values) of a where a < 5.

For my definition of c, I expect (array([3, 4, 5, 6]),).  As you can see, I get a ValueError instead.  I have seen the error message about "the truth value of an array with more than one element" before, and generally I understand how I (accidentally) provoke it.  This time, I don't see it.  In defining c, I expect to be stepping through a, one element at a time, just as I did when defining b.

Does anyone understand why this happens?  Is there a smart work-around?  Thanks.

#64212

From	duncan smith <buzzard@invalid.invalid>
Date	2014-01-18 02:16 +0000
Message-ID	<52d9e408$0$29769$862e30e2@ngroups.net>
In reply to	#64211

On 18/01/14 01:51, John Ladasky wrote:
> Hi folks,
>
> I am awaiting my approval to join the numpy-discussion mailing list, at scipy.org.  I realize that would be the best place to ask my question.  However, numpy is so widely used, I figure that someone here would be able to help.
>
> I like to use numpy.where() to select parts of arrays.  I have encountered what I would consider to be a bug when you try to use where() in conjunction with the multiple comparison syntax of Python.  Here's a minimal example:
>
> Python 3.3.2+ (default, Oct  9 2013, 14:50:09)
> [GCC 4.8.1] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy as np
> >>> a = np.arange(10)
> >>> a
> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
> >>> b = np.where(a < 5)
> >>> b
> (array([0, 1, 2, 3, 4]),)
> >>> c = np.where(2 < a < 7)
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
>
> Defining b works as I want and expect.  The array contains the indices (not the values) of a where a < 5.
>
> For my definition of c, I expect (array([3, 4, 5, 6]),).  As you can see, I get a ValueError instead.  I have seen the error message about "the truth value of an array with more than one element" before, and generally I understand how I (accidentally) provoke it.  This time, I don't see it.  In defining c, I expect to be stepping through a, one element at a time, just as I did when defining b.
>
> Does anyone understand why this happens?  Is there a smart work-around?  Thanks.
>


 >>> a = np.arange(10)
 >>> c = np.where((2 < a) & (a < 7))
 >>> c
(array([3, 4, 5, 6]),)
 >>>
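
For anyone who wants to run this as a script, here is a minimal sketch of the same workaround next to a couple of equivalent spellings. The variable names (mask, idx, idx_func, flat, chained_failed) are only illustrative, and it assumes a reasonably recent numpy.

```python
import numpy as np

a = np.arange(10)

# The chained comparison raises, because Python needs a single truth
# value for the intermediate result (2 < a).
try:
    np.where(2 < a < 7)
    chained_failed = False
except ValueError:
    chained_failed = True

# Element-wise AND of two boolean masks. The parentheses are required
# because & binds more tightly than the comparison operators.
mask = (2 < a) & (a < 7)

# With a single argument, np.where returns a tuple of index arrays.
idx = np.where(mask)

# np.logical_and is the function form of the same element-wise operation.
idx_func = np.where(np.logical_and(a > 2, a < 7))

# np.flatnonzero returns the indices directly as a 1-D array.
flat = np.flatnonzero(mask)
```

If the values rather than the indices are wanted, indexing with the mask itself (a[mask]) avoids np.where altogether.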

Duncan

#64215

From	John Ladasky <john_ladasky@sbcglobal.net>
Date	2014-01-17 20:00 -0800
Message-ID	<39d6bb6a-ce34-469e-8cb3-24a14331d6c5@googlegroups.com>
In reply to	#64212

On Friday, January 17, 2014 6:16:28 PM UTC-8, duncan smith wrote:

>  >>> a = np.arange(10)
>  >>> c = np.where((2 < a) & (a < 7))
>  >>> c
> (array([3, 4, 5, 6]),)

Nice!  Thanks!

Now, why does the multiple comparison fail, if you happen to know?

#64220

From	Peter Otten <__peter__@web.de>
Date	2014-01-18 09:50 +0100
Message-ID	<mailman.5673.1390034994.18130.python-list@python.org>
In reply to	#64215

John Ladasky wrote:

> On Friday, January 17, 2014 6:16:28 PM UTC-8, duncan smith wrote:
> 
>>  >>> a = np.arange(10)
>>  >>> c = np.where((2 < a) & (a < 7))
>>  >>> c
>> (array([3, 4, 5, 6]),)
> 
> Nice!  Thanks!
> 
> Now, why does the multiple comparison fail, if you happen to know?

2 < a < 7

is equivalent to

2 < a and a < 7

Unlike `&`, `and` cannot be overridden (*), so the above implies that the 
boolean value bool(2 < a) is evaluated. That triggers the error because the 
numpy authors refused to guess -- and rightly so, as both implementable 
options (a.any() and a.all()) would be wrong in a common case like yours.

(*) I assume overriding would collide with the short-circuiting of boolean 
expressions.
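
The mechanism can be reproduced without numpy. The sketch below uses two hypothetical stand-in classes, Vec and Mask (they are not numpy types, just minimal imitations), to show that the chained form ends up calling bool() on the intermediate result while the `&` form does not.

```python
class Mask:
    """Hypothetical stand-in for a boolean array."""

    def __init__(self, flags):
        self.flags = list(flags)

    def __bool__(self):
        # Like numpy, refuse to guess between any() and all().
        raise ValueError("truth value of a multi-element mask is ambiguous")

    def __and__(self, other):
        # Element-wise AND, invoked by the & operator.
        return Mask(x and y for x, y in zip(self.flags, other.flags))


class Vec:
    """Hypothetical stand-in for a one-dimensional numeric array."""

    def __init__(self, values):
        self.values = list(values)

    def __lt__(self, other):
        # Handles vec < scalar.
        return Mask(v < other for v in self.values)

    def __gt__(self, other):
        # Handles vec > scalar, and also scalar < vec via reflection.
        return Mask(v > other for v in self.values)


a = Vec(range(10))

# 2 < a < 7 means (2 < a) and (a < 7); 'and' calls bool() on the Mask.
try:
    2 < a < 7
    chained_error = None
except ValueError as exc:
    chained_error = str(exc)

# The & form never asks for a single truth value.
combined = (2 < a) & (a < 7)
indices = [i for i, flag in enumerate(combined.flags) if flag]
```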

#64258

From	Tim Roberts <timr@probo.com>
Date	2014-01-18 13:20 -0800
Message-ID	<iqrld951pjnptvr9767khlv4v859qm679c@4ax.com>
In reply to	#64220

Peter Otten <__peter__@web.de> wrote:

>John Ladasky wrote:
>
>> On Friday, January 17, 2014 6:16:28 PM UTC-8, duncan smith wrote:
>> 
>>>  >>> a = np.arange(10)
>>>  >>> c = np.where((2 < a) & (a < 7))
>>>  >>> c
>>> (array([3, 4, 5, 6]),)
>> 
>> Nice!  Thanks!
>> 
>> Now, why does the multiple comparison fail, if you happen to know?
>
>2 < a < 7
>
>is equivalent to
>
>2 < a and a < 7
>
>Unlike `&` `and` cannot be overridden (*)...

And just in case it isn't obvious to the original poster, the expression "2
< a" only works because the numpy.ndarray class overrides the comparison
operators.  Python's int has no idea how to compare itself to a
numpy.ndarray object, so int.__lt__ returns NotImplemented and Python falls
back to the array's reflected method, __gt__.

Similarly, (2 < a) & (a < 7) works because numpy.ndarray has an override for
the "&" operator.  So, that expression is evaluated roughly as

    numpy.ndarray.__and__(
        numpy.ndarray.__gt__(a, 2),
        numpy.ndarray.__lt__(a, 7)
    )

As Peter said, there's no way to override the "and" operator.
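
The dispatch can be checked by calling the special methods by hand. This is only a sketch of what the interpreter does internally; the names forward, left, right, explicit and infix are illustrative.

```python
import numpy as np

a = np.arange(10)

# int does not know how to compare itself with an ndarray.
forward = (2).__lt__(a)

# Python therefore falls back to the array's reflected comparison.
left = np.ndarray.__gt__(a, 2)
right = np.ndarray.__lt__(a, 7)

# The & operator dispatches to ndarray.__and__ (element-wise AND).
explicit = np.ndarray.__and__(left, right)

# The ordinary infix spelling gives the same boolean mask.
infix = (2 < a) & (a < 7)
```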
-- 
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

#64271 — 'and' is not exactly an 'operator' (was Re: numpy.where() and multiple comparisons)

From	Terry Reedy <tjreedy@udel.edu>
Date	2014-01-18 19:12 -0500
Subject	'and' is not exactly an 'operator' (was Re: numpy.where() and multiple comparisons)
Message-ID	<mailman.5696.1390090373.18130.python-list@python.org>
In reply to	#64215

On 1/18/2014 3:50 AM, Peter Otten wrote:

> Unlike `&` `and` cannot be overridden (*),

> (*) I assume overriding would collide with short-cutting of boolean
> expressions.

Yes. 'and' could be called a 'control-flow operator', but in Python it 
is not a functional operator.

A functional binary operator expression like 'a + b' abbreviates a 
function call, without using (). In this case, it could be written 
'operator.add(a,b)'. This function, or its internal equivalent, calls 
either a.__add__(b) or b.__radd__(a) or both. It is the overloading of 
the special methods that overrides the operator.

The control flow expression 'a and b' cannot abbreviate a function call 
because Python calls always evaluate all arguments first. It is 
equivalent* to the conditional (control flow) *expression* (also not a 
function operator) 'a if not a else b'. Evaluation of either expression 
calls bool(a) and hence a.__bool__ or a.__len__.

'a or b' is equivalent* to 'a if a else b'

* 'a (and/or) b' evaluates 'a' once, whereas 'a if (not/)a else b' 
evaluates 'a' twice. This is not equivalent when there are side-effects. 
Here is an example where this matters.
  input('enter a non-0 number :') or 1
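
The difference in evaluation counts can be observed directly. This is a small sketch using a hypothetical helper, noisy(), that records every time its argument is evaluated; the Empty class is likewise only an illustration of the __len__ fallback.

```python
import operator

calls = []


def noisy(value):
    """Hypothetical helper: record each evaluation, then return the value."""
    calls.append(value)
    return value


# A functional operator is shorthand for a call that evaluates both operands.
total = operator.add(2, 3)

# 'or' evaluates its left operand once and stops if it is truthy.
calls.clear()
r_or = noisy(5) or 1
n_or = len(calls)

# The conditional-expression spelling evaluates the operand twice.
calls.clear()
r_cond = noisy(5) if noisy(5) else 1
n_cond = len(calls)

# 'and' skips its right operand when the left one is falsy.
calls.clear()
r_and = noisy(0) and noisy(99)
and_calls = list(calls)


class Empty:
    """Defines no __bool__, so truth testing falls back to __len__."""

    def __len__(self):
        return 0


fallback = Empty() or "fallback"
```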

-- 
Terry Jan Reedy
