Re: NaN comparisons - Call For Anecdotes

Started by: Chris Angelico <rosuav@gmail.com>
First post: 2014-07-09 01:19 +1000
Last post: 2014-07-08 18:10 +0000
Articles: 8 (3 participants)

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: NaN comparisons - Call For Anecdotes Chris Angelico <rosuav@gmail.com> - 2014-07-09 01:19 +1000
    Re: NaN comparisons - Call For Anecdotes Marko Rauhamaa <marko@pacujo.net> - 2014-07-08 19:16 +0300
      Re: NaN comparisons - Call For Anecdotes Chris Angelico <rosuav@gmail.com> - 2014-07-09 02:27 +1000
        Re: NaN comparisons - Call For Anecdotes Marko Rauhamaa <marko@pacujo.net> - 2014-07-08 20:31 +0300
          Re: NaN comparisons - Call For Anecdotes Chris Angelico <rosuav@gmail.com> - 2014-07-09 03:54 +1000
          Re: NaN comparisons - Call For Anecdotes Chris Angelico <rosuav@gmail.com> - 2014-07-09 03:57 +1000
          Re: NaN comparisons - Call For Anecdotes Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-07-08 18:12 +0000
      Re: NaN comparisons - Call For Anecdotes Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-07-08 18:10 +0000

#74171 — Re: NaN comparisons - Call For Anecdotes

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-07-09 01:19 +1000
Subject: Re: NaN comparisons - Call For Anecdotes
Message-ID: <mailman.11630.1404832805.18130.python-list@python.org>

On Wed, Jul 9, 2014 at 12:53 AM, Anders J. Munch <2014@jmunch.dk> wrote:
> In the end I came up with this hack: Every time I struct.unpack'd a
> float, I check if it's a NaN, and if it is, then I replace it with a
> reference to a single, shared, "canonical" NaN. That means that
> container objects that skip __eq__ when comparing an object to
> itself will work -- e.g. hash keys.
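
The hack described in the quote can be sketched roughly as follows. This is only an illustration, assuming little-endian 8-byte doubles; the names `CANONICAL_NAN` and `unpack_double` are hypothetical and do not come from the original post.

```python
import math
import struct

# Hypothetical shared NaN object; every decoded NaN is replaced by it.
CANONICAL_NAN = float("nan")

def unpack_double(data: bytes) -> float:
    """Decode one little-endian double, mapping any NaN to CANONICAL_NAN."""
    (value,) = struct.unpack("<d", data)
    return CANONICAL_NAN if math.isnan(value) else value

raw = struct.pack("<d", float("nan"))
a = unpack_double(raw)
b = unpack_double(raw)

# a and b are the same object, so the dict's identity shortcut finds the key
# even though a == b is False.
table = {a: "missing"}
found = table[b]
```

Because both lookups go through the same object, containers that test identity before equality treat the decoded NaNs as one key.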

Let's take a step back.

No, let's take a step forward.

And let's take a step back again.

(And we're building a military-grade laser!)

Why *should* all NaNs be equal to each other? You said on the other
list that NaN==NaN was equivalent to (2+2)==(1+3), but that assumes
that NaN is a single "thing". It's really describing the whole huge
area of "stuff that just ain't numbers". Imagine if (x + y) wasn't 4,
but was "table". And (a + b) turned out to be "cyan". Does table equal
cyan, just because neither of them is a number? Certainly not. Neither
should (inf - inf) be equal to (inf / inf). Both of those expressions
evaluate to something that can't possibly be a number - it can't be
anywhere on the number line, it can't be anywhere on the complex
plane, it simply isn't a number. And they're not the same non-numeric
"thing".
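
A small sketch of the point about (inf - inf) and (inf / inf), using only the standard library. Both expressions produce a NaN, and neither compares equal to the other or to itself:

```python
import math

inf = float("inf")
a = inf - inf   # NaN: the difference of two infinities is undefined
b = inf / inf   # NaN: so is their quotient

both_nan = math.isnan(a) and math.isnan(b)
equal_to_each_other = (a == b)   # False
equal_to_itself = (a == a)       # False, even for the very same object
```

Using `math.isnan` is the reliable way to ask "is this a NaN?", since `==` never answers yes.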

For hash keys, float object identity will successfully look them up:
>>> d={}
>>> d[float("nan")]=1
>>> d[float("nan")]=2
>>> x=float("nan")
>>> d[x]=3
>>> d[x]
3
>>> d
{nan: 1, nan: 2, nan: 3}

So I'm not sure where the problems come from. You can iterate over a
dict's keys and look things up with them:

>>> for k,v in d.items():
    print(k,v,d[k])
nan 1 1
nan 2 2
nan 3 3

You can do a simple 'is' check as well as your equality check:

if x is y or x == y:
    print("They're the same")

But any time you compare floats for equality, you *already* have to
understand what you're doing (because of rounding and such), so I
don't see why the special case on NaN is significant, unless as
mentioned above, you want all NaNs to be equal, which doesn't make
sense.
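
As a rough sketch of both points, the identity-or-equality check can be wrapped in a helper (the name `same` is made up for this example), and the rounding caveat shows up with ordinary floats, not just NaN:

```python
import math

def same(x, y):
    """Identity first, then equality: the check suggested above."""
    return x is y or x == y

nan1 = float("nan")
nan2 = float("nan")

same_object = same(nan1, nan1)         # True: identity short-circuits
different_objects = same(nan1, nan2)   # False: two distinct NaNs

# Rounding already makes naive float equality unreliable.
exact = (0.1 + 0.2 == 0.3)             # False
close = math.isclose(0.1 + 0.2, 0.3)   # True
```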

ChrisA

#74180

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2014-07-08 19:16 +0300
Message-ID: <878uo3akqy.fsf@elektro.pacujo.net>
In reply to: #74171

Chris Angelico <rosuav@gmail.com>:

> Why *should* all NaNs be equal to each other?

I appreciate why you can't say NaN is equal to NaN. However, shouldn't
the very comparison attempt trigger an arithmetic exception? After all,
so does a division by zero.


Marko

#74182

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-07-09 02:27 +1000
Message-ID: <mailman.11640.1404836832.18130.python-list@python.org>
In reply to: #74180

On Wed, Jul 9, 2014 at 2:16 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Chris Angelico <rosuav@gmail.com>:
>
>> Why *should* all NaNs be equal to each other?
>
> I appreciate why you can't say NaN is equal to NaN. However, shouldn't
> the very comparison attempt trigger an arithmetic exception? After all,
> so does a division by zero.

I'd say it would surprise people rather a lot if operations like dict
insertion/lookup could trigger arithmetic exceptions. :)

ChrisA

#74189

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2014-07-08 20:31 +0300
Message-ID: <871ttvahaq.fsf@elektro.pacujo.net>
In reply to: #74182

Chris Angelico <rosuav@gmail.com>:

> I'd say it would surprise people rather a lot if operations like dict
> insertion/lookup could trigger arithmetic exceptions. :)

That wouldn't trigger exceptions.

Dict operations do an "is" test before an "==" test. In fact, you
couldn't even use NaN as a dict key otherwise. Thus, dict operations
never test NaN == NaN.


Marko

#74192

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-07-09 03:54 +1000
Message-ID: <mailman.11646.1404842102.18130.python-list@python.org>
In reply to: #74189

On Wed, Jul 9, 2014 at 3:31 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Chris Angelico <rosuav@gmail.com>:
>
>> I'd say it would surprise people rather a lot if operations like dict
>> insertion/lookup could trigger arithmetic exceptions. :)
>
> That wouldn't trigger exceptions.
>
> Dict operations do an "is" test before an "==" test. In fact, you
> couldn't even use NaN as a dict key otherwise. Thus, dict operations
> never test NaN == NaN.

Check out the example I posted early in this thread of a dict with
three keys, all of them NaN. And note that hash(float("nan"))==0. Now
try looking up d[0]. Before it raises KeyError, it has to compare that
0 for equality with each of the nans, because it can't shortcut it
based on the hash. In fact, I can prove it thus:

>>> class X:
    def __eq__(self, other):
        if self is other:
            print("Comparing against self - I am me!")
            return True
        print("Comparing against",other,"-",id(other))
        return False
    def __hash__(self):
        return 0

>>> d[X()]
Comparing against nan - 18777952
Comparing against nan - 19624864
Comparing against nan - 18776272
Traceback (most recent call last):
  File "<pyshell#20>", line 1, in <module>
    d[X()]
KeyError: <__main__.X object at 0x016B40D0>

Any lookup of anything with a hash of 0 will do this. 0 itself (as any
type of number), another NaN, or anything at all. For the dict to work
sanely, these comparisons have to work and be False.
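
The hash of a NaN depends on the interpreter version (Python 3.10 and later derive it from the object's identity instead of returning 0), so the sketch below copies the stored key's hash rather than hard-coding 0. The `Probe` class is hypothetical and exists only to record what the dict compares it against:

```python
class Probe:
    """Hypothetical key that records every object it is compared with."""

    def __init__(self, hash_value):
        self._hash = hash_value
        self.seen = []

    def __eq__(self, other):
        self.seen.append(other)
        return False   # never equal, so the lookup must fail

    def __hash__(self):
        return self._hash

nan_key = float("nan")
d = {nan_key: "stored"}

# Same hash as the stored NaN, so the dict cannot rule it out by hash alone.
probe = Probe(hash(nan_key))
try:
    d[probe]
    found = True
except KeyError:
    found = False
```

After the failed lookup, `probe.seen` holds the stored NaN: the dict had to run an equality comparison against it before it could raise KeyError.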

ChrisA

#74193

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-07-09 03:57 +1000
Message-ID: <mailman.11647.1404842254.18130.python-list@python.org>
In reply to: #74189

On Wed, Jul 9, 2014 at 3:54 AM, Chris Angelico <rosuav@gmail.com> wrote:
> >>> class X:
>     def __eq__(self, other):
>         if self is other:
>             print("Comparing against self - I am me!")
>             return True
>         print("Comparing against",other,"-",id(other))
>         return False
>     def __hash__(self):
>         return 0
>
> >>> d[X()]
> Comparing against nan - 18777952
> Comparing against nan - 19624864
> Comparing against nan - 18776272
> Traceback (most recent call last):
>   File "<pyshell#20>", line 1, in <module>
>     d[X()]
> KeyError: <__main__.X object at 0x016B40D0>

Better example: Subclass float, then it can actually *be* a nan.

>>> class NoisyFloat(float):
    def __eq__(self, other):
        print("Comparing",id(self),"against",id(other))
        return super().__eq__(other)
    def __hash__(self):
        return super().__hash__()

>>> d[NoisyFloat("nan")]
Comparing 23777152 against 18777952
Comparing 23777152 against 19624864
Comparing 23777152 against 18776272
Traceback (most recent call last):
  File "<pyshell#35>", line 1, in <module>
    d[NoisyFloat("nan")]
KeyError: nan

That's comparing nan==nan three times with four different nans.

ChrisA

#74196

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-07-08 18:12 +0000
Message-ID: <53bc3496$0$29995$c3e8da3$5496439d@news.astraweb.com>
In reply to: #74189

On Tue, 08 Jul 2014 20:31:25 +0300, Marko Rauhamaa wrote:

> Thus, dict operations never test NaN == NaN

You're assuming that there is only one NAN instance. That is not correct:

py> a = float('nan')
py> b = float('nan')
py> a is b
False
py> a in {b: None}
False



-- 
Steven

#74194

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-07-08 18:10 +0000
Message-ID: <53bc3412$0$29995$c3e8da3$5496439d@news.astraweb.com>
In reply to: #74180

On Tue, 08 Jul 2014 19:16:53 +0300, Marko Rauhamaa wrote:

> Chris Angelico <rosuav@gmail.com>:
> 
>> Why *should* all NaNs be equal to each other?
> 
> I appreciate why you can't say NaN is equal to NaN. However, shouldn't
> the very comparison attempt trigger an arithmetic exception? 

No. Why should it? It's not an error to check whether two things are 
equal.


> After all, so does a division by zero.

Um, yes, and multiplying by zero isn't an error. In what way is x == y 
related to x/0 ?
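
For reference, a minimal sketch of how the two operations behave with Python's built-in float: the division raises, while the comparison quietly returns False.

```python
nan = float("nan")

try:
    1.0 / 0.0
    division_raised = False
except ZeroDivisionError:
    division_raised = True

comparison_result = (nan == nan)   # False, and no exception
```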


But having said that, sometimes it is useful to stop processing as soon 
as you reach a NAN. For that, IEEE-754 defines two kinds of NANs, Quiet 
NANs and Signalling NANs. Quiet NANs don't trigger a signal when you 
perform operations on them. (By default -- I believe you can enable 
signals if you wish.) Signalling NANs always trigger a signal, including 
when you check them for equality:


py> from decimal import Decimal as D
py> a = D('nan')
py> b = D('snan')
py> 1 == a
False
py> 1 == b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
decimal.InvalidOperation: [<class 'decimal.InvalidOperation'>]
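
The `decimal` module also lets a context record the signal as a flag instead of raising. The sketch below is only an illustration of that switch: it turns off the InvalidOperation trap inside a local context and checks which comparisons set the flag.

```python
from decimal import Decimal, InvalidOperation, localcontext

snan = Decimal("sNaN")
qnan = Decimal("NaN")

with localcontext() as ctx:
    ctx.traps[InvalidOperation] = False   # record the signal instead of raising
    ctx.clear_flags()

    quiet_result = (qnan == 1)
    quiet_flagged = bool(ctx.flags[InvalidOperation])

    signalling_result = (snan == 1)
    signalling_flagged = bool(ctx.flags[InvalidOperation])
```

Both comparisons return False here, but only the signalling NaN sets the InvalidOperation flag; with the trap enabled, that same signal is what surfaces as the exception in the traceback above.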


But by default, NANs are quiet. The C99 standard doesn't support 
signalling NANs, and Java actually prohibits them.

Aside: The influence of C and Java has crippled IEEE-754 support across 
almost all languages and hardware. It's a crying shame the pernicious 
influence those two languages have had.

http://grouper.ieee.org/groups/1788/email/pdfmPSi1DgZZf.pdf

Floating point is *hard*, and people who don't understand it insist on 
telling those who do that "you don't need that feature" :-(



-- 
Steven
