Groups > comp.lang.python > #6498 > unrolled thread
| Started by | MRAB <python@mrabarnett.plus.com> |
|---|---|
| First post | 2011-05-29 00:41 +0100 |
| Last post | 2011-05-29 21:49 -0700 |
| Articles | 20 on this page of 37 — 12 participants |
float("nan") in set or as key MRAB <python@mrabarnett.plus.com> - 2011-05-29 00:41 +0100
Re: float("nan") in set or as key Erik Max Francis <max@alcyone.com> - 2011-05-28 17:16 -0700
Re: float("nan") in set or as key Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-05-29 00:26 +0000
Re: float("nan") in set or as key Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2011-05-29 13:04 +1200
Re: float("nan") in set or as key John Nagle <nagle@animats.com> - 2011-05-28 23:12 -0700
Re: float("nan") in set or as key Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-05-29 10:29 +0000
Re: float("nan") in set or as key Nobody <nobody@nowhere.com> - 2011-05-29 22:19 +0100
Re: float("nan") in set or as key Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-05-29 23:31 +0000
Re: float("nan") in set or as key Nobody <nobody@nowhere.com> - 2011-06-01 21:41 +0100
Re: float("nan") in set or as key Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-02 09:54 +0000
Re: float("nan") in set or as key Grant Edwards <invalid@invalid.invalid> - 2011-06-02 13:05 +0000
Re: float("nan") in set or as key Robert Kern <robert.kern@gmail.com> - 2011-06-02 12:04 -0500
Re: float("nan") in set or as key Nobody <nobody@nowhere.com> - 2011-06-02 21:47 +0100
Re: float("nan") in set or as key Grant Edwards <invalid@invalid.invalid> - 2011-06-03 14:52 +0000
Re: float("nan") in set or as key Chris Torek <nospam@torek.net> - 2011-06-03 17:52 +0000
Re: float("nan") in set or as key Grant Edwards <invalid@invalid.invalid> - 2011-06-06 13:54 +0000
Re: float("nan") in set or as key Nobody <nobody@nowhere.com> - 2011-06-04 00:29 +0100
Re: float("nan") in set or as key Chris Angelico <rosuav@gmail.com> - 2011-06-04 09:51 +1000
Re: float("nan") in set or as key rusi <rustompmody@gmail.com> - 2011-06-04 00:52 -0700
Re: float("nan") in set or as key Nobody <nobody@nowhere.com> - 2011-06-04 20:29 +0100
Re: float("nan") in set or as key Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-05 07:21 +0000
Re: float("nan") in set or as key Nobody <nobody@nowhere.com> - 2011-06-05 19:15 +0100
Re: float("nan") in set or as key Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-06 00:55 +0000
Re: float("nan") in set or as key Nobody <nobody@nowhere.com> - 2011-06-06 23:14 +0100
Re: float("nan") in set or as key Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-06 23:44 +0000
Re: float("nan") in set or as key Chris Angelico <rosuav@gmail.com> - 2011-06-07 11:00 +1000
Re: float("nan") in set or as key Grant Edwards <invalid@invalid.invalid> - 2011-06-06 14:03 +0000
Re: float("nan") in set or as key Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2011-06-03 11:17 +1200
Re: float("nan") in set or as key Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-03 04:23 +0000
Re: float("nan") in set or as key Chris Angelico <rosuav@gmail.com> - 2011-06-03 14:35 +1000
Re: float("nan") in set or as key Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-03 05:59 +0000
Re: float("nan") in set or as key Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2011-06-04 12:14 +1200
Re: float("nan") in set or as key Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-04 02:21 +0000
Re: float("nan") in set or as key Erik Max Francis <max@alcyone.com> - 2011-06-05 00:27 -0700
Re: float("nan") in set or as key Grant Edwards <invalid@invalid.invalid> - 2011-06-01 21:01 +0000
Re: float("nan") in set or as key Chris Torek <nospam@torek.net> - 2011-05-30 00:02 +0000
Re: float("nan") in set or as key Raymond Hettinger <python@rcn.com> - 2011-05-29 21:49 -0700
| From | MRAB <python@mrabarnett.plus.com> |
|---|---|
| Date | 2011-05-29 00:41 +0100 |
| Subject | float("nan") in set or as key |
| Message-ID | <mailman.2206.1306626083.9059.python-list@python.org> |
Here's a curiosity. float("nan") can occur multiple times in a set or as
a key in a dict:
>>> {float("nan"), float("nan")}
{nan, nan}
except that sometimes it can't:
>>> nan = float("nan")
>>> {nan, nan}
{nan}
| From | Erik Max Francis <max@alcyone.com> |
|---|---|
| Date | 2011-05-28 17:16 -0700 |
| Message-ID | <O6GdnSCKr8XvDXzQnZ2dnUVZ5r6dnZ2d@giganews.com> |
| In reply to | #6498 |
MRAB wrote:
> Here's a curiosity. float("nan") can occur multiple times in a set or as
> a key in a dict:
>
> >>> {float("nan"), float("nan")}
> {nan, nan}
>
> except that sometimes it can't:
>
> >>> nan = float("nan")
> >>> {nan, nan}
> {nan}
It's fundamentally because NaN is not equal to itself, by design.
Dictionaries and sets rely on equality to test for uniqueness of keys or
elements.
>>> nan = float("nan")
>>> nan == nan
False
In short, don't do that.
--
Erik Max Francis && max@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 18 N 121 57 W && AIM/Y!M/Skype erikmaxfrancis
There was never a good war or a bad peace.
-- Benjamin Franklin, 1706-1790
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2011-05-29 00:26 +0000 |
| Message-ID | <4de1929f$0$29996$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #6498 |
On Sun, 29 May 2011 00:41:16 +0100, MRAB wrote:
> Here's a curiosity. float("nan") can occur multiple times in a set or as
> a key in a dict:
>
> >>> {float("nan"), float("nan")}
> {nan, nan}
That's an implementation detail. Python is free to reuse the same object
when you create an immutable object twice on the same line, but in this
case doesn't. (I don't actually know if it ever does, but it could.)
And since NAN != NAN always, you can get two NANs in the one set, since
they're unequal.
> when you write float('nan')
>
> except that sometimes it can't:
>
> >>> nan = float("nan")
> >>> {nan, nan}
> {nan}
But in this case, you try to put the same NAN in the set twice. Since
sets optimize element testing by checking for identity before equality,
the NAN only goes in once.
--
Steven
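The identity-before-equality shortcut described in this reply can be checked directly. A minimal sketch, assuming CPython 3; the variable names are illustrative only:

```python
# One NaN object inserted twice: the set finds it by identity and keeps one copy.
nan = float("nan")
same_object = {nan, nan}

# Two separate NaN objects: not identical and not equal, so both are kept.
separate_objects = {float("nan"), float("nan")}

print(len(same_object), len(separate_objects))
```

The second set has two members only because each float("nan") call builds a new object, which is the implementation detail the reply refers to.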
| From | Gregory Ewing <greg.ewing@canterbury.ac.nz> |
|---|---|
| Date | 2011-05-29 13:04 +1200 |
| Message-ID | <94dkd3F7k4U1@mid.individual.net> |
| In reply to | #6498 |
MRAB wrote:
> float("nan") can occur multiple times in a set or as
> a key in a dict:
>
> >>> {float("nan"), float("nan")}
> {nan, nan}
>
> except that sometimes it can't:
>
> >>> nan = float("nan")
> >>> {nan, nan}
> {nan}
NaNs are weird. They're not equal to themselves:
Python 2.7 (r27:82500, Oct 15 2010, 21:14:33)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> nan = float("nan")
>>> nan == nan
False
This confuses the daylights out of Python's dict lookup machinery,
which assumes that two references to the same object can't possibly
compare unequal, so it doesn't bother calling __eq__ on them.
--
Greg
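The claim that dict lookup skips __eq__ for the same object can be observed with a counter. This is only a sketch, assuming CPython; NeverEqual is a hypothetical class invented for the illustration, standing in for a NaN-like key:

```python
class NeverEqual:
    """Hypothetical key type whose __eq__ always answers False, like a NaN."""
    eq_calls = 0

    def __eq__(self, other):
        NeverEqual.eq_calls += 1
        return False

    def __hash__(self):
        return 1  # every instance hashes alike, so lookups reach the key comparison


key = NeverEqual()
table = {key: "found"}

# Same object: matched by identity, so __eq__ is never consulted.
value = table[key]
calls_after_identity_hit = NeverEqual.eq_calls

# Different object with the same hash: __eq__ is called and answers False.
missing = NeverEqual() in table
calls_after_miss = NeverEqual.eq_calls
```

The lookup with the original key succeeds even though the key claims to be unequal to everything, itself included.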
| From | John Nagle <nagle@animats.com> |
|---|---|
| Date | 2011-05-28 23:12 -0700 |
| Message-ID | <4de1e3e7$0$2195$742ec2ed@news.sonic.net> |
| In reply to | #6506 |
On 5/28/2011 6:04 PM, Gregory Ewing wrote:
> MRAB wrote:
>> float("nan") can occur multiple times in a set or as a key in a dict:
>>
>> >>> {float("nan"), float("nan")}
>> {nan, nan}
>>
>> except that sometimes it can't:
>>
>> >>> nan = float("nan")
>> >>> {nan, nan}
>> {nan}
>
> NaNs are weird. They're not equal to themselves:
>
> Python 2.7 (r27:82500, Oct 15 2010, 21:14:33)
> [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> nan = float("nan")
> >>> nan == nan
> False
>
> This confuses the daylights out of Python's dict lookup machinery,
> which assumes that two references to the same object can't possibly
> compare unequal, so it doesn't bother calling __eq__ on them.
Right.
The correct answer to "nan == nan" is to raise an exception, because
you have asked a question for which the answer is neither True nor False.
The correct semantics for IEEE floating point look something like
this:
    1/0           INF
    INF + 1       INF
    INF - INF     NaN
    INF == INF    unordered
    NaN == NaN    unordered
INF and NaN both have comparison semantics which return
"unordered". The FPU sets a bit for this, which most language
implementations ignore. But you can turn on floating point
exception traps, and on x86 machines, they're exact - the
exception will occur exactly at the instruction which
triggered the error. In superscalar CPUs, a sizable part of
the CPU handles the unwinding necessary to do that. x86 does
it, because it's carefully emulating non-superscalar machines.
Most RISC machines don't bother.
Python should raise an exception on unordered comparisons.
Given that the language handles integer overflow by going to
arbitrary-precision integers, checking the FPU status bits is
cheap.
The advantage of raising an exception is that the logical operations
still work. For example,
not (a == b)
a != b
will always return the same results if exceptions are raised for
unordered comparison results. Also, exactly one of
a == b
a < b
a > b
is always true - something sorts tend to assume.
If you get an unordered comparison exception, your program
almost certainly was getting wrong answers.
(I used to do dynamics simulation engines, where this mattered.)
John Nagle
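What the proposal in this post might look like in pure Python can be sketched with a float subclass. Everything named here (StrictFloat, UnorderedComparisonError, compare) is hypothetical and exists only for the illustration; it does not touch FPU status bits and is not how Python itself behaves:

```python
import math


class UnorderedComparisonError(ArithmeticError):
    """Hypothetical exception raised when a comparison involves a NaN."""


def _is_nan(value):
    return isinstance(value, float) and math.isnan(value)


def _checked(name):
    plain = getattr(float, name)

    def method(self, other):
        # Refuse to answer instead of returning False for an unordered pair.
        if _is_nan(self) or _is_nan(other):
            raise UnorderedComparisonError("comparison involving NaN")
        return plain(self, other)

    method.__name__ = name
    return method


class StrictFloat(float):
    """Hypothetical float subclass that refuses to compare NaNs."""
    __hash__ = float.__hash__


for _name in ("__eq__", "__ne__", "__lt__", "__le__", "__gt__", "__ge__"):
    setattr(StrictFloat, _name, _checked(_name))


def compare(a, b):
    """Return the result of a == b, or the exception name if it was unordered."""
    try:
        return StrictFloat(a) == b
    except UnorderedComparisonError as exc:
        return type(exc).__name__
```

With this class, "not (a == b)" and "a != b" cannot disagree on a NaN, because neither produces a result at all.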
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2011-05-29 10:29 +0000 |
| Message-ID | <4de22007$0$29996$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #6514 |
On Sat, 28 May 2011 23:12:54 -0700, John Nagle wrote:
> The correct answer to "nan == nan" is to raise an exception, because
> you have asked a question for which the answer is neither True nor False.
Wrong.
The correct answer to "nan == nan" is False, they are not equal. Just as
None != "none", and 42 != [42], or a teacup is not equal to a box of
hammers.
Asking whether NAN < 0 could arguably either return "unordered" (raise an
exception) or return False ("no, NAN is not less than zero; neither is it
greater than zero"). The PowerPC Macintoshes back in the 1990s supported
both behaviours. But that's different to equality tests.
> The correct semantics for IEEE floating point look something like
> this:
>
> 1/0 INF
> INF + 1 INF
> INF - INF NaN
> INF == INF unordered
Wrong. Equality is not an order comparison.
--
Steven
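For reference, this is what Python floats actually do with the comparisons under dispute. A short sketch, assuming CPython on IEEE-754 hardware; the variable names are illustrative:

```python
nan = float("nan")

# Equality tests on floats always give an answer; they never raise.
equal = nan == nan
not_equal = nan != nan

# Order comparisons involving a NaN all come back False.
ordered = [nan < 0.0, nan > 0.0, nan <= nan, nan >= nan]

# So none of "a == b", "a < b", "a > b" holds when a NaN is involved.
holds = sum([nan == 1.0, nan < 1.0, nan > 1.0])

# The invariant (x != y) == not (x == y) still holds for NaN.
invariant = (nan != nan) == (not (nan == nan))
```

Python takes the "return False" option for order comparisons rather than raising.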
| From | Nobody <nobody@nowhere.com> |
|---|---|
| Date | 2011-05-29 22:19 +0100 |
| Message-ID | <pan.2011.05.29.21.19.12.375000@nowhere.com> |
| In reply to | #6526 |
On Sun, 29 May 2011 10:29:28 +0000, Steven D'Aprano wrote:
>> The correct answer to "nan == nan" is to raise an exception, because
>> you have asked a question for which the answer is neither True nor False.
>
> Wrong.
That's overstating it. There's a good argument to be made for raising an
exception. Bear in mind that an exception is not necessarily an error,
just an "exceptional" condition.
> The correct answer to "nan == nan" is False, they are not equal.
There is no correct answer to "nan == nan". Defining it to be false is just
the "least wrong" answer. Arguably, "nan != nan" should also be false,
but that would violate the invariant "(x != y) == !(x == y)".
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2011-05-29 23:31 +0000 |
| Message-ID | <4de2d746$0$29996$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #6578 |
On Sun, 29 May 2011 22:19:49 +0100, Nobody wrote:
> On Sun, 29 May 2011 10:29:28 +0000, Steven D'Aprano wrote:
>
>>> The correct answer to "nan == nan" is to raise an exception,
>>> because
>>> you have asked a question for which the answer is neither True nor
>>> False.
>>
>> Wrong.
>
> That's overstating it. There's a good argument to be made for raising an
> exception.
If so, I've never heard it, and I cannot imagine what such a good
argument would be. Please give it.
(I can think of *bad* arguments, like "NANs confuse me and I don't
understand the reason for their existence, therefore I'll give them
behaviours that make no sense and aren't useful". But you did state there
is a *good* argument.)
> Bear in mind that an exception is not necessarily an error,
> just an "exceptional" condition.
True, but what's your point? Testing two floats for equality is not an
exceptional condition.
>> The correct answer to "nan == nan" is False, they are not equal.
>
> There is no correct answer to "nan == nan".
Why on earth not?
> Defining it to be false is just the "least wrong" answer.
So you say, but I think you are incorrect.
> Arguably, "nan != nan" should also be false,
> but that would violate the invariant "(x != y) == !(x == y)".
I cannot imagine what that argument would be. Please explain.
--
Steven
| From | Nobody <nobody@nowhere.com> |
|---|---|
| Date | 2011-06-01 21:41 +0100 |
| Message-ID | <pan.2011.06.01.20.40.18.453000@nowhere.com> |
| In reply to | #6595 |
On Sun, 29 May 2011 23:31:19 +0000, Steven D'Aprano wrote:
>> That's overstating it. There's a good argument to be made for raising an
>> exception.
>
> If so, I've never heard it, and I cannot imagine what such a good
> argument would be. Please give it.
Exceptions allow you to write more natural code by ignoring the awkward
cases. E.g. writing "x * y + z" rather than first determining whether "x
* y" is even defined then using a conditional.
>> Bear in mind that an exception is not necessarily an error,
>> just an "exceptional" condition.
>
> True, but what's your point? Testing two floats for equality is not an
> exceptional condition.
NaN itself is an exceptional condition which arises when a result is
undefined or not representable. When an operation normally returns a
number but a specific case cannot do so, it returns not-a-number.
The usual semantics for NaNs are practically identical to those for
exceptions. If any intermediate result in a floating-point expression is
NaN, the overall result is NaN. Similarly, if any intermediate
calculation throws an exception, the calculation as a whole throws an
exception.
If x is NaN, then "x + y" is NaN, "x * y" is NaN, pretty much anything
involving x is NaN. By this reasoning both "x == y" and "x != y" should
also be NaN. But only the floating-point types have a NaN value, while
bool doesn't. However, all types have exceptions.
>>> The correct answer to "nan == nan" is False, they are not equal.
>>
>> There is no correct answer to "nan == nan".
>
> Why on earth not?
Why should there be a correct answer? What does NaN actually mean?
Apart from anything else, defining "NaN == NaN" as False means that "x
== x" is False if x is NaN, which violates one of the fundamental axioms
of an equivalence relation (and, in every other regard, "==" is normally
intended to be an equivalence relation).
The creation of NaN was a pragmatic decision on how to handle
exceptional conditions in hardware. It is not holy writ, and there's no
fundamental reason why a high-level language should export the
hardware's behaviour verbatim.
>> Arguably, "nan != nan" should also be false,
>> but that would violate the invariant "(x != y) == !(x == y)".
>
> I cannot imagine what that argument would be. Please explain.
A result of NaN means that the result of the calculation is undefined,
so the value is "unknown". If x is unknown and y is unknown, then
whether x is equal to y is itself unknown, and whether x differs from y
is also unknown.
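The propagation described in this post, and the point where it stops, can be seen in a few lines. A minimal sketch, assuming CPython 3; the particular expressions are arbitrary examples:

```python
import math

nan = float("nan")
x, y = nan, 2.0

# Arithmetic carries the NaN forward instead of stopping the calculation.
results = [x + y, x * y, x - x, math.sqrt(x), (x * y + 1.0) / 3.0]
all_nan = all(math.isnan(r) for r in results)

# Comparisons do not propagate it: they produce ordinary bools.
comparisons = (x == y, x != y)
```

The comparison results are plain True and False, since bool has no NaN-like value to carry the marker any further.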
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2011-06-02 09:54 +0000 |
| Message-ID | <4de75dd5$0$29983$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #6824 |
On Wed, 01 Jun 2011 21:41:06 +0100, Nobody wrote:
> On Sun, 29 May 2011 23:31:19 +0000, Steven D'Aprano wrote:
>
>>> That's overstating it. There's a good argument to be made for raising
>>> an exception.
>>
>> If so, I've never heard it, and I cannot imagine what such a good
>> argument would be. Please give it.
>
> Exceptions allow you to write more natural code by ignoring the awkward
> cases. E.g. writing "x * y + z" rather than first determining whether "x
> * y" is even defined then using a conditional.
You've quoted me out of context. I wasn't asking for justification for
exceptions in general. There's no doubt that they're useful. We were
specifically talking about NAN == NAN raising an exception rather than
returning False.
>>> Bear in mind that an exception is not necessarily an error, just an
>>> "exceptional" condition.
>>
>> True, but what's your point? Testing two floats for equality is not an
>> exceptional condition.
>
> NaN itself is an exceptional condition which arises when a result is
> undefined or not representable. When an operation normally returns a
> number but a specific case cannot do so, it returns not-a-number.
I'm not sure what "not representable" is supposed to mean, but if by
"undefined" you mean "invalid", then correct.
> The usual semantics for NaNs are practically identical to those for
> exceptions. If any intermediate result in a floating-point expression is
> NaN, the overall result is NaN.
Not necessarily. William Kahan gives an example where passing a NAN to
hypot can justifiably return INF instead of NAN. While it's certainly
true that *mostly* any intermediate NAN results in a NAN, that's not a
guarantee or requirement of the standard. A function is allowed to
convert NANs back to non-NANs, if it is appropriate for that function.
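Python's math module follows the same reasoning in a couple of documented cases. A small sketch, assuming CPython 3 with C99-style special-value handling:

```python
import math

inf = float("inf")
nan = float("nan")

# The hypotenuse is infinite whatever the other side turns out to be.
side = math.hypot(inf, nan)

# Likewise, anything raised to the power zero is 1.0, even a NaN.
power = math.pow(nan, 0.0)
```

In both calls a NaN goes in and an ordinary number comes out, because the answer does not depend on the missing value.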
Another example is the Kronecker delta:
    def kronecker(x, y):
        if x == y: return 1
        return 0
This will correctly consume NAN arguments. If either x or y is a NAN, it
will return 0.
(As an aside, this demonstrates that having NAN != any NAN, including
itself, is useful, as kronecker(x, x) will return 0 if x is a NAN.)
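Run as written, the function behaves as described. A sketch, assuming CPython 3; the docstring and the list of sample calls are additions for the illustration:

```python
def kronecker(x, y):
    """Kronecker delta: 1 when the arguments are equal, otherwise 0."""
    if x == y:
        return 1
    return 0


nan = float("nan")
values = [
    kronecker(2.0, 2.0),   # equal numbers
    kronecker(2.0, 3.0),   # different numbers
    kronecker(nan, 2.0),   # one NaN argument
    kronecker(nan, nan),   # the same NaN object twice
]
```

Only the first call gives 1; every call involving a NaN gives 0, the same-object case included.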
> Similarly, if any intermediate
> calculation throws an exception, the calculation as a whole throws an
> exception.
This is certainly true... the exception cannot look into the future and
see that it isn't needed because a later calculation cancels it out.
Exceptions, or hardware traps, stop the calculation. NANs allow the
calculation to proceed. Both behaviours are useful, and the standard
allows for both.
> If x is NaN, then "x + y" is NaN, "x * y" is NaN, pretty much anything
> involving x is NaN. By this reasoning both "x == y" and "x != y" should
> also be NaN.
NAN is a sentinel for an invalid operation. NAN + NAN returns a NAN
because it is an invalid operation, not because NANs are magical goop
that spoil everything they touch.
For example, print(NAN) does not return a NAN or raise an exception, nor
is there any need for it to. Slightly more esoteric: the signbit and
copysign functions both accept NANs without necessarily returning NANs.
Equality comparison is another such function. There's no need for
NAN == NAN to fail, because the equality operation is perfectly well
defined for NANs.
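The copysign and print examples can be tried directly. A minimal sketch, assuming CPython 3; the sign of a NaN is platform-dependent, so only the magnitude is examined:

```python
import math

nan = float("nan")

# copysign only reads the sign bit of its second argument, so the result is +1.0 or -1.0.
magnitude = abs(math.copysign(1.0, nan))

# Formatting a NaN is also an ordinary operation.
text = str(nan)
```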
> But only the floating-point types have a NaN value, while
> bool doesn't. However, all types have exceptions.
What relevance does bool have?
>>>> The correct answer to "nan == nan" is False, they are not equal.
>>>
>>> There is no correct answer to "nan == nan".
>>
>> Why on earth not?
>
> Why should there be a correct answer? What does NaN actually mean?
NAN means "this is a sentinel marking that an invalid calculation was
attempted". For the purposes of numeric calculation, it is often useful
to allow those sentinels to propagate through your calculation rather
than to halt the program, perhaps because you hope to find that the
invalid marker ends up not being needed and can be ignored, or because
you can't afford to halt the program.
Does INVALID == INVALID? There's no reason to think that the question
itself is an invalid operation. If you can cope with the question "Is an
apple equal to a puppy dog?" without shouting "CANNOT COMPUTE!!!" and
running down the street, there's no reason to treat NAN == NAN as
anything worse.
So what should NAN == NAN equal? Consider the answer to the apple and
puppy dog comparison. Chances are that anyone asked that will give you a
strange look and say "Of course not, you idiot". (In my experience, and
believe it or not I have actually tried this, some people will ask you to
define equality. But they're a distinct minority.)
If you consider "equal to" to mean "the same as", then the answer is
clear and obvious: apples do not equal puppies, and any INVALID sentinel
is not equal to any other INVALID. (Remember, NAN is not a value itself,
it's a sentinel representing the fact that you don't have a valid number.)
So NAN == NAN should return False, just like the standard states, and
NAN != NAN should return True. "No, of course not, they're not equal."
> Apart from anything else, defining "NaN == NaN" as False means that "x
> == x" is False if x is NaN, which violates one of the fundamental axioms
> of an equivalence relation (and, in every other regard, "==" is normally
> intended to be an equivalence relation).
Yes, that's a consequence of NAN behaviour. I can live with that.
> The creation of NaN was a pragmatic decision on how to handle
> exceptional conditions in hardware. It is not holy writ, and there's no
> fundamental reason why a high-level language should export the
> hardware's behaviour verbatim.
There is a good, solid reason: it's a *useful* standard that *works*,
proven in practice, invented by people who have forgotten more about
floating point than you or I will ever learn, and we dismiss their
conclusions at our peril.
A less good reason: it's a standard. Better to stick to a not-very-good
standard than to have the Wild West, where everyone chooses their own
behaviour. You have NAN == NAN raise ValueError, Fred has it return True,
George has it return False, Susan has it return a NAN, Michelle makes it
raise MathError, somebody else returns Maybe ...
But IEEE-754 is not just a "not-very-good" standard. It is an extremely
good standard.
>>> Arguably, "nan != nan" should also be false, but that would violate
>>> the invariant "(x != y) == !(x == y)".
>>
>> I cannot imagine what that argument would be. Please explain.
>
> A result of NaN means that the result of the calculation is undefined,
> so the value is "unknown".
Incorrect. NANs are not "unknowns", or missing values.
--
Steven
| From | Grant Edwards <invalid@invalid.invalid> |
|---|---|
| Date | 2011-06-02 13:05 +0000 |
| Message-ID | <is81ri$9rt$1@reader1.panix.com> |
| In reply to | #6848 |
On 2011-06-02, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> But IEEE-754 is not just a "not-very-good" standard. It is an
> extremely good standard.
I get the distinct impression that the people arguing that IEEE-754 is
somehow "wrong" about the value of 'NaN == NaN' are the people who
don't actually use floating point. Those of us that do use floating
point and depend on the predictable behavior of NaNs seem to be happy
enough with the standard.
Two of my perennial complaints about Python's handling of NaNs and
Infs:
1) They weren't handled by pickle et al.
2) The string representations produced by repr() and accepted by
float() weren't standardized across platforms.
I think the latter has finally been fixed, hasn't it?
--
Grant Edwards               grant.b.edwards        Yow! Remember, in 2039,
                                  at               MOUSSE & PASTA will
                              gmail.com            be available ONLY by
                                                   prescription!!
| From | Robert Kern <robert.kern@gmail.com> |
|---|---|
| Date | 2011-06-02 12:04 -0500 |
| Message-ID | <mailman.2392.1307034253.9059.python-list@python.org> |
| In reply to | #6861 |
On 6/2/11 8:05 AM, Grant Edwards wrote:
> Two of my perennial complaints about Python's handling of NaNs and
> Infs:
>
> 1) They weren't handled by pickle et al.
>
> 2) The string representations produced by repr() and accepted by
> float() weren't standardized across platforms.
>
> I think the latter has finally been fixed, hasn't it?
And the former!
Python 2.7.1 |EPD 7.0-2 (32-bit)| (r271:86832, Dec 3 2010, 15:41:32)
[GCC 4.0.1 (Apple Inc. build 5488)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> inf = 1e300*1e300
>>> nan = inf / inf
>>> import cPickle
>>> cPickle.loads(cPickle.dumps(nan))
nan
>>> cPickle.loads(cPickle.dumps(inf))
inf
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco
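The same check carries over to Python 3, where cPickle is gone and the module is simply pickle. A sketch, assuming CPython 3; it round-trips both special values through every protocol and also looks at the repr() spellings:

```python
import math
import pickle

inf = float("inf")
nan = float("nan")

# Round-trip both special values through every available pickle protocol.
round_trips = [
    (pickle.loads(pickle.dumps(nan, protocol)),
     pickle.loads(pickle.dumps(inf, protocol)))
    for protocol in range(pickle.HIGHEST_PROTOCOL + 1)
]

# repr() and float() agree on the spelling, so the text form round-trips too.
texts = (repr(nan), repr(inf), repr(-inf))
reparsed = [float(t) for t in texts]
```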
| From | Nobody <nobody@nowhere.com> |
|---|---|
| Date | 2011-06-02 21:47 +0100 |
| Message-ID | <pan.2011.06.02.20.47.01.344000@nowhere.com> |
| In reply to | #6848 |
On Thu, 02 Jun 2011 09:54:30 +0000, Steven D'Aprano wrote:
>> Exceptions allow you to write more natural code by ignoring the awkward
>> cases. E.g. writing "x * y + z" rather than first determining whether "x
>> * y" is even defined then using a conditional.
>
> You've quoted me out of context. I wasn't asking for justification for
> exceptions in general. There's no doubt that they're useful. We were
> specifically talking about NAN == NAN raising an exception rather than
> returning False.
It's arguable that NaN itself simply shouldn't exist in Python; if the FPU
ever generates a NaN, Python should raise an exception at that point.
But given that NaNs propagate in almost the same manner as exceptions,
you could "optimise" this by treating a NaN as a special-case
implementation of exceptions, and turn it into a real exception at the
point where you can no longer use a NaN (e.g. when using a comparison
operator).
This would produce the same end result as raising an exception
immediately, but would reduce the number of isnan() tests.
>> NaN itself is an exceptional condition which arises when a result is
>> undefined or not representable. When an operation normally returns a
>> number but a specific case cannot do so, it returns not-a-number.
>
> I'm not sure what "not representable" is supposed to mean,
Consider sqrt(-1). This is defined (as "i" aka "j"), but not representable
as a floating-point "real". Making root/log/trig/etc functions return
complex numbers when necessary would probably be inappropriate for a language
such as Python.
> but if by "undefined" you mean "invalid", then correct.
I mean undefined, in the sense that 0/0 is undefined (I note that Python
actually raises an exception for "0.0/0.0").
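Python is not consistent about which invalid operations raise and which return a NaN. A short sketch, assuming CPython 3; outcome is a hypothetical helper written for this illustration:

```python
import math


def outcome(operation):
    """Run a zero-argument callable; return its result or the exception name."""
    try:
        return operation()
    except (ZeroDivisionError, ValueError, OverflowError) as exc:
        return type(exc).__name__


inf = float("inf")
zero_over_zero = outcome(lambda: 0.0 / 0.0)       # raises rather than returning NaN
sqrt_minus_one = outcome(lambda: math.sqrt(-1))   # raises rather than returning NaN
inf_minus_inf = outcome(lambda: inf - inf)        # quietly returns NaN
```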
>> The usual semantics for NaNs are practically identical to those for
>> exceptions. If any intermediate result in a floating-point expression is
>> NaN, the overall result is NaN.
>
> Not necessarily. William Kahan gives an example where passing a NAN to
> hypot can justifiably return INF instead of NAN.
Hmm. Is that still true if the NaN signifies "not representable" (e.g.
known but complex) rather than undefined (e.g. unknown value but known to
be real)?
> While it's certainly
> true that *mostly* any intermediate NAN results in a NAN, that's not a
> guarantee or requirement of the standard. A function is allowed to
> convert NANs back to non-NANs, if it is appropriate for that function.
>
> Another example is the Kronecker delta:
>
>     def kronecker(x, y):
>         if x == y: return 1
>         return 0
>
> This will correctly consume NAN arguments. If either x or y is a NAN, it
> will return 0. (As an aside, this demonstrates that having NAN != any
> NAN, including itself, is useful, as kronecker(x, x) will return 0 if x
> is a NAN.)
How is this useful? On the contrary, I'd suggest that the fact that
kronecker(x, x) can return 0 is an argument against the "NaN != NaN" axiom.
A case where the semantics of exceptions differ from those of NaN is:
    def cond(t, x, y):
        if t:
            return x
        else:
            return y
as cond(True, x, nan()) will return x, while cond(True, x, raise()) will
raise an exception.
But this is a specific instance of a more general problem with strict
languages, i.e. strict functions violate referential transparency.
This is why even strict languages (i.e. almost everything except for a
handful of functional languages which value mathematical purity, e.g.
Haskell) have non-strict conditionals. If you remove the conditional from
the function and write it in-line, then:
    if True:
        return x
    else:
        raise()
behaves like NaN.
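The difference between the strict function and the in-line conditional can be made runnable. A sketch, assuming CPython 3; fail() is a hypothetical stand-in for the pseudocode "raise()" above:

```python
def cond(t, x, y):
    """Strict conditional: both x and y are evaluated before the call."""
    return x if t else y


def fail():
    raise ValueError("the untaken branch was evaluated")


nan = float("nan")

# A NaN in the untaken argument is harmless.
with_nan = cond(True, 1.0, nan)

# An exception in the untaken argument fires before cond() even runs.
try:
    with_exception = cond(True, 1.0, fail())
except ValueError:
    with_exception = "raised"

# The in-line conditional never evaluates the untaken branch.
inline = 1.0 if True else fail()
```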
Also, note that the "convenience" of NaN (e.g. not propagating from the
untaken branch of a conditional) is only available for floating-point
types. If it's such a good idea, why don't we have it for other types?
> Equality comparison is another such function. There's no need for
> NAN == NAN to fail, because the equality operation is perfectly well
> defined for NANs.
The definition is entirely arbitrary. You could just as easily define that
(NaN == NaN) is True. You could just as easily define that "1 + NaN" is 27.
Actually, "NaN == NaN" makes more sense than "NaN != NaN", as the former
upholds the equivalence axioms and is consistent with the normal behaviour
of "is" (i.e. "x is y" => "x == y", even if the converse isn't necessarily
true).
If you're going to argue that "NaN == NaN" should be False on the basis
that the values are sentinels for unrepresentable values (which may be
*different* unrepresentable values), it follows that "NaN != NaN" should
also be False for the same reason.
>> But only the floating-point types have a NaN value, while
>> bool doesn't. However, all types have exceptions.
>
> What relevance does bool have?
The result of comparisons is a bool.
>> Why should there be a correct answer? What does NaN actually mean?
>
> NAN means "this is a sentinel marking that an invalid calculation was
> attempted". For the purposes of numeric calculation, it is often useful
> to allow those sentinels to propagate through your calculation rather
> than to halt the program, perhaps because you hope to find that the
> invalid marker ends up not being needed and can be ignored, or because
> you can't afford to halt the program.
>
> Does INVALID == INVALID?
Either True or INVALID. You can make a reasonable argument for either.
Making a reasonable argument that it should be False is much harder.
> If you can cope with the question "Is an apple equal to a puppy dog?"
It depends upon your definition of equality, but it's not a particularly
hard question. And completely irrelevant here.
> So what should NAN == NAN equal? Consider the answer to the apple and
> puppy dog comparison. Chances are that anyone asked that will give you a
> strange look and say "Of course not, you idiot". (In my experience, and
> believe it or not I have actually tried this, some people will ask you to
> define equality. But they're a distinct minority.)
>
> If you consider "equal to" to mean "the same as", then the answer is
> clear and obvious: apples do not equal puppies,
This is "equality" as opposed to "equivalence", i.e. x and y are equal if
and only if f(x) and f(y) are equal for all f.
> and any INVALID sentinel is not equal to any other INVALID.
This does not follow. Unless you explicitly define the sentinel to be
unequal to itself, the strict equality definition holds, as NaN tends to
be a specific bit pattern (multiple bit patterns are interpreted as NaN,
but operations which result in a NaN will use a specific pattern, possibly
modulo the sign bit).
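The remark about bit patterns can be inspected with the struct module. A sketch, assuming IEEE-754 double precision; the helper names and the payload 0x7FF8000000000001 are chosen only for the illustration, and the exact pattern a platform produces for float("nan") is deliberately not asserted:

```python
import math
import struct

EXPONENT_MASK = 0x7FF0000000000000
FRACTION_MASK = 0x000FFFFFFFFFFFFF


def bits(value):
    """Return the 64-bit IEEE-754 pattern of a float as an integer."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]


def from_bits(pattern):
    """Build a float from a 64-bit integer pattern."""
    return struct.unpack(">d", struct.pack(">Q", pattern))[0]


def is_nan_pattern(pattern):
    # NaN: exponent bits all ones and a non-zero fraction.
    return (pattern & EXPONENT_MASK) == EXPONENT_MASK and (pattern & FRACTION_MASK) != 0


default_nan = bits(float("nan"))
other_nan = from_bits(0x7FF8000000000001)   # a different payload, still a NaN
```

Even two NaNs with identical bit patterns compare unequal under ==, so the inequality is part of the definition rather than a consequence of the representation.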
If you want to argue that "NaN == NaN" should be False, then do so. Simply
asserting that it should be False won't suffice (nor will citing the IEEE
FP standard *unless* you're arguing that "because the standard says so" is
the only reason required).
> (Remember, NAN is not a value itself, it's a sentinel representing the
> fact that you don't have a valid number.)
I'm aware of that.
> So NAN == NAN should return False,
Why?
> just like the standard states, and NAN != NAN should return True.
Why?
In both cases, the more obvious result should be some kind of sentinel
indicating that we don't have a valid boolean. Why should this sentinel
propagate through arithmetic operations but not through logical operations?
>> Apart from anything else, defining "NaN == NaN" as False means that "x
>> == x" is False if x is NaN, which violates one of the fundamental axioms
>> of an equivalence relation (and, in every other regard, "==" is normally
>> intended to be an equivalence relation).
>
> Yes, that's a consequence of NAN behaviour.
Another consequence:
>>> x = float("nan")
>>> x is x
True
>>> x == x
False
Ordinarily, you would consider this behaviour a bug in the class' __eq__
method.
> I can live with that.
I can *live* with it (not that I have much choice), but that doesn't mean
that it's correct or even anything short of downright stupid.
>> The creation of NaN was a pragmatic decision on how to handle
>> exceptional conditions in hardware. It is not holy writ, and there's no
>> fundamental reason why a high-level language should export the
>> hardware's behaviour verbatim.
>
> There is a good, solid reason: it's a *useful* standard
Debatable.
> that *works*,
Debatable.
> proven in practice,
If anything, it has proven to be a major nuisance. It takes a lot of
effort to create (or even specify) code which does the right thing in the
presence of NaNs.
Turning NaNs into exceptions at their source wouldn't make it
significantly harder to write correct code (there are a handful of cases
where the existing behaviour produces the right answer almost by accident,
far more where it doesn't), and would mean that "simple" code (where NaN
hasn't been explicitly considered) raises an exception rather than
silently producing a wrong answer.
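One way to picture "exceptions at the source" in today's Python is a wrapper that checks a result as soon as it is produced. This is only a sketch of the idea under discussion: `NaNError` and `checked` are hypothetical names, not anything Python provides.

```python
import math

class NaNError(ArithmeticError):
    """Hypothetical exception type used only for this illustration."""

def checked(value):
    """Return value unchanged, or raise if it is a NaN."""
    if math.isnan(value):
        raise NaNError("operation produced NaN")
    return value

inf = float("inf")

# An ordinary result passes straight through.
ok = checked(1.5 * 2.0)

# inf - inf is an invalid operation; the wrapper turns the NaN into an exception.
try:
    checked(inf - inf)
    raised = False
except NaNError as exc:
    raised = True
    print("raised:", exc)
```

"Simple" code that never considered NaN would then fail loudly at the first invalid operation, at the cost of a check after every operation.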
> invented by people who have forgotten more about
> floating point than you or I will ever learn, and we dismiss their
> conclusions at our peril.
I'm not aware that they made any conclusions about Python. I don't
consider any conclusions about the most appropriate behaviour for hardware
(which may have no choice beyond exactly /which/ bit pattern to put into a
register) to automatically determine what is the most appropriate
behaviour for a high-level language.
> A less good reason: it's a standard. Better to stick to a not-very-good
> standard than to have the Wild West, where everyone chooses their own
> behaviour. You have NAN == NAN raise ValueError, Fred has it return True,
> George has it return False, Susan has it return a NAN, Michelle makes it
> raise MathError, somebody else returns Maybe ...
This isn't an issue if you have the language deal with it.
>> A result of NaN means that the result of the calculation is undefined,
>> so the value is "unknown".
>
> Incorrect. NANs are not "unknowns", or missing values.
You're contradicting yourself here.
| From | Grant Edwards <invalid@invalid.invalid> |
|---|---|
| Date | 2011-06-03 14:52 +0000 |
| Message-ID | <isasfm$inl$1@reader1.panix.com> |
| In reply to | #6884 |
On 2011-06-02, Nobody <nobody@nowhere.com> wrote:
> On Thu, 02 Jun 2011 09:54:30 +0000, Steven D'Aprano wrote:
>
>>> Exceptions allow you to write more natural code by ignoring the
>>> awkward cases. E.g. writing "x * y + z" rather than first determining
>>> whether "x * y" is even defined then using a conditional.
>>
>> You've quoted me out of context. I wasn't asking for justification
>> for exceptions in general. There's no doubt that they're useful. We
>> were specifically talking about NAN == NAN raising an exception
>> rather than returning False.
>
> It's arguable that NaN itself simply shouldn't exist in Python; if
> the FPU ever generates a NaN, Python should raise an exception at
> that point.
Sorry, I just don't "get" that argument. I depend on compliance with
IEEE-754, and I find the current NaN behavior very useful, and
labor-saving.
> But given that NaNs propagate in almost the same manner as
> exceptions, you could "optimise" this by treating a NaN as a
> special-case implementation of exceptions, and turn it into a real
> exception at the point where you can no longer use a NaN (e.g. when
> using a comparison operator).
>
> This would produce the same end result as raising an exception
> immediately, but would reduce the number of isnan() tests.
I've never found the number of isnan() checks in my code to be an
issue -- there just aren't that many of them, and when they are there,
they provide an easy to read and easy to maintain way to handle things.
> I mean undefined, in the sense that 0/0 is undefined
But 0.0/0.0 _is_ defined. It's NaN. ;)
> (I note that Python actually raises an exception for "0.0/0.0").
IMHO, that's a bug. IEEE-754 states explicitly that 0.0/0.0 is NaN.
Python claims it implements IEEE-754. Python got it wrong.
> Also, note that the "convenience" of NaN (e.g. not propagating from
> the untaken branch of a conditional) is only available for
> floating-point types. If it's such a good idea, why don't we have it
> for other types?
> The definition is entirely arbitrary.
I don't agree, but even if it was entirely arbitrary, that doesn't make
the decision meaningless. IEEE-754 says it's True, and standards
compliance is valuable. Each country's decision to drive on the
right/left side of the road is entirely arbitrary, but once decided
there's a huge benefit to everybody following the rule.
> You could just as easily define that (NaN == NaN) is True. You could
> just as easily define that "1 + NaN" is 27.
I don't think that would be "just as easy" to use.
> Actually, "NaN == NaN" makes more sense than "NaN != NaN", as the
> former upholds the equivalence axioms
You seem to be talking about reals. We're talking about floats.
> If you're going to argue that "NaN == NaN" should be False on the
> basis that the values are sentinels for unrepresentable values (which
> may be *different* unrepresentable values), it follows that "NaN !=
> NaN" should also be False for the same reason.
Mostly I just want Python to follow the IEEE-754 standard [which I
happen to find to be very well thought out and almost always behaves
in a practical, useful manner].
> If you want to argue that "NaN == NaN" should be False, then do so.
> Simply asserting that it should be False won't suffice (nor will
> citing the IEEE FP standard *unless* you're arguing that "because the
> standard says so" is the only reason required).
For those of us who have to accomplish real work and interface with
real devices "because the standard says so" is actually a darned good
reason. Years of experience have also shown me that it's a very
practical decision.
> If anything, it has proven to be a major nuisance. It takes a lot of
> effort to create (or even specify) code which does the right thing in
> the presence of NaNs.
That's not been my experience. NaNs save a _huge_ amount of effort
compared to having to pass value+status info around throughout complex
calculations.
> I'm not aware that they made any conclusions about Python.
They made some very informed (and IMO valid) conclusions about
scientific computing using binary floating point arithmetic. Those
conclusions apply largely to Python.
--
Grant
| From | Chris Torek <nospam@torek.net> |
|---|---|
| Date | 2011-06-03 17:52 +0000 |
| Message-ID | <isb70o054@news5.newsguy.com> |
| In reply to | #6947 |
>On 2011-06-02, Nobody <nobody@nowhere.com> wrote:
>> (I note that Python actually raises an exception for "0.0/0.0").
In article <isasfm$inl$1@reader1.panix.com>
Grant Edwards <invalid@invalid.invalid> wrote:
>IMHO, that's a bug. IEEE-754 states explicitly that 0.0/0.0 is NaN.
>Python claims it implements IEEE-754. Python got it wrong.
Indeed -- or at least, inconsistent. (Again I would not mind at
all if Python had "raise exception on NaN-result" mode *as well
as* "quietly make NaN", perhaps using signalling vs quiet NaN to
tell them apart in most cases, plus some sort of floating-point
context control, for instance.)
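The inconsistency is easy to reproduce. The following sketch (an editorial illustration, assuming a standard CPython build) shows float division by zero raising, while other invalid operations quietly return NaN.

```python
import math

inf = float("inf")

# Division of zero by zero raises rather than returning NaN.
try:
    0.0 / 0.0
    raised = False
except ZeroDivisionError as exc:
    raised = True
    print("0.0 / 0.0 raised:", exc)

# Other invalid operations produce a quiet NaN with no exception.
difference = inf - inf
product = inf * 0.0
print(difference, product)
print(math.isnan(difference), math.isnan(product))
```

So the same class of "invalid operation" is reported in two different ways depending on which operator triggered it.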
>> Also, note that the "convenience" of NaN (e.g. not propagating from
>> the untaken branch of a conditional) is only available for
>> floating-point types. If it's such a good idea, why don't we have it
>> for other types?
Mostly because for integers it's "too late" and there is no standard
for it. For others, well:
>>> import decimal
>>> decimal.Decimal('nan')
Decimal("NaN")
>>> _ + 1
Decimal("NaN")
>>> decimal.setcontext(decimal.ExtendedContext)
>>> print decimal.Decimal(1) / 0
Infinity
>>> [etc]
(Note that you have to set the decimal context to one that does
not produce a zero-divide exception, such as the pre-loaded
decimal.ExtendedContext. On my one Python 2.7 system -- all the
rest are earlier versions, with 2.5 the highest I can count on,
and that only by upgrading it on the really old work systems --
I note that fractions.Fraction(0,0) raises a ZeroDivisionError,
and there is no fractions.ExtendedContext or similar.)
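The decimal module already offers both modes through its context traps. A minimal sketch, written for Python 3 (so `print` is a function, unlike the session above) and using `decimal.localcontext` so the global context is left alone:

```python
import decimal
from decimal import Decimal

# Quiet mode: ExtendedContext has no traps enabled, so invalid
# operations return NaN and division by zero returns Infinity.
with decimal.localcontext(decimal.ExtendedContext):
    quiet_result = Decimal(0) / Decimal(0)
    infinite_result = Decimal(1) / Decimal(0)
    print(quiet_result, infinite_result)

# Trapping mode: with the InvalidOperation trap set, the same
# division raises instead of returning NaN.
with decimal.localcontext() as ctx:
    ctx.traps[decimal.InvalidOperation] = True
    try:
        Decimal(0) / Decimal(0)
        trapped = False
    except decimal.InvalidOperation:
        trapped = True

print(trapped)
```

This is roughly the "raise exception on NaN-result mode as well as quietly make NaN" arrangement described above, but only for Decimal; float has no equivalent context.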
>> The definition is entirely arbitrary.
>
>I don't agree, but even if it was entirely arbitrary, that doesn't make
>the decision meaningless. IEEE-754 says it's True, and standards
>compliance is valuable. Each country's decision to drive on the
>right/left side of the road is entirely arbitrary, but once decided
>there's a huge benefit to everybody following the rule.
This analogy perhaps works better than expected. Whenever I swap
between Oz or NZ and the US-of-A, I have a brief mental clash that,
if I am not careful, could result in various bad things. :-)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: gmail (figure it out) http://web.torek.net/torek/index.html
| From | Grant Edwards <invalid@invalid.invalid> |
|---|---|
| Date | 2011-06-06 13:54 +0000 |
| Message-ID | <isim6k$1ic$1@reader1.panix.com> |
| In reply to | #6957 |
On 2011-06-03, Chris Torek <nospam@torek.net> wrote:
>>> The definition is entirely arbitrary.
>>
>>I don't agree, but even if it was entirely arbitrary, that doesn't make
>>the decision meaningless. IEEE-754 says it's True, and standards
>>compliance is valuable. Each country's decision to drive on the
>>right/left side of the road is entirely arbitrary, but once decided
>>there's a huge benefit to everybody following the rule.
>
> This analogy perhaps works better than expected. Whenever I swap
> between Oz or NZ and the US-of-A, I have a brief mental clash that,
> if I am not careful, could result in various bad things. :-)
I find that I do mostly OK driving "on the wrong side of the road"
[except for the constant windshield/turn-signal mixups], but I have a
horrible time as a pedestrian.
--
Grant Edwards, grant.b.edwards at gmail.com
Yow! I had a lease on an OEDIPUS COMPLEX back in '81 ...
| From | Nobody <nobody@nowhere.com> |
|---|---|
| Date | 2011-06-04 00:29 +0100 |
| Message-ID | <pan.2011.06.03.23.29.04.984000@nowhere.com> |
| In reply to | #6947 |
On Fri, 03 Jun 2011 14:52:39 +0000, Grant Edwards wrote:
>> It's arguable that NaN itself simply shouldn't exist in Python; if
>> the FPU ever generates a NaN, Python should raise an exception at
>> that point.
>
> Sorry, I just don't "get" that argument. I depend on compliance with
> IEEE-754, and I find the current NaN behavior very useful, and
> labor-saving.
If you're "fluent" in IEEE-754, then you won't find its behaviour
unexpected. OTOH, if you approach the issue without preconceptions,
you're likely to notice that you effectively have one exception mechanism
for floating-point and another for everything else.
>> But given that NaNs propagate in almost the same manner as
>> exceptions, you could "optimise" this by treating a NaN as a
>> special-case implementation of exceptions, and turn it into a real
>> exception at the point where you can no longer use a NaN (e.g. when
>> using a comparison operator).
>>
>> This would produce the same end result as raising an exception
>> immediately, but would reduce the number of isnan() tests.
>
> I've never found the number of isnan() checks in my code to be an
> issue -- there just aren't that many of them, and when they are there,
> it provides an easy to read and easy to maintain way to handle things.
I think that you misunderstood. What I was saying here was that, if you
wanted exception-on-NaN behaviour from Python, the interpreter wouldn't
need to call isnan() on every value received from the FPU, but could rely
upon NaN-propagation and only call it at places where a NaN might
disappear (e.g. comparisons).
>> I mean undefined, in the sense that 0/0 is undefined
>
> But 0.0/0.0 _is_ defined. It's NaN. ;)
Mathematically, it's undefined.
>> (I note that Python actually raises an exception for "0.0/0.0").
>
> IMHO, that's a bug. IEEE-754 states explicitly that 0.0/0.0 is NaN.
> Python claims it implements IEEE-754. Python got it wrong.
But then IEEE-754 considers integers and floats to be completely different
beasts, while Python makes some effort to maintain a unified "numeric"
interface. If you really want IEEE-754 to-the-letter, that's undesirable,
although I'd question the choice of Python in such situations.
>> The definition is entirely arbitrary.
>
> I don't agree, but even if it was entirely arbitrary, that doesn't make
> the decision meaningless. IEEE-754 says it's True, and standards
> compliance is valuable.
True, but so are other things. People with a background in mathematics (as
opposed to arithmetic and numerical methods) would probably consider
following the equivalence axioms to be valuable. Someone more used to
Python than IEEE-754 might consider following the "x is y => x == y" axiom
to be valuable.
As for IEEE-754 saying that it's True: they only really had two choices:
either it's True or it's False. NaNs provide "exceptions" even if the
hardware or the language lacks them, but that falls down once you leave
the scope of floating-point. It wouldn't have been within IEEE-754's ambit
to declare that comparing NaNs should return NaB (Not A Boolean).
>> Actually, "NaN == NaN" makes more sense than "NaN != NaN", as the
>> former upholds the equivalence axioms
>
> You seem to be talking about reals. We're talking about floats.
Floats are supposed to approximate reals. They're also a Python data type,
and should make some effort to fit in with the rest of the language.
>> If anything, it has proven to be a major nuisance. It takes a lot of
>> effort to create (or even specify) code which does the right thing in
>> the presence of NaNs.
>
> That's not been my experience. NaNs save a _huge_ amount of effort
> compared to having to pass value+status info around throughout complex
> calculations.
That's what exceptions are for. NaNs probably save a huge amount of effort
in languages which lack exceptions, but that isn't applicable to Python.
In Python, they result in floats not "fitting in".
Let's remember that the thread started with an oddity relating to using
floats as dictionary keys, which mostly works but fails for NaN because of
the (highly unusual) property that "x == x" is False for NaNs.
Why did the Python developers choose this behaviour? It's quite likely
that they didn't choose it, but just overlooked the fact that NaN creates
this corner-case which breaks code which works for every other primitive
type except floats and even every other float except NaN.
In any case, I should probably re-iterate at this point that I'm not
actually arguing *for* exception-on-NaN or NaN==NaN or similar, just
pointing out that IEEE-754 is not the One True Approach and that other
approaches are not necessarily heresy and may have some merit.
To go back to the point where I entered this thread:
>>> The correct answer to "nan == nan" is to raise an exception,
>>> because you have asked a question for which the answer is neither True
>>> nor False.
>>
>> Wrong.
>
> That's overstating it.
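For readers who have not seen the dictionary-key oddity, here is a short sketch of it (an editorial illustration, assuming CPython, whose containers test identity before falling back to ==):

```python
nan = float("nan")
d = {nan: "first"}

# The very same object is found, because identity is checked before ==.
same_object_found = nan in d

# A different NaN object is never found, because nan == nan is False.
other_object_found = float("nan") in d

# Assigning through a new NaN object therefore adds a second NaN key
# instead of overwriting the first.
d[float("nan")] = "second"

print(same_object_found, other_object_found, len(d))  # True False 2
```

The first lookup only succeeds because of the identity shortcut, which is why "x is x" being True while "x == x" is False matters in practice.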
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2011-06-04 09:51 +1000 |
| Message-ID | <mailman.2444.1307145069.9059.python-list@python.org> |
| In reply to | #6981 |
On Sat, Jun 4, 2011 at 9:29 AM, Nobody <nobody@nowhere.com> wrote:
> Floats are supposed to approximate reals. They're also a Python
> data type, and should make some effort to fit in with the rest of
> the language.
>
That's what I thought a week ago. But that's not really true. Floats
are supposed to hold non-integral values, but the data type is "IEEE
754 floating point", not "real number". There are several ways to store
real numbers, and not one of them is (a) perfectly accurate, or (b)
plausibly fast to calculate. Using rationals (fractions) with infinite
range leads to exponential performance costs, and still doesn't
properly handle irrationals like pi. And if you cap the denominator to
a power of 2 and cap the length of the mantissa, err I mean numerator,
then you have IEEE 754 floating point. Python offers you a way to store
and manipulate floating point numbers, not real numbers.
Chris Angelico
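The "rational with a power-of-2 denominator" description can be seen directly with `float.as_integer_ratio()`. A small sketch (editorial illustration; variable names are arbitrary):

```python
from fractions import Fraction

# Every finite float is exactly numerator / denominator,
# where the denominator is a power of two.
numerator, denominator = (0.1).as_integer_ratio()
is_power_of_two = (denominator & (denominator - 1)) == 0
print(numerator, denominator, is_power_of_two)

# The float written as 0.1 is therefore not the real number 1/10.
exactly_one_tenth = Fraction(0.1) == Fraction(1, 10)
print(exactly_one_tenth)  # False
```

The stored value is the nearest such fraction to 1/10, which is why it compares unequal to the exact rational.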
| From | rusi <rustompmody@gmail.com> |
|---|---|
| Date | 2011-06-04 00:52 -0700 |
| Message-ID | <2350f7e6-474f-455d-8c97-6a758de63179@p9g2000prh.googlegroups.com> |
| In reply to | #6981 |
On Jun 4, 4:29 am, Nobody <nob...@nowhere.com> wrote:
> On Fri, 03 Jun 2011 14:52:39 +0000, Grant Edwards wrote:
> >> It's arguable that NaN itself simply shouldn't exist in Python; if
> >> the FPU ever generates a NaN, Python should raise an exception at
> >> that point.
>
> If you're "fluent" in IEEE-754, then you won't find its behaviour
> unexpected. OTOH, if you approach the issue without preconceptions,
> you're likely to notice that you effectively have one exception mechanism
> for floating-point and another for everything else.
Three actually: None, nan and exceptions.
Furthermore, in boolean contexts nan behaves like True whereas None
behaves like False.
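The truthiness difference is quick to confirm (editorial sketch; a float is falsy only when it equals zero, and NaN does not):

```python
nan = float("nan")

nan_is_truthy = bool(nan)     # True: nan != 0.0
none_is_truthy = bool(None)   # False

print(nan_is_truthy, none_is_truthy)  # True False

# A consequence: "value or default" does not replace a NaN.
fallback = nan or 0.0
print(fallback != fallback)  # True, so fallback is still the NaN
```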
| From | Nobody <nobody@nowhere.com> |
|---|---|
| Date | 2011-06-04 20:29 +0100 |
| Message-ID | <pan.2011.06.04.19.29.44.828000@nowhere.com> |
| In reply to | #7001 |
On Sat, 04 Jun 2011 00:52:17 -0700, rusi wrote:
>> If you're "fluent" in IEEE-754, then you won't find its behaviour
>> unexpected. OTOH, if you approach the issue without preconceptions,
>> you're likely to notice that you effectively have one exception mechanism
>> for floating-point and another for everything else.
>
> Three actually: None, nan and exceptions
None isn't really an exception; at least, it shouldn't be used like that.
Exceptions are for conditions which are in some sense "exceptional". Cases
like dict.get() returning None when the key isn't found are meant for the
situation where the key not existing is unexceptional. If you "expect" the
key to exist, you'd use dict[key] instead (and get an exception if it
doesn't).
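The two lookup styles side by side, as a minimal sketch (the dictionary and its keys are made up for illustration):

```python
ages = {"alice": 30}

# Missing key is unexceptional: get() returns None (or a supplied default).
maybe_age = ages.get("bob")
default_age = ages.get("bob", 0)
print(maybe_age, default_age)  # None 0

# Missing key is unexpected: indexing raises KeyError.
try:
    ages["bob"]
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)
```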