Path: csiph.com!usenet.pasdenom.info!weretis.net!feeder1.news.weretis.net!feeder.erje.net!eu.feeder.erje.net!newsfeed.xs4all.nl!newsfeed6.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
MIME-Version: 1.0
In-Reply-To: <508ad937$0$29967$c3e8da3$5496439d@news.astraweb.com>
References: <bd80bfd0-b423-418f-a338-fea626d50093@googlegroups.com> <k6cr4c$4md$1@ger.gmane.org> <5089F33A.8010804@mrabarnett.plus.com> <mailman.2886.1351238424.27098.python-list@python.org> <508ab917$0$29967$c3e8da3$5496439d@news.astraweb.com> <mailman.2898.1351269949.27098.python-list@python.org> <508ad937$0$29967$c3e8da3$5496439d@news.astraweb.com>
From: Devin Jeanpierre <jeanpierreda@gmail.com>
Date: Fri, 26 Oct 2012 15:17:37 -0400
Subject: Re: a.index(float('nan')) fails
To: "Steven D'Aprano" <steve+comp.lang.python@pearwood.info>
Content-Type: text/plain; charset=UTF-8
Cc: python-list@python.org
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.2904.1351279099.27098.python-list@python.org>
Lines: 57
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:32233

On Fri, Oct 26, 2012 at 2:40 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
>> The problem isn't with the associativity, it's with the equality
>> comparison. Replace "x == y" with "abs(x-y)<epsilon" for some epsilon
>> and all your statements fulfill people's expectations.
>
> O RYLY?
>
> Would you care to tell us which epsilon they should use?

I would assume some epsilon that bounds the error of their
computation. Which one to use would depend on the error propagation
their function incurs.

That said, I also disagree with the sentiment "all your statements
fulfill people's expectations". Comparing to within some epsilon means
that the results of mathematically unequal expressions will sometimes
be called equal, simply because they happen to land very close to each
other. Unless perhaps completely tight bounds on the error can be
achieved? I've never seen anyone do this, but maybe it's reasonable.
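
To make the false-positive case concrete, here is a minimal sketch.
The helper name approx_equal and the particular values are made up
for illustration only:

```python
def approx_equal(x, y, epsilon):
    """Absolute-tolerance comparison, i.e. abs(x-y) < epsilon."""
    return abs(x - y) < epsilon

a = 1.0
b = 1.0 + 1e-12   # mathematically a different number from a

print(a == b)                     # exact comparison tells them apart
print(approx_equal(a, b, 1e-9))   # the epsilon comparison does not
```

Whether lumping a and b together is a bug or a feature depends
entirely on what the surrounding computation was supposed to mean.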

> Hint: *whatever* epsilon you pick, there will be cases where that is
> either stupidly too small, stupidly too large, or one that degenerates to
> float equality. And you may not be able to tell if you have one of those
> cases or not.
>
> Here's a concrete example for you:
>
> What *single* value of epsilon should you pick such that the following
> two expressions evaluate correctly?
>
> sum([1e20, 0.1, -1e20, 0.1]*1000) == 200
> sum([1e20, 99.9, -1e20, 0.1]*1000) != 200

Some computations have unbounded error, such as computations where
catastrophic cancellation can occur. That doesn't mean all
computations do. For many computations, you can find a single epsilon
for which the comparison always returns True for things that "should"
be equal but, thanks to rounding, aren't exactly -- for example,
squaring a number does no worse than tripling the relative error
(about double from the input, plus one rounding for the multiply), so
if you square a number that was accurate to within machine epsilon,
and want to compare it to a constant, you can compare with relative
epsilon = 3*machine_epsilon.
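
A rough way to check the squaring bound, assuming IEEE 754 doubles.
The helper names and the sample inputs are my own, and Fraction is
only there to get an exact reference value to measure against:

```python
import sys
from fractions import Fraction

EPS = sys.float_info.epsilon  # machine epsilon, 2**-52 for doubles

def relative_error(approx, exact):
    """Exact relative error of a float against a rational reference."""
    return abs(Fraction(approx) - exact) / abs(exact)

def close_relative(value, expected, rel_eps):
    """Relative-tolerance comparison of two floats."""
    return abs(value - expected) <= rel_eps * abs(expected)

# Hypothetical "true" inputs that are not exactly representable.
true_values = [Fraction(1, 3), Fraction(2, 7), Fraction(10, 9),
               Fraction(123456789, 1000)]

worst = Fraction(0)
for t in true_values:
    x = float(t)       # correctly rounded, so within EPS of t
    squared = x * x    # the multiply adds one more rounding
    worst = max(worst, relative_error(squared, t * t))
    # Comparing against the constant with the widened tolerance works.
    assert close_relative(squared, float(t * t), 3 * EPS)

print(float(worst / EPS))   # worst observed error, in units of EPS
```

This only samples a few inputs, so it illustrates the bound rather
than proving it.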

I'm not sure how commonly this occurs in real life, because I'm not a
numerical programmer. All I know is that your example is good, but
shows a not-universally-applicable problem.
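
For the record, here is a sketch of what the two quoted sums do under
plain left-to-right float addition. I use an explicit loop (naive_sum
is my own name) rather than the built-in sum(), since as far as I know
newer CPython releases make sum() more accurate for floats, which
would hide the effect:

```python
import math

def naive_sum(values):
    """Plain left-to-right float addition, no error compensation."""
    total = 0.0
    for v in values:
        total += v
    return total

first = [1e20, 0.1, -1e20, 0.1] * 1000
second = [1e20, 99.9, -1e20, 0.1] * 1000

# Each small term added to 1e20 is absorbed completely, so only the
# last 0.1 of every group of four survives the naive loop.
print(naive_sum(first), naive_sum(second))

# math.fsum tracks the lost low-order bits and gets close to the
# mathematically expected totals.
print(math.fsum(first), math.fsum(second))
```

The naive loop ends at 0.1 for both lists, so no single epsilon can
make the first compare equal to 200 and the second compare unequal.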

It is, however, still pretty applicable and worth noting, so I'm not
unhappy you brought it up. For example, how large can the absolute
error of the sin function applied to a float be? Answer: as large as
2, and the relative error can be arbitrarily large. (Reason: the
absolute error of a float scales with its magnitude, but the period
of the sin function does not.)
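
A small sketch of the sin case, assuming IEEE 754 doubles and Python
3.9+ for math.nextafter. The choice of 1e22 is arbitrary; any large
enough float shows the same thing:

```python
import math

x = 1e22
neighbour = math.nextafter(x, math.inf)   # next float above x

# Adjacent floats near 1e22 are far more than one period of sin apart,
# so an input error of a single ulp can move the result anywhere in
# [-1, 1].
gap = neighbour - x
print(gap, gap / (2 * math.pi))
print(math.sin(x), math.sin(neighbour))

# Relative error: math.pi is only the float nearest to pi, so sin of
# it is a tiny nonzero number, while the true sin(pi) is exactly 0.
print(math.sin(math.pi))
```

The exact values printed by sin depend on the platform's math
library, which is why none are quoted here.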

(In case you can't tell, I have only studied this stuff as a student. :P)

-- Devin