Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #18937 > unrolled thread
| Started by | Ethan Furman <ethan@stoneleaf.us> |
|---|---|
| First post | 2012-01-13 11:04 -0800 |
| Last post | 2012-01-16 11:07 +0000 |
| Articles | 6 — 6 participants |
Back to article view | Back to comp.lang.python
NaN, Null, and Sorting Ethan Furman <ethan@stoneleaf.us> - 2012-01-13 11:04 -0800
Re: NaN, Null, and Sorting Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-01-14 04:54 +0000
Re: NaN, Null, and Sorting jmfauth <wxjmfauth@gmail.com> - 2012-01-13 23:43 -0800
Re: NaN, Null, and Sorting Eelco <hoogendoorn.eelco@gmail.com> - 2012-01-16 02:22 -0800
Re: NaN, Null, and Sorting Chris Angelico <rosuav@gmail.com> - 2012-01-16 21:57 +1100
Re: NaN, Null, and Sorting Robert Kern <robert.kern@gmail.com> - 2012-01-16 11:07 +0000
| From | Ethan Furman <ethan@stoneleaf.us> |
|---|---|
| Date | 2012-01-13 11:04 -0800 |
| Subject | NaN, Null, and Sorting |
| Message-ID | <mailman.4725.1326484286.27778.python-list@python.org> |
With NaN, it is possible to get a list that will not properly sort:
--> NaN = float('nan')
--> spam = [1, 2, NaN, 3, NaN, 4, 5, 7, NaN]
--> sorted(spam)
[1, 2, nan, 3, nan, 4, 5, 7, nan]
I'm constructing a Null object with the semantics that if the returned
object is Null, it's actual value is unknown.
From a purist point of view if it is unknown then comparison results
are also unknown since the actual value might be greater, lesser, or the
same as the value being compared against.
From a practical point of view a list with Nulls scattered throughout
is a pain in the backside.
So I am strongly leaning towards implementing the comparisons such that
Null objects are less than other objects so they will always sort together.
Thoughts/advice/criticisms/etc?
~Ethan~
[toc] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-01-14 04:54 +0000 |
| Message-ID | <4f110a95$0$29988$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #18937 |
On Fri, 13 Jan 2012 11:04:48 -0800, Ethan Furman wrote:
> With NaN, it is possible to get a list that will not properly sort:
>
> --> NaN = float('nan')
> --> spam = [1, 2, NaN, 3, NaN, 4, 5, 7, NaN] --> sorted(spam)
> [1, 2, nan, 3, nan, 4, 5, 7, nan]
>
> I'm constructing a Null object with the semantics that if the returned
> object is Null, it's actual value is unknown.
>
> From a purist point of view if it is unknown then comparison results
> are also unknown since the actual value might be greater, lesser, or the
> same as the value being compared against.
From a purist point of view, NANs are unordered with respect to numbers,
and so one of two behaviours should occur:
(1) nan OP x should raise an exception, for all comparison operators
except == and !=
(2) nan OP x should return False for all OPs except !=
I believe the current version of the standard supports operators for both
sets of behaviour; the 1990s version of Apple's numeric framework (SANE)
included both.
I think Python chooses the second behaviour, although it may be version
and platform dependent. This is from Python 2.6:
>>> float('nan') < 0
False
>>> float('nan') > 0
False
I would expect the same behaviour for your Null objects. But as you say:
> From a practical point of view a list with Nulls scattered throughout
> is a pain in the backside.
And this is why sorting should be defined in terms of a separate sorting
operator, not < or >, so that lists containing unordered values like NANs,
Nulls, and complex numbers, can be sorted. "Sorted" is a property of the
list, not the values within the list.
> So I am strongly leaning towards implementing the comparisons such that
> Null objects are less than other objects so they will always sort
> together.
>
> Thoughts/advice/criticisms/etc?
Possibly the least-worst solution.
--
Steven
[toc] | [prev] | [next] | [standalone]
| From | jmfauth <wxjmfauth@gmail.com> |
|---|---|
| Date | 2012-01-13 23:43 -0800 |
| Message-ID | <3b5278a0-9612-4c34-a743-df463f5bedc4@w4g2000vbc.googlegroups.com> |
| In reply to | #18937 |
On 13 jan, 20:04, Ethan Furman <et...@stoneleaf.us> wrote:
> With NaN, it is possible to get a list that will not properly sort:
>
> --> NaN = float('nan')
> --> spam = [1, 2, NaN, 3, NaN, 4, 5, 7, NaN]
> --> sorted(spam)
> [1, 2, nan, 3, nan, 4, 5, 7, nan]
>
> I'm constructing a Null object with the semantics that if the returned
> object is Null, it's actual value is unknown.
>
Short answer.
- NaN != NA()
- I find the actual implementation (Py3.2) quite satisfying. (M.
Dickinson's work)
jmf
[toc] | [prev] | [next] | [standalone]
| From | Eelco <hoogendoorn.eelco@gmail.com> |
|---|---|
| Date | 2012-01-16 02:22 -0800 |
| Message-ID | <67bd5e6b-a332-4d13-aad3-8de88b218ac5@z19g2000vbe.googlegroups.com> |
| In reply to | #18937 |
On 13 jan, 20:04, Ethan Furman <et...@stoneleaf.us> wrote:
> With NaN, it is possible to get a list that will not properly sort:
>
> --> NaN = float('nan')
> --> spam = [1, 2, NaN, 3, NaN, 4, 5, 7, NaN]
> --> sorted(spam)
> [1, 2, nan, 3, nan, 4, 5, 7, nan]
>
> I'm constructing a Null object with the semantics that if the returned
> object is Null, it's actual value is unknown.
>
> From a purist point of view if it is unknown then comparison results
> are also unknown since the actual value might be greater, lesser, or the
> same as the value being compared against.
>
> From a practical point of view a list with Nulls scattered throughout
> is a pain in the backside.
>
> So I am strongly leaning towards implementing the comparisons such that
> Null objects are less than other objects so they will always sort together.
>
> Thoughts/advice/criticisms/etc?
>
> ~Ethan~
My suggestion would be thus: nans/nulls are unordered; sorting them is
fundamentally an ill defined notion. What you want, conceptually, is a
sorted list of the sortable entries, and a seperate list of the
unsorted entries. Translated into code, the most pure solution would
be to filter out the nanas/nulls in their own list first, and then
sort the rest. If the interface demands it, you can concatenate the
lists afterwards, but probably it is most convenient to keep them in
seperate lists.
Perhaps arbitrarily defining the ordering of nulls/nans is slightly
more efficient than the above, but it should not make a big
difference, and in terms of purity its no contest.
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2012-01-16 21:57 +1100 |
| Message-ID | <mailman.4790.1326711481.27778.python-list@python.org> |
| In reply to | #19036 |
On Mon, Jan 16, 2012 at 9:22 PM, Eelco <hoogendoorn.eelco@gmail.com> wrote: > What you want, conceptually, is a > sorted list of the sortable entries, and a seperate list of the > unsorted entries. Translated into code, the most pure solution would > be to filter out the nanas/nulls in their own list first, and then > sort the rest. If the interface demands it, you can concatenate the > lists afterwards, but probably it is most convenient to keep them in > seperate lists. So... you split it into two lists, sort the two lists (one of which can't be sorted), and then concatenate them. Sounds like the quicksort algorithm. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Robert Kern <robert.kern@gmail.com> |
|---|---|
| Date | 2012-01-16 11:07 +0000 |
| Message-ID | <mailman.4791.1326712065.27778.python-list@python.org> |
| In reply to | #19036 |
On 1/16/12 10:57 AM, Chris Angelico wrote: > On Mon, Jan 16, 2012 at 9:22 PM, Eelco<hoogendoorn.eelco@gmail.com> wrote: >> What you want, conceptually, is a >> sorted list of the sortable entries, and a seperate list of the >> unsorted entries. Translated into code, the most pure solution would >> be to filter out the nanas/nulls in their own list first, and then >> sort the rest. If the interface demands it, you can concatenate the >> lists afterwards, but probably it is most convenient to keep them in >> seperate lists. > > So... you split it into two lists, sort the two lists (one of which > can't be sorted), and then concatenate them. Sounds like the quicksort > algorithm. Not at all. The "split it into two lists" steps are entirely different in what Eelco suggested and quicksort. It's misleading to attempt to describe both using the same words. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
[toc] | [prev] | [standalone]
Back to top | Article view | comp.lang.python
csiph-web