| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Subject | Re: sort order for strings of digits |
| Date | 2012-10-31 14:17 -0400 |
| Organization | > Bestiaria Support Staff < |
| References | <k6rfdu$rgn$1@news.albasani.net> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.3124.1351707476.27098.python-list@python.org> |
On Wed, 31 Oct 2012 15:17:14 +0000, djc <djc@kangoo.invalid> declaimed
the following in gmane.comp.python.general:
>
> TODO 2012-10-22: sort order numbers first then alphanumeric
> >>> n
> ('1', '10', '101', '3', '40', '31', '13', '2', '2000')
> >>> s
> ('a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4')
>
> >>> sorted(n)
> ['1', '10', '101', '13', '2', '2000', '3', '31', '40']
> >>> sorted(s)
> ['1a', '222 bb', 'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']
> >>> sorted(n+s)
> ['1', '10', '101', '13', '1a', '2', '2000', '222 bb', '3', '31', '40',
> 'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']
>
Neither your subject line nor the above samples are "sorting
'numbers'"... They are sorting STRINGS whose characters happen to be
the printable glyphs of decimal digits.
The above should work in Python 3 as all the data appears as strings
(Unicode strings, in Python 3).
However, you won't get a "numeric order" for the entries that are
numbers -- the sort is lexicographic (the sessions below use 2.7)... Is
the requirement that strings of digits are to be sorted AS integers
rather than lexicographically?
>>> data = [ '1', '10', '101', '3', '40', '31', '13', '2', '2000', '0015',
... 'a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4' ]
>>> sorted(data)
['0015', '1', '10', '101', '13', '1a', '2', '2000', '222 bb', '3', '31',
'40', 'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']
>>> data = [ 1, 10, 101, 3, 40, 31, 13, 2, 2000, 0015,
... 'a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4' ]
>>> sorted(data)
[1, 2, 3, 10, 13, 13, 31, 40, 101, 2000, '1a', '222 bb', 'a', 'a1',
'ab', 'acd', 'b a 4', 'bcd']
>>> #note how "0015" sorted first, but 0015 is an octal value equal to decimal 13
>>> #also note how the all-string list sorted character by character
>>> #but the integers in the mixed list sorted by ascending value
>>>
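To make the '0015' versus 0015 difference concrete, here is a small sketch in Python 3 syntax (an assumption on my part, since the session above is 2.7, where the bare literal 0015 is accepted as octal; Python 3 rejects it and wants 0o15). The variable names are only for illustration.

```python
# Python 3 syntax: a bare 0015 literal is a SyntaxError there.
octal_literal = 0o15                 # octal 15, i.e. decimal 13
from_string = int('0015')            # int() parses strings as base 10 by default
from_string_base8 = int('0015', 8)   # explicitly ask for base 8

print(octal_literal, from_string, from_string_base8)
```

So the string '0015' converts to 15, while the octal literal is 13, which is why the two show up in different places in the sorted lists.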
>
>
> Possibly there is a better way but for Python 2.7 this gives the
> required result
>
> Python 2.7.3 (default, Sep 26 2012, 21:51:14)
>
> >>> sorted(int(x) if x.isdigit() else x for x in n+s)
> [1, 2, 3, 10, 13, 31, 40, 101, 2000, '1a', '222 bb', 'a', 'a1', 'ab',
> 'acd', 'b a 4', 'bcd']
>
This, however, is returning a mix of INTEGER and STRING. Is that
what is really wanted? Your input, I presume, was the "all string"
version -- shouldn't the output also be all string? (Okay -- that IS
your next item)
>
> [str(x) for x in sorted(int(x) if x.isdigit() else x for x in n+s)]
> ['1', '2', '3', '10', '13', '31', '40', '101', '2000', '1a', '222 bb',
> 'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']
>
>
> But not for Python 3
> Python 3.2.3 (default, Oct 19 2012, 19:53:16)
>
> >>> sorted(n+s)
> ['1', '10', '101', '13', '1a', '2', '2000', '222 bb', '3', '31', '40',
> 'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']
>
> >>> sorted(int(x) if x.isdigit() else x for x in n+s)
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> TypeError: unorderable types: str() < int()
> >>>
>
> The best I can think of is to split the input sequence into two lists,
> sort each and then join them.
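Why -- Python 3.x .sort() and sorted() still accept the optional key
keyword. (cmp was dropped in 3.0, but functools.cmp_to_key() wraps an
old-style comparison function for use as key.)
Just supply your own comparison function that handles mixed types
and you should be able to replicate the 2.7 process.
Something like the following (run on 2.7; on Python 3.x the
type(l) < type(r) tests would also need replacing, since type objects
are not orderable there)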
>>> def cmpr(l, r):
...     if type(l) == type(r):
...         if l < r: return -1
...         if l == r: return 0
...         if l > r: return 1
...     else:
...         if type(l) < type(r): return -1
...         if type(l) > type(r): return 1
...         # equal types are covered by the block above
...
>>> data = [ '1', '10', '101', '3', '40', '31', '13', '2', '2000', '0015',
... 'a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4' ]
>>> sorted(data, cmp=cmpr)
['0015', '1', '10', '101', '13', '1a', '2', '2000', '222 bb', '3', '31',
'40', 'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']
>>> data = [ 1, 10, 101, 3, 40, 31, 13, 2, 2000, 0015,
... 'a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4' ]
>>> sorted(data, cmp=cmpr)
[1, 2, 3, 10, 13, 13, 31, 40, 101, 2000, '1a', '222 bb', 'a', 'a1',
'ab', 'acd', 'b a 4', 'bcd']
>>>
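For Python 3.x, a minimal sketch of the same idea, under two assumptions of mine: the comparison function is passed through functools.cmp_to_key(), and mixed types are ordered by type NAME (so 'int' sorts before 'str'), because type objects themselves cannot be compared with < there. This is one possible adaptation, not necessarily the only or best one.

```python
from functools import cmp_to_key

def cmpr(l, r):
    """Old-style comparison: same types compare directly, others by type name."""
    if type(l) == type(r):
        return (l > r) - (l < r)          # -1, 0 or 1
    # Assumption: order differing types by their names ('int' < 'str').
    ln, rn = type(l).__name__, type(r).__name__
    return (ln > rn) - (ln < rn)

# 0o15 is the Python 3 spelling of the octal literal 0015 (decimal 13).
data = [1, 10, 101, 3, 40, 31, 13, 2, 2000, 0o15,
        'a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4']

result = sorted(data, key=cmp_to_key(cmpr))
print(result)
```

With this data the integers come out first in ascending order, followed by the strings in lexicographic order, matching the 2.7 result shown above.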
You could even expand the cmpr() function to incorporate the
conversion of decimal strings into numbers...
>>> def cmpr(l, r):
...     myL = l if type(l) == type("") and not l.isdigit() else int(l)
...     myR = r if type(r) == type("") and not r.isdigit() else int(r)
...     if type(myL) == type(myR):
...         if myL < myR: return -1
...         if myL == myR: return 0
...         if myL > myR: return 1
...     else:
...         if type(myL) < type(myR): return -1
...         if type(myL) > type(myR): return 1
...
>>> data = [ '1', '10', '101', '3', '40', '31', '13', '2', '2000', '0015',
... 'a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4' ]
>>> sorted(data, cmp=cmpr)
['1', '2', '3', '10', '13', '0015', '31', '40', '101', '2000', '1a',
'222 bb', 'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']
>>> data = [ 1, 10, 101, 3, 40, 31, 13, 2, 2000, 0015,
... 'a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4' ]
>>> sorted(data, cmp=cmpr)
[1, 2, 3, 10, 13, 13, 31, 40, 101, 2000, '1a', '222 bb', 'a', 'a1',
'ab', 'acd', 'b a 4', 'bcd']
>>>
NOTE how the version with all character data has sorted the pure
decimals numerically while still returning them as original strings.
HOWEVER -- this probably needs to be expanded if you might have floating
point STRINGS... Observe:
>>> data = [ '1', '10', '101', '3', '40', '31', '13', '2', '2000', '0015',
... 'a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4',
... '3.14153', '2.718E0' ]
>>> sorted(data, cmp=cmpr)
['1', '2', '3', '10', '13', '0015', '31', '40', '101', '2000', '1a',
'2.718E0', '222 bb', '3.14153', 'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']
>>> data = [ '1', '10', '101', '3', '40', '31', '13', '2', '2000', '0015',
... 'a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4',
... 3.14153, 2.718E0 ]
>>> sorted(data, cmp=cmpr)
['1', '2', 2.718, '3', 3.14153, '10', '13', '0015', '31', '40', '101',
'2000', '1a', '222 bb', 'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']
>>> data = [ 1, 10, 101, 3, 40, 31, 13, 2, 2000, 0015,
... 'a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4',
... 3.14153, 2.718E0]
>>> sorted(data, cmp=cmpr)
[1, 2, 2.718, 3, 3.14153, 10, 13, 13, 31, 40, 101, 2000, '1a', '222 bb',
'a', 'a1', 'ab', 'acd', 'b a 4', 'bcd']
>>>
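"3.14153" and "2.718E0" do not pass the .isdigit() test, so remain
as strings for sort purposes. The real floats, meanwhile, go through
int() and so compare as 3 and 2; they land just after the matching
whole numbers only because the sort is stable.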
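One way the floating point strings might be handled, sketched for Python 3 with a key function rather than cmp. The name sort_key and the choice of float() as the parser are my assumptions; note that float() also accepts things like 'inf', 'nan' and surrounding whitespace, which may or may not be wanted.

```python
def sort_key(item):
    """Numeric-looking items first (by value), everything else after (as text)."""
    try:
        return (0, float(item))       # numbers and numeric strings
    except (TypeError, ValueError):
        return (1, item)              # strings that do not parse as a number

data = ['1', '10', '101', '3', '40', '31', '13', '2', '2000', '0015',
        'a', 'ab', 'acd', 'bcd', '1a', 'a1', '222 bb', 'b a 4',
        '3.14153', '2.718E0']

result = sorted(data, key=sort_key)
print(result)
```

The leading 0 or 1 in each tuple keeps numbers and text in separate groups, so a number is never compared directly with a string, and the original strings are returned unchanged.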
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/