Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #86345 > unrolled thread
| Started by | Roy Smith <roy@panix.com> |
|---|---|
| First post | 2015-02-24 13:34 -0800 |
| Last post | 2015-02-26 09:33 +0200 |
| Articles | 20 on this page of 44 — 22 participants |
Back to article view | Back to comp.lang.python
Bug in timsort!? Roy Smith <roy@panix.com> - 2015-02-24 13:34 -0800
Re: Bug in timsort!? Zachary Ware <zachary.ware+pylist@gmail.com> - 2015-02-24 15:40 -0600
Re: Bug in timsort!? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-24 21:49 +0000
Re: Bug in timsort!? sohcahtoa82@gmail.com - 2015-02-24 14:36 -0800
Re: Bug in timsort!? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-25 00:38 +0000
Re: Bug in timsort!? Grant Edwards <invalid@invalid.invalid> - 2015-02-24 22:45 +0000
Re: Bug in timsort!? Ian Kelly <ian.g.kelly@gmail.com> - 2015-02-24 15:59 -0700
Re: Bug in timsort!? Robert Kern <robert.kern@gmail.com> - 2015-02-25 11:34 +0000
Re: Bug in timsort!? Grant Edwards <invalid@invalid.invalid> - 2015-02-25 15:49 +0000
Re: Bug in timsort!? Chris Angelico <rosuav@gmail.com> - 2015-02-25 10:33 +1100
Re: Bug in timsort!? MRAB <python@mrabarnett.plus.com> - 2015-02-24 23:38 +0000
Re: Bug in timsort!? Skip Montanaro <skip.montanaro@gmail.com> - 2015-02-24 17:50 -0600
Re: Bug in timsort!? Chris Angelico <rosuav@gmail.com> - 2015-02-25 11:07 +1100
Re: Bug in timsort!? Paddy <paddy3118@gmail.com> - 2015-02-25 01:10 -0800
Re: Bug in timsort!? Chris Angelico <rosuav@gmail.com> - 2015-02-25 21:11 +1100
Re: Bug in timsort!? Chris Kaynor <ckaynor@zindagigames.com> - 2015-02-24 16:38 -0800
Re: Bug in timsort!? Terry Reedy <tjreedy@udel.edu> - 2015-02-24 21:52 -0500
Re: Bug in timsort!? Dave Angel <davea@davea.name> - 2015-02-24 21:41 -0500
Re: Bug in timsort!? Ned Deily <nad@acm.org> - 2015-02-25 01:04 -0800
Re: Bug in timsort!? Sturla Molden <sturla.molden@gmail.com> - 2015-02-25 14:58 +0100
Re: Bug in timsort!? alister <alister.nospam.ware@ntlworld.com> - 2015-02-25 15:03 +0000
Re: Bug in timsort!? Zachary Ware <zachary.ware+pylist@gmail.com> - 2015-02-25 09:23 -0600
Re: Bug in timsort!? Terry Reedy <tjreedy@udel.edu> - 2015-02-25 16:08 -0500
Re: Bug in timsort!? Roy Smith <roy@panix.com> - 2015-02-25 19:12 -0500
Re: Bug in timsort!? Jonas Wielicki <jonas@wielicki.name> - 2015-02-25 15:07 +0100
Re: Bug in timsort!? Chris Angelico <rosuav@gmail.com> - 2015-02-26 01:33 +1100
Re: Bug in timsort!? Sturla Molden <sturla.molden@gmail.com> - 2015-02-25 16:05 +0100
Re: Bug in timsort!? Chris Angelico <rosuav@gmail.com> - 2015-02-26 02:11 +1100
Re: Bug in timsort!? Peter Otten <__peter__@web.de> - 2015-02-25 17:04 +0100
Re: Bug in timsort!? Mario Figueiredo <marfig@gmail.com> - 2015-02-25 18:22 +0100
Re: Bug in timsort!? Sturla Molden <sturla.molden@gmail.com> - 2015-02-25 19:41 +0100
Re: Bug in timsort!? Jonas Wielicki <jonas@wielicki.name> - 2015-02-25 19:46 +0100
Re: Bug in timsort!? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-25 20:07 +0000
Re: Bug in timsort!? Sturla Molden <sturla.molden@gmail.com> - 2015-02-25 17:44 +0100
Re: Bug in timsort!? Mario Figueiredo <marfig@gmail.com> - 2015-02-25 18:29 +0100
Re: Bug in timsort!? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-26 10:13 +1100
Re: Bug in timsort!? Ian Kelly <ian.g.kelly@gmail.com> - 2015-02-25 22:51 -0700
Re: Bug in timsort!? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-25 16:50 +0000
Re: Bug in timsort!? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-25 16:55 +0000
Re: Bug in timsort!? Chris Kaynor <ckaynor@zindagigames.com> - 2015-02-25 08:56 -0800
Re: Bug in timsort!? Chris Angelico <rosuav@gmail.com> - 2015-02-26 04:18 +1100
Re: Bug in timsort!? Peter Otten <__peter__@web.de> - 2015-02-25 18:21 +0100
Re: Bug in timsort!? Ian Kelly <ian.g.kelly@gmail.com> - 2015-02-25 12:42 -0700
Re: Bug in timsort!? Serhiy Storchaka <storchaka@gmail.com> - 2015-02-26 09:33 +0200
Page 1 of 3 [1] 2 3 Next page →
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2015-02-24 13:34 -0800 |
| Subject | Bug in timsort!? |
| Message-ID | <1cf84559-3a63-4799-a879-ae8e513d387e@googlegroups.com> |
http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
[toc] | [next] | [standalone]
| From | Zachary Ware <zachary.ware+pylist@gmail.com> |
|---|---|
| Date | 2015-02-24 15:40 -0600 |
| Message-ID | <mailman.19149.1424814070.18130.python-list@python.org> |
| In reply to | #86345 |
On Tue, Feb 24, 2015 at 3:34 PM, Roy Smith <roy@panix.com> wrote: > http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/ http://bugs.python.org/issue23515 Note that the article does mention that Python is not vulnerable due to this bug (and best I can tell has no behavioral issues); no computer currently in existence can create a large enough array to cause a problem. -- Zach
[toc] | [prev] | [next] | [standalone]
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2015-02-24 21:49 +0000 |
| Message-ID | <mailman.19150.1424814603.18130.python-list@python.org> |
| In reply to | #86345 |
On 24/02/2015 21:34, Roy Smith wrote: > http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/ > As you can clearly no longer rely on Python it looks like, after a long and loving relationship, I'm forced into moving on. PHP, Applescript, Javascript or back to C or even CORAL 66, what is your (plural) recommendation(s)? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
[toc] | [prev] | [next] | [standalone]
| From | sohcahtoa82@gmail.com |
|---|---|
| Date | 2015-02-24 14:36 -0800 |
| Message-ID | <d262b9be-957d-4d8c-a751-f33b3d30d2e2@googlegroups.com> |
| In reply to | #86347 |
On Tuesday, February 24, 2015 at 1:50:15 PM UTC-8, Mark Lawrence wrote: > On 24/02/2015 21:34, Roy Smith wrote: > > http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/ > > > > As you can clearly no longer rely on Python it looks like, after a long > and loving relationship, I'm forced into moving on. PHP, Applescript, > Javascript or back to C or even CORAL 66, what is your (plural) > recommendation(s)? > > -- > My fellow Pythonistas, ask not what our language can do for you, ask > what you can do for our language. > > Mark Lawrence Personally, I'm going with BASIC being interpreted by a JavaScript script being fed from a PHP script running on an out-dated version of Apache. It includes an in-browser IDE that will be written as an ActiveX control and will require Internet Explorer 6. An Adobe Flash version is in the works to allow it to run on modern browsers.
[toc] | [prev] | [next] | [standalone]
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2015-02-25 00:38 +0000 |
| Message-ID | <mailman.19161.1424824778.18130.python-list@python.org> |
| In reply to | #86349 |
On 24/02/2015 22:36, sohcahtoa82@gmail.com wrote: > On Tuesday, February 24, 2015 at 1:50:15 PM UTC-8, Mark Lawrence wrote: >> On 24/02/2015 21:34, Roy Smith wrote: >>> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/ >>> >> >> As you can clearly no longer rely on Python it looks like, after a long >> and loving relationship, I'm forced into moving on. PHP, Applescript, >> Javascript or back to C or even CORAL 66, what is your (plural) >> recommendation(s)? >> >> -- >> My fellow Pythonistas, ask not what our language can do for you, ask >> what you can do for our language. >> >> Mark Lawrence > > Personally, I'm going with BASIC being interpreted by a JavaScript script being fed from a PHP script running on an out-dated version of Apache. > > It includes an in-browser IDE that will be written as an ActiveX control and will require Internet Explorer 6. An Adobe Flash version is in the works to allow it to run on modern browsers. > Thanks for your help, much appreciated :) -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
[toc] | [prev] | [next] | [standalone]
| From | Grant Edwards <invalid@invalid.invalid> |
|---|---|
| Date | 2015-02-24 22:45 +0000 |
| Message-ID | <mciuv6$r24$1@reader1.panix.com> |
| In reply to | #86345 |
On 2015-02-24, Roy Smith <roy@panix.com> wrote:
> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
I don't get it.
3.2 Corrected Python merge_collapse function
merge_collapse(MergeState *ms)
{
struct s_slice *p = ms->pending;
assert(ms);
while (ms->n > 1) {
Py_ssize_t n = ms->n - 2;
if ( n > 0 && p[n-1].len <= p[n].len + p[n+1].len
|| (n-1 > 0 && p[n-2].len <= p[n].len + p[n-1].len)) {
if (p[n-1].len < p[n+1].len)
--n;
if (merge_at(ms, n) < 0)
return -1;
}
else if (p[n].len <= p[n+1].len) {
if (merge_at(ms, n) < 0)
return -1;
}
else
break;
}
return 0;
}
Or does "Python function" mean something else in this context?
--
Grant Edwards grant.b.edwards Yow! I request a weekend in
at Havana with Phil Silvers!
gmail.com
[toc] | [prev] | [next] | [standalone]
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2015-02-24 15:59 -0700 |
| Message-ID | <mailman.19151.1424819247.18130.python-list@python.org> |
| In reply to | #86350 |
On Tue, Feb 24, 2015 at 3:45 PM, Grant Edwards <invalid@invalid.invalid> wrote:
> On 2015-02-24, Roy Smith <roy@panix.com> wrote:
>
>> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
>
> I don't get it.
>
> 3.2 Corrected Python merge_collapse function
>
> merge_collapse(MergeState *ms)
> {
> struct s_slice *p = ms->pending;
>
> assert(ms);
> while (ms->n > 1) {
> Py_ssize_t n = ms->n - 2;
> if ( n > 0 && p[n-1].len <= p[n].len + p[n+1].len
> || (n-1 > 0 && p[n-2].len <= p[n].len + p[n-1].len)) {
> if (p[n-1].len < p[n+1].len)
> --n;
> if (merge_at(ms, n) < 0)
> return -1;
> }
> else if (p[n].len <= p[n+1].len) {
> if (merge_at(ms, n) < 0)
> return -1;
> }
> else
> break;
> }
> return 0;
> }
>
> Or does "Python function" mean something else in this context?
Recall that CPython is implemented in C. ;-)
[toc] | [prev] | [next] | [standalone]
| From | Robert Kern <robert.kern@gmail.com> |
|---|---|
| Date | 2015-02-25 11:34 +0000 |
| Message-ID | <mailman.19179.1424864077.18130.python-list@python.org> |
| In reply to | #86350 |
On 2015-02-24 22:45, Grant Edwards wrote:
> On 2015-02-24, Roy Smith <roy@panix.com> wrote:
>
>> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
>
> I don't get it.
>
> 3.2 Corrected Python merge_collapse function
>
> merge_collapse(MergeState *ms)
> {
> struct s_slice *p = ms->pending;
>
> assert(ms);
> while (ms->n > 1) {
> Py_ssize_t n = ms->n - 2;
> if ( n > 0 && p[n-1].len <= p[n].len + p[n+1].len
> || (n-1 > 0 && p[n-2].len <= p[n].len + p[n-1].len)) {
> if (p[n-1].len < p[n+1].len)
> --n;
> if (merge_at(ms, n) < 0)
> return -1;
> }
> else if (p[n].len <= p[n+1].len) {
> if (merge_at(ms, n) < 0)
> return -1;
> }
> else
> break;
> }
> return 0;
> }
>
> Or does "Python function" mean something else in this context?
"Corrected merge_collapse function [from the Python implementation of TimSort]"
as opposed to the Java implementation which was also discussed.
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
[toc] | [prev] | [next] | [standalone]
| From | Grant Edwards <invalid@invalid.invalid> |
|---|---|
| Date | 2015-02-25 15:49 +0000 |
| Message-ID | <mckqto$mig$1@reader1.panix.com> |
| In reply to | #86395 |
On 2015-02-25, Robert Kern <robert.kern@gmail.com> wrote:
> On 2015-02-24 22:45, Grant Edwards wrote:
>> On 2015-02-24, Roy Smith <roy@panix.com> wrote:
>>
>>> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
>>
>> I don't get it.
>>
>> 3.2 Corrected Python merge_collapse function
[C code elided]
>> Or does "Python function" mean something else in this context?
>
> "Corrected merge_collapse function [from the Python implementation of TimSort]"
> as opposed to the Java implementation which was also discussed.
Yes, I get it now. But, when I read "Python function" or "Python
implementation of <foo>" that _to_me_ refers to someting written in
_Python_.
If you're talking about _CPython_ code, then you say "CPython
function" or "CPython implementation of <foo>".
--
Grant Edwards grant.b.edwards Yow! I hope the
at ``Eurythmics'' practice
gmail.com birth control ...
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-02-25 10:33 +1100 |
| Message-ID | <mailman.19156.1424820786.18130.python-list@python.org> |
| In reply to | #86345 |
On Wed, Feb 25, 2015 at 8:34 AM, Roy Smith <roy@panix.com> wrote: > http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/ The post links to: http://svn.python.org/projects/python/trunk/Objects/listobject.c Is that still valid and current? It says svn, but does it pop through to the Mercurial repo where the _real_ CPython code is stored? ChrisA
[toc] | [prev] | [next] | [standalone]
| From | MRAB <python@mrabarnett.plus.com> |
|---|---|
| Date | 2015-02-24 23:38 +0000 |
| Message-ID | <mailman.19157.1424821123.18130.python-list@python.org> |
| In reply to | #86345 |
On 2015-02-24 21:40, Zachary Ware wrote: > On Tue, Feb 24, 2015 at 3:34 PM, Roy Smith <roy@panix.com> wrote: >> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/ > > http://bugs.python.org/issue23515 > > Note that the article does mention that Python is not vulnerable due > to this bug (and best I can tell has no behavioral issues); no > computer currently in existence can create a large enough array to > cause a problem. > I think the key word here is "currently".
[toc] | [prev] | [next] | [standalone]
| From | Skip Montanaro <skip.montanaro@gmail.com> |
|---|---|
| Date | 2015-02-24 17:50 -0600 |
| Message-ID | <mailman.19158.1424821845.18130.python-list@python.org> |
| In reply to | #86345 |
On Tue, Feb 24, 2015 at 5:38 PM, MRAB <python@mrabarnett.plus.com> wrote: > I think the key word here is "currently". Sure, but it's not like you have to put out security updates tomorrow. Even if/when we get to the point where machines can hold an array of 2**49 elements, I suspect people won't be using straight Python to wrangle them. They will likely use other languages, and if using Python, will be using something to accelerate things (PyPy, numpy, etc). Does, for example, numpy's sort() function suffer from the same flaw? Does PyPy use Timsort? Skip
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-02-25 11:07 +1100 |
| Message-ID | <mailman.19159.1424822880.18130.python-list@python.org> |
| In reply to | #86345 |
On Wed, Feb 25, 2015 at 10:50 AM, Skip Montanaro <skip.montanaro@gmail.com> wrote: > Even if/when we get to the point where machines can hold an array of > 2**49 elements, I suspect people won't be using straight Python to > wrangle them. Looking just at CPython, what is the absolute minimum storage space for a massive list like that? My guess is that a 64-bit CPython might get as good as eight bytes per element; playing around with smaller figures (up to circa a million elements) shows it ranging from 8 to 9 bytes per element, bottoming out at just a smidge over 8, presumably at the moments when there's no slack space (there's a fixed overhead, which gets diminishingly significant). So an array of 2**49 elements would take 2**52 bytes (4 petabytes) of storage - addressable, to be sure, but an awful lot to be sorting. Would it be sufficient to stick a comment into the source saying "This may have problems with lists in excess of 2**49 elements", and leave it until that's actually likely to happen? ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Paddy <paddy3118@gmail.com> |
|---|---|
| Date | 2015-02-25 01:10 -0800 |
| Message-ID | <a61e6bac-fa4b-4f97-bb29-ee665578d1f1@googlegroups.com> |
| In reply to | #86363 |
On Wednesday, 25 February 2015 00:08:32 UTC, Chris Angelico wrote: > On Wed, Feb 25, 2015 at 10:50 AM, Skip Montanaro > <skip.mon--@gmail.com> wrote: > > Even if/when we get to the point where machines can hold an array of > > 2**49 elements, I suspect people won't be using straight Python to > > wrangle them. > <<snip>> > > Would it be sufficient to stick a comment into the source saying "This > may have problems with lists in excess of 2**49 elements", and leave > it until that's actually likely to happen? > > ChrisA If we are given a proven fix with little downside then if we are to touch the source then we should MAKE THE FIX. To do otherwise would send the wrong signal to those that depend on the perceived integrity of the C-Python developers.
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-02-25 21:11 +1100 |
| Message-ID | <mailman.19178.1424859125.18130.python-list@python.org> |
| In reply to | #86389 |
On Wed, Feb 25, 2015 at 8:10 PM, Paddy <paddy3118@gmail.com> wrote: > If we are given a proven fix *with little downside* then if we are to touch the source then we should MAKE THE FIX. (emphasis mine) That's really the question, though. I'm in no position to evaluate the patch and ascertain whether the fix does indeed have little or no downside. But as the tracker issue proves, a greater mind than mine has decided that it's to be applied, which means that, yes, it's worth doing. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Chris Kaynor <ckaynor@zindagigames.com> |
|---|---|
| Date | 2015-02-24 16:38 -0800 |
| Message-ID | <mailman.19160.1424824756.18130.python-list@python.org> |
| In reply to | #86345 |
[Multipart message — attachments visible in raw view] — view raw
On Tue, Feb 24, 2015 at 4:07 PM, Chris Angelico <rosuav@gmail.com> wrote: > On Wed, Feb 25, 2015 at 10:50 AM, Skip Montanaro > <skip.montanaro@gmail.com> wrote: > > Even if/when we get to the point where machines can hold an array of > > 2**49 elements, I suspect people won't be using straight Python to > > wrangle them. > > Looking just at CPython, what is the absolute minimum storage space > for a massive list like that? My guess is that a 64-bit CPython might > get as good as eight bytes per element; playing around with smaller > figures (up to circa a million elements) shows it ranging from 8 to 9 > bytes per element, bottoming out at just a smidge over 8, presumably > at the moments when there's no slack space (there's a fixed overhead, > which gets diminishingly significant). So an array of 2**49 elements > would take 2**52 bytes (4 petabytes) of storage - addressable, to be > sure, but an awful lot to be sorting. > > Would it be sufficient to stick a comment into the source saying "This > may have problems with lists in excess of 2**49 elements", and leave > it until that's actually likely to happen? > I would agree that it will be quite a while as it stands before the bug appears, so documenting it MAY be appropriate, as it is likely to be far enough in the future that a number of unforeseen things may block the bug from being an issue (TimSort may be replaced by something better, CPython could die, etc). That said, from what I understand, their proposed fix would be reasonably easy to use, and have minimal cost, so it may be worthwhile to use just because the issue is likely to be in use for quite a while before somebody remembers the issue exists. Once somebody does hit it, it may take a while to realize that TimSort is at fault, as it will only occur on some data sets (my understanding is that the elements must be in certain configurations with at-least the minimum number), so it will likely appear as just a random crash occurring. If it were fully up to me, I would do one of the following, in order of preference: 1) Implement and test the fix provided on the blog. This requires the most work due to licensing, reviewing, and testing. 2) Add a noisy warning/error to the code when sorting lists large enough to potentially hit the error (probably 2**48 to leave some wiggle room). This is much easier to do, but merely pushes the can down the road to be dealt with later. It does still make it obvious what is going long when it finally breaks. 3) Add a comment explaining the issue (likely linking back to the blog). While by far the easiest "solution", again, this merely pushes the can down the road, and it may be very non-obvious what went wrong when it breaks, despite the comment. Chris
[toc] | [prev] | [next] | [standalone]
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2015-02-24 21:52 -0500 |
| Message-ID | <mailman.19165.1424832769.18130.python-list@python.org> |
| In reply to | #86345 |
On 2/24/2015 4:34 PM, Roy Smith wrote: > http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/ Tim Peters is aware of this and opened http://bugs.python.org/issue23515 -- Terry Jan Reedy
[toc] | [prev] | [next] | [standalone]
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2015-02-24 21:41 -0500 |
| Message-ID | <mailman.19164.1424832117.18130.python-list@python.org> |
| In reply to | #86345 |
On 02/24/2015 07:07 PM, Chris Angelico wrote: > On Wed, Feb 25, 2015 at 10:50 AM, Skip Montanaro > <skip.montanaro@gmail.com> wrote: >> Even if/when we get to the point where machines can hold an array of >> 2**49 elements, I suspect people won't be using straight Python to >> wrangle them. > > Looking just at CPython, what is the absolute minimum storage space > for a massive list like that? My guess is that a 64-bit CPython might > get as good as eight bytes per element; playing around with smaller > figures (up to circa a million elements) shows it ranging from 8 to 9 > bytes per element, bottoming out at just a smidge over 8, presumably > at the moments when there's no slack space (there's a fixed overhead, > which gets diminishingly significant). So an array of 2**49 elements > would take 2**52 bytes (4 petabytes) of storage - addressable, to be > sure, but an awful lot to be sorting. > > Would it be sufficient to stick a comment into the source saying "This > may have problems with lists in excess of 2**49 elements", and leave > it until that's actually likely to happen? > > ChrisA > One could do much better than that. Put in a test for the length of the list, and if it's too large for the current implementation, throw an exception. Much better than a comment. -- DaveA
[toc] | [prev] | [next] | [standalone]
| From | Ned Deily <nad@acm.org> |
|---|---|
| Date | 2015-02-25 01:04 -0800 |
| Message-ID | <mailman.19176.1424855099.18130.python-list@python.org> |
| In reply to | #86345 |
In article <CAPTjJmpQw-23mzgYHFjUC+y-C=vVm1dKePSpjG9+J3H8tHO7dA@mail.gmail.com>, Chris Angelico <rosuav@gmail.com> wrote: > The post links to: > > http://svn.python.org/projects/python/trunk/Objects/listobject.c > > Is that still valid and current? It says svn, but does it pop through > to the Mercurial repo where the _real_ CPython code is stored? In general, no. The python svn repo linked to above is frozen in time: for most branches, as of the conversion to hg in 2011. There was at least one discussion thread soon thereafter about what to do about it: http://comments.gmane.org/gmane.comp.python.devel/125252 It is still true that at least one repo in the svn.python.org is actively maintained (e.g. "external"). I *think* all the others are now obsolete. http://svn.python.org/view/ -- Ned Deily, nad@acm.org
[toc] | [prev] | [next] | [standalone]
| From | Sturla Molden <sturla.molden@gmail.com> |
|---|---|
| Date | 2015-02-25 14:58 +0100 |
| Message-ID | <mailman.19183.1424872728.18130.python-list@python.org> |
| In reply to | #86345 |
On 24/02/15 22:34, Roy Smith wrote: > http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/ > This is awful. It is broken for arrays longer than 2**49 elements. With 8 bytes per PyObject* pointer this is >4096 terabytes of RAM. I don't see how we can fix this in time. Oh yes, and they mention that TimSort is used on billions of devices due to Android mobile phones. This is clearly very relevant for mobile phones. Next thing you know your litte Samsung Galaxy with more than 4096 terabytes breaks down from a stack overflow in TimSort. Sturla
[toc] | [prev] | [next] | [standalone]
Page 1 of 3 [1] 2 3 Next page →
Back to top | Article view | comp.lang.python
csiph-web