Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #86345 > unrolled thread

Bug in timsort!?

Started byRoy Smith <roy@panix.com>
First post2015-02-24 13:34 -0800
Last post2015-02-26 09:33 +0200
Articles 20 on this page of 44 — 22 participants

Back to article view | Back to comp.lang.python


Contents

  Bug in timsort!? Roy Smith <roy@panix.com> - 2015-02-24 13:34 -0800
    Re: Bug in timsort!? Zachary Ware <zachary.ware+pylist@gmail.com> - 2015-02-24 15:40 -0600
    Re: Bug in timsort!? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-24 21:49 +0000
      Re: Bug in timsort!? sohcahtoa82@gmail.com - 2015-02-24 14:36 -0800
        Re: Bug in timsort!? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-25 00:38 +0000
    Re: Bug in timsort!? Grant Edwards <invalid@invalid.invalid> - 2015-02-24 22:45 +0000
      Re: Bug in timsort!? Ian Kelly <ian.g.kelly@gmail.com> - 2015-02-24 15:59 -0700
      Re: Bug in timsort!? Robert Kern <robert.kern@gmail.com> - 2015-02-25 11:34 +0000
        Re: Bug in timsort!? Grant Edwards <invalid@invalid.invalid> - 2015-02-25 15:49 +0000
    Re: Bug in timsort!? Chris Angelico <rosuav@gmail.com> - 2015-02-25 10:33 +1100
    Re: Bug in timsort!? MRAB <python@mrabarnett.plus.com> - 2015-02-24 23:38 +0000
    Re: Bug in timsort!? Skip Montanaro <skip.montanaro@gmail.com> - 2015-02-24 17:50 -0600
    Re: Bug in timsort!? Chris Angelico <rosuav@gmail.com> - 2015-02-25 11:07 +1100
      Re: Bug in timsort!? Paddy <paddy3118@gmail.com> - 2015-02-25 01:10 -0800
        Re: Bug in timsort!? Chris Angelico <rosuav@gmail.com> - 2015-02-25 21:11 +1100
    Re: Bug in timsort!? Chris Kaynor <ckaynor@zindagigames.com> - 2015-02-24 16:38 -0800
    Re: Bug in timsort!? Terry Reedy <tjreedy@udel.edu> - 2015-02-24 21:52 -0500
    Re: Bug in timsort!? Dave Angel <davea@davea.name> - 2015-02-24 21:41 -0500
    Re: Bug in timsort!? Ned Deily <nad@acm.org> - 2015-02-25 01:04 -0800
    Re: Bug in timsort!? Sturla Molden <sturla.molden@gmail.com> - 2015-02-25 14:58 +0100
      Re: Bug in timsort!? alister <alister.nospam.ware@ntlworld.com> - 2015-02-25 15:03 +0000
        Re: Bug in timsort!? Zachary Ware <zachary.ware+pylist@gmail.com> - 2015-02-25 09:23 -0600
        Re: Bug in timsort!? Terry Reedy <tjreedy@udel.edu> - 2015-02-25 16:08 -0500
      Re: Bug in timsort!? Roy Smith <roy@panix.com> - 2015-02-25 19:12 -0500
    Re: Bug in timsort!? Jonas Wielicki <jonas@wielicki.name> - 2015-02-25 15:07 +0100
    Re: Bug in timsort!? Chris Angelico <rosuav@gmail.com> - 2015-02-26 01:33 +1100
    Re: Bug in timsort!? Sturla Molden <sturla.molden@gmail.com> - 2015-02-25 16:05 +0100
    Re: Bug in timsort!? Chris Angelico <rosuav@gmail.com> - 2015-02-26 02:11 +1100
    Re: Bug in timsort!? Peter Otten <__peter__@web.de> - 2015-02-25 17:04 +0100
      Re: Bug in timsort!? Mario Figueiredo <marfig@gmail.com> - 2015-02-25 18:22 +0100
        Re: Bug in timsort!? Sturla Molden <sturla.molden@gmail.com> - 2015-02-25 19:41 +0100
        Re: Bug in timsort!? Jonas Wielicki <jonas@wielicki.name> - 2015-02-25 19:46 +0100
        Re: Bug in timsort!? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-25 20:07 +0000
    Re: Bug in timsort!? Sturla Molden <sturla.molden@gmail.com> - 2015-02-25 17:44 +0100
      Re: Bug in timsort!? Mario Figueiredo <marfig@gmail.com> - 2015-02-25 18:29 +0100
      Re: Bug in timsort!? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-02-26 10:13 +1100
        Re: Bug in timsort!? Ian Kelly <ian.g.kelly@gmail.com> - 2015-02-25 22:51 -0700
    Re: Bug in timsort!? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-25 16:50 +0000
    Re: Bug in timsort!? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-02-25 16:55 +0000
    Re: Bug in timsort!? Chris Kaynor <ckaynor@zindagigames.com> - 2015-02-25 08:56 -0800
    Re: Bug in timsort!? Chris Angelico <rosuav@gmail.com> - 2015-02-26 04:18 +1100
    Re: Bug in timsort!? Peter Otten <__peter__@web.de> - 2015-02-25 18:21 +0100
    Re: Bug in timsort!? Ian Kelly <ian.g.kelly@gmail.com> - 2015-02-25 12:42 -0700
    Re: Bug in timsort!? Serhiy Storchaka <storchaka@gmail.com> - 2015-02-26 09:33 +0200

Page 1 of 3  [1] 2 3  Next page →


#86345 — Bug in timsort!?

FromRoy Smith <roy@panix.com>
Date2015-02-24 13:34 -0800
SubjectBug in timsort!?
Message-ID<1cf84559-3a63-4799-a879-ae8e513d387e@googlegroups.com>
http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/

[toc] | [next] | [standalone]


#86346

FromZachary Ware <zachary.ware+pylist@gmail.com>
Date2015-02-24 15:40 -0600
Message-ID<mailman.19149.1424814070.18130.python-list@python.org>
In reply to#86345
On Tue, Feb 24, 2015 at 3:34 PM, Roy Smith <roy@panix.com> wrote:
> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/

http://bugs.python.org/issue23515

Note that the article does mention that Python is not vulnerable due
to this bug (and best I can tell has no behavioral issues); no
computer currently in existence can create a large enough array to
cause a problem.

-- 
Zach

[toc] | [prev] | [next] | [standalone]


#86347

FromMark Lawrence <breamoreboy@yahoo.co.uk>
Date2015-02-24 21:49 +0000
Message-ID<mailman.19150.1424814603.18130.python-list@python.org>
In reply to#86345
On 24/02/2015 21:34, Roy Smith wrote:
> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
>

As you can clearly no longer rely on Python it looks like, after a long 
and loving relationship, I'm forced into moving on.  PHP, Applescript, 
Javascript or back to C or even CORAL 66, what is your (plural) 
recommendation(s)?

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

[toc] | [prev] | [next] | [standalone]


#86349

Fromsohcahtoa82@gmail.com
Date2015-02-24 14:36 -0800
Message-ID<d262b9be-957d-4d8c-a751-f33b3d30d2e2@googlegroups.com>
In reply to#86347
On Tuesday, February 24, 2015 at 1:50:15 PM UTC-8, Mark Lawrence wrote:
> On 24/02/2015 21:34, Roy Smith wrote:
> > http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
> >
> 
> As you can clearly no longer rely on Python it looks like, after a long 
> and loving relationship, I'm forced into moving on.  PHP, Applescript, 
> Javascript or back to C or even CORAL 66, what is your (plural) 
> recommendation(s)?
> 
> -- 
> My fellow Pythonistas, ask not what our language can do for you, ask
> what you can do for our language.
> 
> Mark Lawrence

Personally, I'm going with BASIC being interpreted by a JavaScript script being fed from a PHP script running on an out-dated version of Apache.

It includes an in-browser IDE that will be written as an ActiveX control and will require Internet Explorer 6.  An Adobe Flash version is in the works to allow it to run on modern browsers.

[toc] | [prev] | [next] | [standalone]


#86365

FromMark Lawrence <breamoreboy@yahoo.co.uk>
Date2015-02-25 00:38 +0000
Message-ID<mailman.19161.1424824778.18130.python-list@python.org>
In reply to#86349
On 24/02/2015 22:36, sohcahtoa82@gmail.com wrote:
> On Tuesday, February 24, 2015 at 1:50:15 PM UTC-8, Mark Lawrence wrote:
>> On 24/02/2015 21:34, Roy Smith wrote:
>>> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
>>>
>>
>> As you can clearly no longer rely on Python it looks like, after a long
>> and loving relationship, I'm forced into moving on.  PHP, Applescript,
>> Javascript or back to C or even CORAL 66, what is your (plural)
>> recommendation(s)?
>>
>> --
>> My fellow Pythonistas, ask not what our language can do for you, ask
>> what you can do for our language.
>>
>> Mark Lawrence
>
> Personally, I'm going with BASIC being interpreted by a JavaScript script being fed from a PHP script running on an out-dated version of Apache.
>
> It includes an in-browser IDE that will be written as an ActiveX control and will require Internet Explorer 6.  An Adobe Flash version is in the works to allow it to run on modern browsers.
>

Thanks for your help, much appreciated :)

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

[toc] | [prev] | [next] | [standalone]


#86350

FromGrant Edwards <invalid@invalid.invalid>
Date2015-02-24 22:45 +0000
Message-ID<mciuv6$r24$1@reader1.panix.com>
In reply to#86345
On 2015-02-24, Roy Smith <roy@panix.com> wrote:

> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/

I don't get it. 

    3.2 Corrected Python merge_collapse function
    
    merge_collapse(MergeState *ms)
    {
        struct s_slice *p = ms->pending;
    
        assert(ms);
        while (ms->n > 1) {
            Py_ssize_t n = ms->n - 2;
            if (     n > 0   && p[n-1].len <= p[n].len + p[n+1].len
                || (n-1 > 0 &&  p[n-2].len <= p[n].len + p[n-1].len)) {
                if (p[n-1].len < p[n+1].len)
                    --n;
                if (merge_at(ms, n) < 0)
                    return -1;
            }
            else if (p[n].len <= p[n+1].len) {
                     if (merge_at(ms, n) < 0)
                            return -1;
            }
            else
                break;
        }
        return 0;
    }    

Or does "Python function" mean something else in this context?



    
-- 
Grant Edwards               grant.b.edwards        Yow! I request a weekend in
                                  at               Havana with Phil Silvers!
                              gmail.com            

[toc] | [prev] | [next] | [standalone]


#86352

FromIan Kelly <ian.g.kelly@gmail.com>
Date2015-02-24 15:59 -0700
Message-ID<mailman.19151.1424819247.18130.python-list@python.org>
In reply to#86350
On Tue, Feb 24, 2015 at 3:45 PM, Grant Edwards <invalid@invalid.invalid> wrote:
> On 2015-02-24, Roy Smith <roy@panix.com> wrote:
>
>> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
>
> I don't get it.
>
>     3.2 Corrected Python merge_collapse function
>
>     merge_collapse(MergeState *ms)
>     {
>         struct s_slice *p = ms->pending;
>
>         assert(ms);
>         while (ms->n > 1) {
>             Py_ssize_t n = ms->n - 2;
>             if (     n > 0   && p[n-1].len <= p[n].len + p[n+1].len
>                 || (n-1 > 0 &&  p[n-2].len <= p[n].len + p[n-1].len)) {
>                 if (p[n-1].len < p[n+1].len)
>                     --n;
>                 if (merge_at(ms, n) < 0)
>                     return -1;
>             }
>             else if (p[n].len <= p[n+1].len) {
>                      if (merge_at(ms, n) < 0)
>                             return -1;
>             }
>             else
>                 break;
>         }
>         return 0;
>     }
>
> Or does "Python function" mean something else in this context?

Recall that CPython is implemented in C. ;-)

[toc] | [prev] | [next] | [standalone]


#86395

FromRobert Kern <robert.kern@gmail.com>
Date2015-02-25 11:34 +0000
Message-ID<mailman.19179.1424864077.18130.python-list@python.org>
In reply to#86350
On 2015-02-24 22:45, Grant Edwards wrote:
> On 2015-02-24, Roy Smith <roy@panix.com> wrote:
>
>> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
>
> I don't get it.
>
>      3.2 Corrected Python merge_collapse function
>
>      merge_collapse(MergeState *ms)
>      {
>          struct s_slice *p = ms->pending;
>
>          assert(ms);
>          while (ms->n > 1) {
>              Py_ssize_t n = ms->n - 2;
>              if (     n > 0   && p[n-1].len <= p[n].len + p[n+1].len
>                  || (n-1 > 0 &&  p[n-2].len <= p[n].len + p[n-1].len)) {
>                  if (p[n-1].len < p[n+1].len)
>                      --n;
>                  if (merge_at(ms, n) < 0)
>                      return -1;
>              }
>              else if (p[n].len <= p[n+1].len) {
>                       if (merge_at(ms, n) < 0)
>                              return -1;
>              }
>              else
>                  break;
>          }
>          return 0;
>      }
>
> Or does "Python function" mean something else in this context?

"Corrected merge_collapse function [from the Python implementation of TimSort]" 
as opposed to the Java implementation which was also discussed.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco

[toc] | [prev] | [next] | [standalone]


#86409

FromGrant Edwards <invalid@invalid.invalid>
Date2015-02-25 15:49 +0000
Message-ID<mckqto$mig$1@reader1.panix.com>
In reply to#86395
On 2015-02-25, Robert Kern <robert.kern@gmail.com> wrote:
> On 2015-02-24 22:45, Grant Edwards wrote:
>> On 2015-02-24, Roy Smith <roy@panix.com> wrote:
>>
>>> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
>>
>> I don't get it.
>>
>>      3.2 Corrected Python merge_collapse function

[C code elided]

>> Or does "Python function" mean something else in this context?
>
> "Corrected merge_collapse function [from the Python implementation of TimSort]" 
> as opposed to the Java implementation which was also discussed.

Yes, I get it now. But, when I read "Python function" or "Python
implementation of <foo>" that _to_me_ refers to someting written in
_Python_.

If you're talking about _CPython_ code, then you say "CPython
function" or "CPython implementation of <foo>".


-- 
Grant Edwards               grant.b.edwards        Yow! I hope the
                                  at               ``Eurythmics'' practice
                              gmail.com            birth control ...

[toc] | [prev] | [next] | [standalone]


#86359

FromChris Angelico <rosuav@gmail.com>
Date2015-02-25 10:33 +1100
Message-ID<mailman.19156.1424820786.18130.python-list@python.org>
In reply to#86345
On Wed, Feb 25, 2015 at 8:34 AM, Roy Smith <roy@panix.com> wrote:
> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/

The post links to:

http://svn.python.org/projects/python/trunk/Objects/listobject.c

Is that still valid and current? It says svn, but does it pop through
to the Mercurial repo where the _real_ CPython code is stored?

ChrisA

[toc] | [prev] | [next] | [standalone]


#86360

FromMRAB <python@mrabarnett.plus.com>
Date2015-02-24 23:38 +0000
Message-ID<mailman.19157.1424821123.18130.python-list@python.org>
In reply to#86345
On 2015-02-24 21:40, Zachary Ware wrote:
> On Tue, Feb 24, 2015 at 3:34 PM, Roy Smith <roy@panix.com> wrote:
>> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
>
> http://bugs.python.org/issue23515
>
> Note that the article does mention that Python is not vulnerable due
> to this bug (and best I can tell has no behavioral issues); no
> computer currently in existence can create a large enough array to
> cause a problem.
>
I think the key word here is "currently".

[toc] | [prev] | [next] | [standalone]


#86361

FromSkip Montanaro <skip.montanaro@gmail.com>
Date2015-02-24 17:50 -0600
Message-ID<mailman.19158.1424821845.18130.python-list@python.org>
In reply to#86345
On Tue, Feb 24, 2015 at 5:38 PM, MRAB <python@mrabarnett.plus.com> wrote:
> I think the key word here is "currently".

Sure, but it's not like you have to put out security updates tomorrow.
Even if/when we get to the point where machines can hold an array of
2**49 elements, I suspect people won't be using straight Python to
wrangle them. They will likely use other languages, and if using
Python, will be using something to accelerate things (PyPy, numpy,
etc). Does, for example, numpy's sort() function suffer from the same
flaw? Does PyPy use Timsort?

Skip

[toc] | [prev] | [next] | [standalone]


#86363

FromChris Angelico <rosuav@gmail.com>
Date2015-02-25 11:07 +1100
Message-ID<mailman.19159.1424822880.18130.python-list@python.org>
In reply to#86345
On Wed, Feb 25, 2015 at 10:50 AM, Skip Montanaro
<skip.montanaro@gmail.com> wrote:
> Even if/when we get to the point where machines can hold an array of
> 2**49 elements, I suspect people won't be using straight Python to
> wrangle them.

Looking just at CPython, what is the absolute minimum storage space
for a massive list like that? My guess is that a 64-bit CPython might
get as good as eight bytes per element; playing around with smaller
figures (up to circa a million elements) shows it ranging from 8 to 9
bytes per element, bottoming out at just a smidge over 8, presumably
at the moments when there's no slack space (there's a fixed overhead,
which gets diminishingly significant). So an array of 2**49 elements
would take 2**52 bytes (4 petabytes) of storage - addressable, to be
sure, but an awful lot to be sorting.

Would it be sufficient to stick a comment into the source saying "This
may have problems with lists in excess of 2**49 elements", and leave
it until that's actually likely to happen?

ChrisA

[toc] | [prev] | [next] | [standalone]


#86389

FromPaddy <paddy3118@gmail.com>
Date2015-02-25 01:10 -0800
Message-ID<a61e6bac-fa4b-4f97-bb29-ee665578d1f1@googlegroups.com>
In reply to#86363
On Wednesday, 25 February 2015 00:08:32 UTC, Chris Angelico  wrote:
> On Wed, Feb 25, 2015 at 10:50 AM, Skip Montanaro
> <skip.mon--@gmail.com> wrote:
> > Even if/when we get to the point where machines can hold an array of
> > 2**49 elements, I suspect people won't be using straight Python to
> > wrangle them.
> 
<<snip>>
> 
> Would it be sufficient to stick a comment into the source saying "This
> may have problems with lists in excess of 2**49 elements", and leave
> it until that's actually likely to happen?
> 
> ChrisA

If we are given a proven fix with little downside then if we are to touch the source then we should MAKE THE FIX. To do otherwise would send the wrong signal to those that depend on the perceived integrity of the C-Python developers.

[toc] | [prev] | [next] | [standalone]


#86391

FromChris Angelico <rosuav@gmail.com>
Date2015-02-25 21:11 +1100
Message-ID<mailman.19178.1424859125.18130.python-list@python.org>
In reply to#86389
On Wed, Feb 25, 2015 at 8:10 PM, Paddy <paddy3118@gmail.com> wrote:
> If we are given a proven fix *with little downside* then if we are to touch the source then we should MAKE THE FIX.

(emphasis mine)

That's really the question, though. I'm in no position to evaluate the
patch and ascertain whether the fix does indeed have little or no
downside. But as the tracker issue proves, a greater mind than mine
has decided that it's to be applied, which means that, yes, it's worth
doing.

ChrisA

[toc] | [prev] | [next] | [standalone]


#86364

FromChris Kaynor <ckaynor@zindagigames.com>
Date2015-02-24 16:38 -0800
Message-ID<mailman.19160.1424824756.18130.python-list@python.org>
In reply to#86345

[Multipart message — attachments visible in raw view] — view raw

On Tue, Feb 24, 2015 at 4:07 PM, Chris Angelico <rosuav@gmail.com> wrote:

> On Wed, Feb 25, 2015 at 10:50 AM, Skip Montanaro
> <skip.montanaro@gmail.com> wrote:
> > Even if/when we get to the point where machines can hold an array of
> > 2**49 elements, I suspect people won't be using straight Python to
> > wrangle them.
>
> Looking just at CPython, what is the absolute minimum storage space
> for a massive list like that? My guess is that a 64-bit CPython might
> get as good as eight bytes per element; playing around with smaller
> figures (up to circa a million elements) shows it ranging from 8 to 9
> bytes per element, bottoming out at just a smidge over 8, presumably
> at the moments when there's no slack space (there's a fixed overhead,
> which gets diminishingly significant). So an array of 2**49 elements
> would take 2**52 bytes (4 petabytes) of storage - addressable, to be
> sure, but an awful lot to be sorting.
>
> Would it be sufficient to stick a comment into the source saying "This
> may have problems with lists in excess of 2**49 elements", and leave
> it until that's actually likely to happen?
>

I would agree that it will be quite a while as it stands before the bug
appears, so documenting it MAY be appropriate, as it is likely to be far
enough in the future that a number of unforeseen things may block the bug
from being an issue (TimSort may be replaced by something better, CPython
could die, etc).

That said, from what I understand, their proposed fix would be reasonably
easy to use, and have minimal cost, so it may be worthwhile to use just
because the issue is likely to be in use for quite a while before somebody
remembers the issue exists. Once somebody does hit it, it may take a while
to realize that TimSort is at fault, as it will only occur on some data
sets (my understanding is that the elements must be in certain
configurations with at-least the minimum number), so it will likely appear
as just a random crash occurring.

If it were fully up to me, I would do one of the following, in order of
preference:
1) Implement and test the fix provided on the blog. This requires the most
work due to licensing, reviewing, and testing.
2) Add a noisy warning/error to the code when sorting lists large enough to
potentially hit the error (probably 2**48 to leave some wiggle room). This
is much easier to do, but merely pushes the can down the road to be dealt
with later. It does still make it obvious what is going long when it
finally breaks.
3) Add a comment explaining the issue (likely linking back to the blog).
While by far the easiest "solution", again, this merely pushes the can down
the road, and it may be very non-obvious what went wrong when it breaks,
despite the comment.

Chris

[toc] | [prev] | [next] | [standalone]


#86371

FromTerry Reedy <tjreedy@udel.edu>
Date2015-02-24 21:52 -0500
Message-ID<mailman.19165.1424832769.18130.python-list@python.org>
In reply to#86345
On 2/24/2015 4:34 PM, Roy Smith wrote:
> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/

Tim Peters is aware of this and opened http://bugs.python.org/issue23515


-- 
Terry Jan Reedy

[toc] | [prev] | [next] | [standalone]


#86372

FromDave Angel <davea@davea.name>
Date2015-02-24 21:41 -0500
Message-ID<mailman.19164.1424832117.18130.python-list@python.org>
In reply to#86345
On 02/24/2015 07:07 PM, Chris Angelico wrote:
> On Wed, Feb 25, 2015 at 10:50 AM, Skip Montanaro
> <skip.montanaro@gmail.com> wrote:
>> Even if/when we get to the point where machines can hold an array of
>> 2**49 elements, I suspect people won't be using straight Python to
>> wrangle them.
>
> Looking just at CPython, what is the absolute minimum storage space
> for a massive list like that? My guess is that a 64-bit CPython might
> get as good as eight bytes per element; playing around with smaller
> figures (up to circa a million elements) shows it ranging from 8 to 9
> bytes per element, bottoming out at just a smidge over 8, presumably
> at the moments when there's no slack space (there's a fixed overhead,
> which gets diminishingly significant). So an array of 2**49 elements
> would take 2**52 bytes (4 petabytes) of storage - addressable, to be
> sure, but an awful lot to be sorting.
>
> Would it be sufficient to stick a comment into the source saying "This
> may have problems with lists in excess of 2**49 elements", and leave
> it until that's actually likely to happen?
>
> ChrisA
>

One could do much better than that.  Put in a test for the length of the 
list, and if it's too large for the current implementation, throw an 
exception.  Much better than a comment.

-- 
DaveA

[toc] | [prev] | [next] | [standalone]


#86388

FromNed Deily <nad@acm.org>
Date2015-02-25 01:04 -0800
Message-ID<mailman.19176.1424855099.18130.python-list@python.org>
In reply to#86345
In article 
<CAPTjJmpQw-23mzgYHFjUC+y-C=vVm1dKePSpjG9+J3H8tHO7dA@mail.gmail.com>,
 Chris Angelico <rosuav@gmail.com> wrote:
> The post links to:
> 
> http://svn.python.org/projects/python/trunk/Objects/listobject.c
> 
> Is that still valid and current? It says svn, but does it pop through
> to the Mercurial repo where the _real_ CPython code is stored?

In general, no.  The python svn repo linked to above is frozen in time: 
for most branches, as of the conversion to hg in 2011.  There was at 
least one discussion thread soon thereafter about what to do about it:

http://comments.gmane.org/gmane.comp.python.devel/125252

It is still true that at least one repo in the svn.python.org is 
actively maintained (e.g. "external").  I *think* all the others are now 
obsolete.

http://svn.python.org/view/

-- 
 Ned Deily,
 nad@acm.org

[toc] | [prev] | [next] | [standalone]


#86398

FromSturla Molden <sturla.molden@gmail.com>
Date2015-02-25 14:58 +0100
Message-ID<mailman.19183.1424872728.18130.python-list@python.org>
In reply to#86345
On 24/02/15 22:34, Roy Smith wrote:
> http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
>

This is awful. It is broken for arrays longer than 2**49 elements. With 
8 bytes per PyObject* pointer this is >4096 terabytes of RAM. I don't 
see how we can fix this in time.

Oh yes, and they mention that TimSort is used on billions of devices due 
to Android mobile phones. This is clearly very relevant for mobile 
phones. Next thing you know your litte Samsung Galaxy with more than 
4096 terabytes breaks down from a stack overflow in TimSort.


Sturla

[toc] | [prev] | [next] | [standalone]


Page 1 of 3  [1] 2 3  Next page →

Back to top | Article view | comp.lang.python


csiph-web