Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #48355 > unrolled thread

Timsort in Cpython

Started byalphonse23@gmail.com
First post2013-06-15 12:44 -0700
Last post2013-06-19 12:20 -0600
Articles 15 — 8 participants

Back to article view | Back to comp.lang.python


Contents

  Timsort in Cpython alphonse23@gmail.com - 2013-06-15 12:44 -0700
    Re: Timsort in Cpython Zachary Ware <zachary.ware+pylist@gmail.com> - 2013-06-15 14:55 -0500
    Re: Timsort in Cpython Robert Kern <robert.kern@gmail.com> - 2013-06-15 21:00 +0100
    Re: Timsort in Cpython alphonse23@gmail.com - 2013-06-15 13:21 -0700
      Re: Timsort in Cpython Robert Kern <robert.kern@gmail.com> - 2013-06-15 21:55 +0100
        Re: Timsort in Cpython alphonse23@gmail.com - 2013-06-15 14:43 -0700
      Re: Timsort in Cpython Terry Reedy <tjreedy@udel.edu> - 2013-06-15 22:56 -0400
    Re: Timsort in Cpython alphonse23@gmail.com - 2013-06-15 21:05 -0700
      Re: Timsort in Cpython mm0fmf <none@mailinator.com> - 2013-06-16 14:08 +0100
        Re: Timsort in Cpython alphonse23@gmail.com - 2013-06-16 09:15 -0700
          Re: Timsort in Cpython Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-06-16 16:16 -0400
            Re: Timsort in Cpython sean.westfall@gmail.com - 2013-06-19 10:10 -0700
      Re: Timsort in Cpython Ian Kelly <ian.g.kelly@gmail.com> - 2013-06-16 14:33 -0600
        Re: Timsort in Cpython sean.westfall@gmail.com - 2013-06-19 10:18 -0700
          Re: Timsort in Cpython Ian Kelly <ian.g.kelly@gmail.com> - 2013-06-19 12:20 -0600

#48355 — Timsort in Cpython

Fromalphonse23@gmail.com
Date2013-06-15 12:44 -0700
SubjectTimsort in Cpython
Message-ID<9ca984d5-19d9-4e82-a305-2a2f5ee341fe@googlegroups.com>
I'm currently trying to make sense of Python's Timsort function. From the wikipedia page I was told the algorithm is located somewhere here: http://hg.python.org/cpython/file/default/Objects/listobject.c

So of all the functions in there, could somebody point to me which one is timsort?

Thanks, if anyone can help.
Alphonse23

[toc] | [next] | [standalone]


#48359

FromZachary Ware <zachary.ware+pylist@gmail.com>
Date2013-06-15 14:55 -0500
Message-ID<mailman.3402.1371326146.3114.python-list@python.org>
In reply to#48355
On Sat, Jun 15, 2013 at 2:44 PM,  <alphonse23@gmail.com> wrote:
> I'm currently trying to make sense of Python's Timsort function. From the wikipedia page I was told the algorithm is located somewhere here: http://hg.python.org/cpython/file/default/Objects/listobject.c
>
> So of all the functions in there, could somebody point to me which one is timsort?
>
> Thanks, if anyone can help.
> Alphonse23

Actually, it looks to me like it's several of them, but the main
function is here:
http://hg.python.org/cpython/file/default/Objects/listobject.c#l1902.

HTH,

Zach

[toc] | [prev] | [next] | [standalone]


#48360

FromRobert Kern <robert.kern@gmail.com>
Date2013-06-15 21:00 +0100
Message-ID<mailman.3403.1371326444.3114.python-list@python.org>
In reply to#48355
On 2013-06-15 20:44, alphonse23@gmail.com wrote:
> I'm currently trying to make sense of Python's Timsort function. From the wikipedia page I was told the algorithm is located somewhere here: http://hg.python.org/cpython/file/default/Objects/listobject.c
>
> So of all the functions in there, could somebody point to me which one is timsort?

listsort()

http://hg.python.org/cpython/file/default/Objects/listobject.c#l1896

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco

[toc] | [prev] | [next] | [standalone]


#48363

Fromalphonse23@gmail.com
Date2013-06-15 13:21 -0700
Message-ID<21a6a262-765b-4771-809e-56a9808d9cf9@googlegroups.com>
In reply to#48355
Hey guys,
Thanks for the quick reply! So why did they decide to call it listsort in the source instead? Why didn't they keep it as Timsort?

Well. I'm going to have a ton of fun trying to make sense of this.

[toc] | [prev] | [next] | [standalone]


#48369

FromRobert Kern <robert.kern@gmail.com>
Date2013-06-15 21:55 +0100
Message-ID<mailman.3408.1371329739.3114.python-list@python.org>
In reply to#48363
On 2013-06-15 21:21, alphonse23@gmail.com wrote:
> Hey guys,
> Thanks for the quick reply! So why did they decide to call it listsort in the source instead? Why didn't they keep it as Timsort?

This was the first implementation of the algorithm. The algorithm was only 
colloquially named "Timsort" after it was used in Python.

This the naming convention for the C implementation of builtin types' methods in 
the Python codebase. The C implementation listsort() corresponds with the Python 
method list.sort(). Similarly, listappend() is list.append(), listpop() is 
list.pop(), etc. C.f.

http://hg.python.org/cpython/file/default/Objects/listobject.c#l2362

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco

[toc] | [prev] | [next] | [standalone]


#48374

Fromalphonse23@gmail.com
Date2013-06-15 14:43 -0700
Message-ID<12db6fee-12ab-455d-970a-f0fd5144fcac@googlegroups.com>
In reply to#48369
Ahh, that makes perfect sense. Thanks for clearing that up.

On Saturday, June 15, 2013 1:55:29 PM UTC-7, Robert Kern wrote:
> On 2013-06-15 21:21, alphonse23@gmail.com wrote:
> 
> > Hey guys,
> 
> > Thanks for the quick reply! So why did they decide to call it listsort in the source instead? Why didn't they keep it as Timsort?
> 
> 
> 
> This was the first implementation of the algorithm. The algorithm was only 
> 
> colloquially named "Timsort" after it was used in Python.
> 
> 
> 
> This the naming convention for the C implementation of builtin types' methods in 
> 
> the Python codebase. The C implementation listsort() corresponds with the Python 
> 
> method list.sort(). Similarly, listappend() is list.append(), listpop() is 
> 
> list.pop(), etc. C.f.
> 
> 
> 
> http://hg.python.org/cpython/file/default/Objects/listobject.c#l2362
> 
> 
> 
> -- 
> 
> Robert Kern
> 
> 
> 
> "I have come to believe that the whole world is an enigma, a harmless enigma
> 
>   that is made terrible by our own mad attempt to interpret it as though it had
> 
>   an underlying truth."
> 
>    -- Umberto Eco

[toc] | [prev] | [next] | [standalone]


#48404

FromTerry Reedy <tjreedy@udel.edu>
Date2013-06-15 22:56 -0400
Message-ID<mailman.3424.1371351419.3114.python-list@python.org>
In reply to#48363
On 6/15/2013 4:21 PM, alphonse23@gmail.com wrote:

> Well. I'm going to have a ton of fun trying to make sense of this.

http://hg.python.org/cpython/file/default/Objects/listsort.txt
is pretty clear (to me) for most of the basics.


-- 
Terry Jan Reedy

[toc] | [prev] | [next] | [standalone]


#48407

Fromalphonse23@gmail.com
Date2013-06-15 21:05 -0700
Message-ID<b3df2c76-b191-4cbb-9859-874b534a17e5@googlegroups.com>
In reply to#48355
Yes I've read it. Very interesting read. There are other resources too online that make it very clear, for instance the wikipedia articles is pretty good.

Though, if anyone would be interested in helping me out further -- though by all means, I'm not lazy, I can figure it myself. But, I wanted to pass in variables into listsort and watch timsort work line by line in gdb. 

listsort(PyListObject *self, PyObject *args, PyObject *kwds)

I've never worked with Cpython source before, but it looks like PyObject is just some type of general strut.. I think anyway. How does python represent a list of ints in source? and what are the two second arguments for, assuming the first is the list strut.

On Saturday, June 15, 2013 12:44:01 PM UTC-7, alpho...@gmail.com wrote:
> I'm currently trying to make sense of Python's Timsort function. From the wikipedia page I was told the algorithm is located somewhere here: http://hg.python.org/cpython/file/default/Objects/listobject.c
> 
> 
> 
> So of all the functions in there, could somebody point to me which one is timsort?
> 
> 
> 
> Thanks, if anyone can help.
> 
> Alphonse23

[toc] | [prev] | [next] | [standalone]


#48448

Frommm0fmf <none@mailinator.com>
Date2013-06-16 14:08 +0100
Message-ID<tNivt.54379$Gm5.38341@fx11.am4>
In reply to#48407
Answer: The lost context.

Question: What makes top-posted replies harder to read than bottom-posted?



alphonse23@gmail.com wrote:
> Yes I've read it. Very interesting read. There are other resources too online that make it very clear, for instance the wikipedia articles is pretty good.
> 
> Though, if anyone would be interested in helping me out further -- though by all means, I'm not lazy, I can figure it myself. But, I wanted to pass in variables into listsort and watch timsort work line by line in gdb. 
> 
> listsort(PyListObject *self, PyObject *args, PyObject *kwds)
> 
> I've never worked with Cpython source before, but it looks like PyObject is just some type of general strut.. I think anyway. How does python represent a list of ints in source? and what are the two second arguments for, assuming the first is the list strut.
> 
> On Saturday, June 15, 2013 12:44:01 PM UTC-7, alpho...@gmail.com wrote:
>> I'm currently trying to make sense of Python's Timsort function. From the wikipedia page I was told the algorithm is located somewhere here: http://hg.python.org/cpython/file/default/Objects/listobject.c
>>
>>
>>
>> So of all the functions in there, could somebody point to me which one is timsort?
>>
>>
>>
>> Thanks, if anyone can help.
>>
>> Alphonse23

[toc] | [prev] | [next] | [standalone]


#48460

Fromalphonse23@gmail.com
Date2013-06-16 09:15 -0700
Message-ID<0aa2bd06-d1ae-446b-9c85-917f43af4db2@googlegroups.com>
In reply to#48448
sorry about that. I'm new to google groups. I'm trying to make sense of python's implementation of timsort through cpython: http://hg.python.org/cpython/file/default/Objects/listobject.c

I was replying to Terry Jan Reedy
>
> http://hg.python.org/cpython/file/default/Objects/listsort.txt
> 
> is pretty clear (to me) for most of the basics.


> Though, if anyone would be interested in helping me out further -- though by all means, I'm not lazy, I can figure it myself. But, I wanted to pass in variables into listsort and watch timsort work line by line in gdb. 
> 
> listsort(PyListObject *self, PyObject *args, PyObject *kwds) 
> 
> I've never worked with Cpython source before, but it looks like PyObject is just some type of general strut.. I think anyway. How does python represent a list of ints in source? and what are the two second arguments for, assuming the first is the list strut. 
> 



On Sunday, June 16, 2013 6:08:50 AM UTC-7, mm0fmf wrote:
> Answer: The lost context.
> 
> 
> 
> Question: What makes top-posted replies harder to read than bottom-posted?
> 
> 
> 
> 
> 
> 
> 
> alphonse23@gmail.com wrote:
> 
> > Yes I've read it. Very interesting read. There are other resources too online that make it very clear, for instance the wikipedia articles is pretty good.
> 
> > 
> 
> > Though, if anyone would be interested in helping me out further -- though by all means, I'm not lazy, I can figure it myself. But, I wanted to pass in 
variables into listsort and watch timsort work line by line in gdb. 
> 
> > 
> 
> > listsort(PyListObject *self, PyObject *args, PyObject *kwds)
> 
> > 
> 
> > I've never worked with Cpython source before, but it looks like PyObject is just some type of general strut.. I think anyway. How does python represent a list of ints in source? and what are the two second arguments for, assuming the first is the list strut.
> 
> > 
> 
> > On Saturday, June 15, 2013 12:44:01 PM UTC-7, alpho...@gmail.com wrote:
> 
> >> I'm currently trying to make sense of Python's Timsort function. From the wikipedia page I was told the algorithm is located somewhere here: http://hg.python.org/cpython/file/default/Objects/listobject.c
> 
> >>
> 
> >>
> 
> >>
> 
> >> So of all the functions in there, could somebody point to me which one is timsort?
> 
> >>
> 
> >>
> 
> >>
> 
> >> Thanks, if anyone can help.
> 
> >>
> 
> >> Alphonse23

[toc] | [prev] | [next] | [standalone]


#48478

FromDennis Lee Bieber <wlfraed@ix.netcom.com>
Date2013-06-16 16:16 -0400
Message-ID<mailman.3457.1371413775.3114.python-list@python.org>
In reply to#48460
On Sun, 16 Jun 2013 09:15:11 -0700 (PDT), alphonse23@gmail.com declaimed
the following:

>sorry about that. I'm new to google groups. I'm trying to make sense of python's implementation of timsort through cpython: http://hg.python.org/cpython/file/default/Objects/listobject.c
>
	Since you are new to GoogleGroups, if you can, run away from it as fast
as possible. While material may look okay on their system, it is
practically trashed when getting out into the real world. (Paragraphs
either come in as long lines with no wrapping [Usenet/Email convention is
for 80 character lines, and to allow for > quotes original text should wrap
around 72-75], or they end up double-spacing stuff that comes in following
the 80 character line length (GG is treating hard end-of-line as a
paragraph marker, and on quoting, adding a blank line between these
"paragraphs")

	Either subscribe to the mailing list (and use a real mail client rather
than web-mail) or use a news reader; if your ISP doesn't provide an NNTP
server carrying comp.lang.python, it is available from Gmane as
gmane.comp.python.general (the mailing list and comp.lang.python are cross
linked, and Gmane shows the mailing list as if it were a Usenet news group)

	And just another comment: preference on the group is "trim quoted
material and comment below the quote (or interspersed with the quotes)"...
The style created with M$ Outlook (Outlook goes out of its way to make it
impossible to trim/intersperse -- it considers quoted material as a
photocopy attached to the back of a new letter) in which one comments at
the top of the quoted material, and never trims to relevant material is
frowned upon.
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
    wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/

[toc] | [prev] | [next] | [standalone]


#48734

Fromsean.westfall@gmail.com
Date2013-06-19 10:10 -0700
Message-ID<9c108f77-2a83-4fb5-89f5-d895357dffc4@googlegroups.com>
In reply to#48478
On Sunday, June 16, 2013 1:16:02 PM UTC-7, Dennis Lee Bieber wrote:
> On Sun, 16 Jun 2013 09:15:11 -0700 (PDT), alphonse23@gmail.com declaimed
> 
> the following:
> 
> 
> 
> >sorry about that. I'm new to google groups. I'm trying to make sense of python's implementation of timsort through cpython: http://hg.python.org/cpython/file/default/Objects/listobject.c
> 
> >
> 
> 	Since you are new to GoogleGroups, if you can, run away from it as fast
> 
> as possible. While material may look okay on their system, it is
> 
> practically trashed when getting out into the real world. (Paragraphs
> 
> either come in as long lines with no wrapping [Usenet/Email convention is
> 
> for 80 character lines, and to allow for > quotes original text should wrap
> 
> around 72-75], or they end up double-spacing stuff that comes in following
> 
> the 80 character line length (GG is treating hard end-of-line as a
> 
> paragraph marker, and on quoting, adding a blank line between these
> 
> "paragraphs")
> 
> 
> 
> 	Either subscribe to the mailing list (and use a real mail client rather
> 
> than web-mail) or use a news reader; if your ISP doesn't provide an NNTP
> 
> server carrying comp.lang.python, it is available from Gmane as
> 
> gmane.comp.python.general (the mailing list and comp.lang.python are cross
> 
> linked, and Gmane shows the mailing list as if it were a Usenet news group)
> 
> 
> 
> 	And just another comment: preference on the group is "trim quoted
> 
> material and comment below the quote (or interspersed with the quotes)"...
> 
> The style created with M$ Outlook (Outlook goes out of its way to make it
> 
> impossible to trim/intersperse -- it considers quoted material as a
> 
> photocopy attached to the back of a new letter) in which one comments at
> 
> the top of the quoted material, and never trims to relevant material is
> 
> frowned upon.
> 

Understood. I just subscribed to the news group, though I'm using gmail. hopefully it will work better than google groups.

I'll keep all the advice here in mind whenever I post.

> -- 
> 
> 	Wulfraed                 Dennis Lee Bieber         AF6VN
> 
>     wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/

[toc] | [prev] | [next] | [standalone]


#48480

FromIan Kelly <ian.g.kelly@gmail.com>
Date2013-06-16 14:33 -0600
Message-ID<mailman.3459.1371415158.3114.python-list@python.org>
In reply to#48407
On Sat, Jun 15, 2013 at 10:05 PM,  <alphonse23@gmail.com> wrote:
> Yes I've read it. Very interesting read. There are other resources too online that make it very clear, for instance the wikipedia articles is pretty good.
>
> Though, if anyone would be interested in helping me out further -- though by all means, I'm not lazy, I can figure it myself. But, I wanted to pass in variables into listsort and watch timsort work line by line in gdb.
>
> listsort(PyListObject *self, PyObject *args, PyObject *kwds)
>
> I've never worked with Cpython source before, but it looks like PyObject is just some type of general strut.. I think anyway. How does python represent a list of ints in source? and what are the two second arguments for, assuming the first is the list strut.

A PyObject* generally references any Python object.  The subtype
PyListObject* more specifically references a Python list.  The above
signature corresponds to this Python function signature:

    def listsort(self, *args, **kwds):

The first argument self is the list object to be operated on.  The
second argument args is a Python tuple containing any other positional
arguments that were passed into the method.  The third argument kwds
is a Python dict containing any keyword arguments that were passed
into the method.

[toc] | [prev] | [next] | [standalone]


#48736

Fromsean.westfall@gmail.com
Date2013-06-19 10:18 -0700
Message-ID<87a79a3d-6f06-4aa3-a09f-48c695adeb4a@googlegroups.com>
In reply to#48480
On Sunday, June 16, 2013 1:33:17 PM UTC-7, Ian wrote:
> On Sat, Jun 15, 2013 at 10:05 PM,  <alphonse23@gmail.com> wrote:
> 
> > Yes I've read it. Very interesting read. There are other resources too online that make it very clear, for instance the wikipedia articles is pretty good.
> 
> >
> 
> > Though, if anyone would be interested in helping me out further -- though by all means, I'm not lazy, I can figure it myself. But, I wanted to pass in variables into listsort and watch timsort work line by line in gdb.
> 
> >
> 
> > listsort(PyListObject *self, PyObject *args, PyObject *kwds)
> 
> >
> 
> > I've never worked with Cpython source before, but it looks like PyObject is just some type of general strut.. I think anyway. How does python represent a list of ints in source? and what are the two second arguments for, assuming the first is the list strut.
> 
> 
> 
> A PyObject* generally references any Python object.  The subtype
> 
> PyListObject* more specifically references a Python list.  The above
> 
> signature corresponds to this Python function signature:
> 
> 
> 
>     def listsort(self, *args, **kwds):
> 
> 
> 
> The first argument self is the list object to be operated on.  The
> 
> second argument args is a Python tuple containing any other positional
> 
> arguments that were passed into the method.  The third argument kwds
> 
> is a Python dict containing any keyword arguments that were passed
> 
> into the method.

The second argument takes the tuple which determines which varialble(key) to use the comparator on. And the third determines whether to return the list in ascending or descending order. But how do these PyObject* look in C?

How does a PyListObject* look declared in CPython. How would something like this list = [2, 1, 5, 6, 10] look in CPython. Or what about something more complicated -- mlist = [('A',1),('B',2),('C',3)].

Thanks for the help.

[toc] | [prev] | [next] | [standalone]


#48739

FromIan Kelly <ian.g.kelly@gmail.com>
Date2013-06-19 12:20 -0600
Message-ID<mailman.3594.1371666099.3114.python-list@python.org>
In reply to#48736
On Wed, Jun 19, 2013 at 11:18 AM,  <sean.westfall@gmail.com> wrote:
> The second argument takes the tuple which determines which varialble(key) to use the comparator on. And the third determines whether to return the list in ascending or descending order.

That's not exactly correct.  The arguments are listed in that order,
but in fact the arguments to list.sort are keyword-only and cannot be
supplied positionally.  So the "args" argument is expected to be an
empty tuple, and the "kwds" argument is a dict that contains both the
"key" and "reverse" arguments, if they were supplied.

> But how do these PyObject* look in C?

It's a pointer to a struct that contains information like the class
and reference count of the object.

> How does a PyListObject* look declared in CPython.

That's a pointer to a larger struct that shares the same header as the
PyObject* struct (which is basically how you do inheritance in C).  It
adds information like the length and capacity of the list, plus a
pointer to an array of PyObject* that stores the contents of the list.

> How would something like this list = [2, 1, 5, 6, 10] look in CPython. Or what about something more complicated -- mlist = [('A',1),('B',2),('C',3)].

To answer that question, you should really delve into the source and
see what these structs actually look like.  But the first is going to
contain an array of five PyObject* values, each of which references an
int, while the second is going to contain an array of three PyObject*
values, each of which references a tuple.

I also recommend that you read the sections of the Python docs that
cover the C API, as those should help you understand how these structs
are normally handled.

[toc] | [prev] | [standalone]


Back to top | Article view | comp.lang.python


csiph-web