Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #67151 > unrolled thread

Tuples and immutability

Started byEric Jacoboni <eric.jacoboni@gmail.com>
First post2014-02-27 17:01 +0100
Last post2014-03-10 07:06 +1100
Articles 20 on this page of 161 — 33 participants

Back to article view | Back to comp.lang.python


Contents

  Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-02-27 17:01 +0100
    Re: Tuples and immutability Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-27 10:13 -0600
      Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-02-27 17:27 +0100
        Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-02-28 03:33 +1100
        Re: Tuples and immutability Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-27 10:47 -0600
        Re: Tuples and immutability Nick Timkovich <prometheus235@gmail.com> - 2014-02-27 15:47 -0600
        Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-02-28 08:52 +1100
        Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-27 15:18 -0700
          Re: Tuples and immutability Rick Johnson <rantingrickjohnson@gmail.com> - 2014-03-11 19:01 -0700
            Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-12 13:10 +1100
            Re: Tuples and immutability Terry Reedy <tjreedy@udel.edu> - 2014-03-11 23:25 -0400
              Re: Tuples and immutability Steven D'Aprano <steve@pearwood.info> - 2014-03-12 06:28 +0000
                Re: Tuples and immutability Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-12 10:39 +0100
                Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-12 03:40 -0600
                Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-12 03:51 -0600
                Re: Tuples and immutability Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-12 12:21 +0100
            Re: Tuples and immutability Ethan Furman <ethan@stoneleaf.us> - 2014-03-11 23:32 -0700
            Re: Tuples and immutability Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2014-03-12 10:20 +0000
        Re: Tuples and immutability Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2014-03-01 18:55 +0000
    Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-02-28 03:14 +1100
    Re: Tuples and immutability Marko Rauhamaa <marko@pacujo.net> - 2014-02-27 18:19 +0200
      Re: Tuples and immutability John O'Hagan <research@johnohagan.com> - 2014-02-28 16:17 +1100
        Re: Tuples and immutability Marko Rauhamaa <marko@pacujo.net> - 2014-02-28 09:54 +0200
    Re: Tuples and immutability Joshua Landau <joshua@landau.ws> - 2014-02-28 14:41 +0000
    Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-01 01:43 +1100
      Re: Tuples and immutability Duncan Booth <duncan.booth@invalid.invalid> - 2014-03-07 09:33 +0000
        Re: Tuples and immutability Ben Finney <ben+python@benfinney.id.au> - 2014-03-07 22:04 +1100
        Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-07 22:11 +1100
        Re: Tuples and immutability Peter Otten <__peter__@web.de> - 2014-03-07 12:38 +0100
        Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-07 22:45 +1100
        Re: Tuples and immutability Alister <alister.ware@ntlworld.com> - 2014-03-07 11:51 +0000
          Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-07 11:23 -0700
        Re: Tuples and immutability Roy Smith <roy@panix.com> - 2014-03-07 08:37 -0500
        Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-08 15:17 +1300
          Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-07 21:17 -0700
            Balanced trees (was: Re: Tuples and immutability) Marko Rauhamaa <marko@pacujo.net> - 2014-03-08 10:34 +0200
              Re: Balanced trees (was: Re: Tuples and immutability) Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-08 04:03 -0700
                Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-08 13:26 +0200
              Re: Balanced trees (was: Re: Tuples and immutability) Dan Stromberg <drsalists@gmail.com> - 2014-03-08 11:58 -0800
              Re: Balanced trees Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-08 20:37 +0000
                Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-08 23:21 +0200
                  Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-08 17:22 -0500
                    Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 11:17 +0200
                  Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-08 17:31 -0800
                    Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 11:27 +0200
                      Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-09 14:20 -0700
                        Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 23:32 +0200
                          Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-09 14:37 -0700
                            Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 23:43 +0200
                              Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-09 15:08 -0700
                                Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 00:24 +0200
                              Re: Balanced trees Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-09 18:04 -0600
                                Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 03:24 +0000
                          Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 03:24 +0000
                            Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 08:16 +0200
                              Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 08:53 +0000
                                Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 11:41 +0200
                                  Re: Balanced trees Ned Batchelder <ned@nedbatchelder.com> - 2014-03-10 06:57 -0400
                                    Re: Balanced trees Rustom Mody <rustompmody@gmail.com> - 2014-03-10 09:01 -0700
                                      Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-11 02:02 +0000
                                        Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 22:20 -0400
                                          Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 13:29 +1100
                                Re: Balanced trees Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-10 09:51 +0000
                              Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 09:59 -0400
                                Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 03:20 +1100
                                Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 03:24 +1100
                                  Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 19:08 +0200
                                    Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 04:17 +1100
                                      Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 19:34 +0200
                                        Re: Balanced trees Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-13 12:40 -0600
                                        Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-13 23:57 +0000
                                          Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-13 20:12 -0700
                                            Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-14 05:02 +0000
                                    Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 19:24 -0400
                                      Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 10:27 +1100
                                      Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-11 01:26 +0000
                                        Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 12:45 +1100
                                        Re: Balanced trees Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-10 21:38 -0600
                                        Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 15:28 +1100
                                          Re: Balanced trees "Rhodri James" <rhodri@wildebst.org.uk> - 2014-03-12 00:57 +0000
                                        Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-11 12:12 +0200
                                          Re: Balanced trees alex23 <wuwei23@gmail.com> - 2014-03-12 10:13 +1000
                                      Re: Balanced trees Alister <alister.ware@ntlworld.com> - 2014-03-11 09:25 +0000
                                      Re: Balanced trees Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-12 10:08 +0100
                                Re: Balanced trees Tim Chase <python.list@tim.thechases.com> - 2014-03-10 11:33 -0500
                                Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 03:39 +1100
                                Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-10 18:05 -0700
                                  Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 22:13 -0400
                            Re: Balanced trees Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2014-03-10 19:57 -0400
              Re: Balanced trees Joshua Landau <joshua@landau.ws> - 2014-03-15 01:13 +0000
                Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-18 00:05 +0200
                  Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-18 12:26 -0700
                    Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-18 22:55 +0200
                      Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-18 14:45 -0700
                        Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 00:03 +0200
                          Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-18 15:21 -0700
                            Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 01:11 +0200
                              Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-19 01:15 +0000
                                Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 10:49 +0200
                                  Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-19 13:42 +0000
                                    Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 15:54 +0200
                                    Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-19 10:06 -0400
                            Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-19 01:15 +0000
                              Re: Balanced trees Ethan Furman <ethan@stoneleaf.us> - 2014-03-19 08:15 -0700
                        Re: Balanced trees "Rhodri James" <rhodri@wildebst.org.uk> - 2014-03-20 02:16 +0000
                          Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-19 19:34 -0700
                      Re: Balanced trees Chris Kaynor <ckaynor@zindagigames.com> - 2014-03-18 18:02 -0700
                  Re: Balanced trees Daniel Stutzbach <stutzbach@google.com> - 2014-03-18 13:18 -0700
              blist in standard library  (was Re: Balanced trees) Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-15 12:31 +0000
              Re: Balanced trees Daniel Stutzbach <stutzbach@google.com> - 2014-03-17 14:16 -0700
              Re: Balanced trees Joshua Landau <joshua@landau.ws> - 2014-03-18 00:08 +0000
              Re: Balanced trees Daniel Stutzbach <stutzbach@google.com> - 2014-03-17 18:01 -0700
              Re: Balanced trees Joshua Landau <joshua@landau.ws> - 2014-03-18 07:46 +0000
            Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-09 13:40 +1300
              Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-08 19:39 -0700
                Re: Tuples and immutability Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 11:35 +0200
                Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-10 11:03 +1300
                  Re: Tuples and immutability Terry Reedy <tjreedy@udel.edu> - 2014-03-09 19:00 -0400
                  Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-09 17:42 -0600
                    Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 02:37 +0000
                      Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-10 02:35 -0600
                        Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 09:13 +0000
                        Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-11 18:15 +1300
                    Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-11 18:03 +1300
                      Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-11 04:39 -0600
                        Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-11 16:46 +0000
                          Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-12 10:23 +1300
                          Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-11 17:06 -0600
                            Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-12 23:20 +0000
                              Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-12 19:35 -0600
                              Re: Tuples and immutability Terry Reedy <tjreedy@udel.edu> - 2014-03-12 22:09 -0400
            Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-09 13:45 +1300
              Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-08 19:55 -0700
    Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 16:22 -0800
      Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-01 02:27 +0100
        Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 20:45 -0800
          Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-28 22:34 -0700
            Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 21:50 -0800
              Re: Tuples and immutability Ned Batchelder <ned@nedbatchelder.com> - 2014-03-01 12:56 -0500
        Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-28 22:26 -0700
      Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-01 12:39 +1100
      Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-28 22:16 -0700
        Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 22:16 -0800
          Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-01 17:29 +1100
          Re: Tuples and immutability Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-01 14:54 +0000
            Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 13:01 -0800
        Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 22:25 -0800
          Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-01 12:45 -0700
        Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 13:21 -0800
          Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-02 03:04 +0100
            Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 18:28 -0800
            Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-02 05:32 -0700
              Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-02 14:38 +0100
                Re: Tuples and immutability Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-02 14:05 +0000
                  Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-02 15:17 +0100
                    Re: Tuples and immutability "albert visser" <albert.visser@gmail.com> - 2014-03-02 15:37 +0100
          Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 18:15 -0800
    Re: Tuples and immutability Joshua Landau <joshua@landau.ws> - 2014-03-09 17:54 +0000
    Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-10 05:13 +1100
    Re: Tuples and immutability Joshua Landau <joshua@landau.ws> - 2014-03-09 19:57 +0000
    Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-10 07:06 +1100

Page 5 of 9 — ← Prev page 1 2 3 4 [5] 6 7 8 9  Next page →


#68206 — Re: Balanced trees

FromMarko Rauhamaa <marko@pacujo.net>
Date2014-03-11 12:12 +0200
SubjectRe: Balanced trees
Message-ID<87wqg1oxh4.fsf@elektro.pacujo.net>
In reply to#68187
Steven D'Aprano <steve+comp.lang.python@pearwood.info>:

> On Mon, 10 Mar 2014 19:24:07 -0400, Roy Smith wrote:
>
>> In article <8761nmrnfk.fsf@elektro.pacujo.net>,
>>  Marko Rauhamaa <marko@pacujo.net> wrote:
>> 
>>> Anyway, this whole debate is rather unnecessary since every developer
>>> is supposed to have both weapons in their arsenal.
>> 
>> The problem with having a choice is that it opens up the possibility of
>> making the wrong one :-)
>
> [...]
>
> In my experience, the average developer has an amazing talent for 
> pessimising code when they think they are optimising it.

Java is the straitjacket that prevents the "average developer" from
shooting himself in the foot.

Python should let skilled professionals do their work. Thankfully, for
the most part, it does.


Marko

[toc] | [prev] | [next] | [standalone]


#68253 — Re: Balanced trees

Fromalex23 <wuwei23@gmail.com>
Date2014-03-12 10:13 +1000
SubjectRe: Balanced trees
Message-ID<lfo8qf$9sm$1@dont-email.me>
In reply to#68206
On 11/03/2014 8:12 PM, Marko Rauhamaa wrote:
> Python should let skilled professionals do their work. Thankfully, for
> the most part, it does.

Skilled professionals don't solely rely on the standard library, either. 
If you know you need a balanced tree, you'll also know where to find an 
implementation of one.

[toc] | [prev] | [next] | [standalone]


#68205 — Re: Balanced trees

FromAlister <alister.ware@ntlworld.com>
Date2014-03-11 09:25 +0000
SubjectRe: Balanced trees
Message-ID<ODATu.44616$a74.40039@fx20.am4>
In reply to#68179
On Mon, 10 Mar 2014 19:24:07 -0400, Roy Smith wrote:

> In article <8761nmrnfk.fsf@elektro.pacujo.net>,
>  Marko Rauhamaa <marko@pacujo.net> wrote:
> 
>> Anyway, this whole debate is rather unnecessary since every developer
>> is supposed to have both weapons in their arsenal.
> 
> The problem with having a choice is that it opens up the possibility of
> making the wrong one :-)
> 
> As this discussion has shown, figuring out whether a hash table or a
> tree is better for a given problem is non-trivial.  My guess is that if
> you gave 1000 typical developers both data structures and let them pick
> freely, the number of cases where it really mattered and the developer
> picked the right one would be approximately equal to the number of cases
> where they picked the wrong one.

Perhaps you are not familiar with the 50/50/90 rule?

Given any 50/50 choice you are 90% certain to pick the wrong one :-)



-- 
I hold it, that a little rebellion, now and then, is a good thing...
		-- Thomas Jefferson

[toc] | [prev] | [next] | [standalone]


#68263 — Re: Balanced trees

FromAntoon Pardon <antoon.pardon@rece.vub.ac.be>
Date2014-03-12 10:08 +0100
SubjectRe: Balanced trees
Message-ID<mailman.8075.1394615335.18130.python-list@python.org>
In reply to#68179
Op 11-03-14 00:24, Roy Smith schreef:

> In article <8761nmrnfk.fsf@elektro.pacujo.net>,
>  Marko Rauhamaa <marko@pacujo.net> wrote:
>
>> Anyway, this whole debate is rather unnecessary since every developer is
>> supposed to have both weapons in their arsenal.
> The problem with having a choice is that it opens up the possibility of 
> making the wrong one :-)

This is just a standard defense for the status quo. Introducing the decimal
module also added a choice.

> As this discussion has shown, figuring out whether a hash table or a 
> tree is better for a given problem is non-trivial.  My guess is that if 
> you gave 1000 typical developers both data structures and let them pick 
> freely, the number of cases where it really mattered and the developer 
> picked the right one would be approximately equal to the number of cases 
> where they picked the wrong one.

You are only illustrating one part. How about all those cases now where the
wrong choice is more or less forced on the developer for lack of the alternative?

-- 
Antoon Pardon

[toc] | [prev] | [next] | [standalone]


#68159 — Re: Balanced trees

FromTim Chase <python.list@tim.thechases.com>
Date2014-03-10 11:33 -0500
SubjectRe: Balanced trees
Message-ID<mailman.8009.1394469217.18130.python-list@python.org>
In reply to#68143
On 2014-03-11 03:24, Chris Angelico wrote:
> Imagine, worst case, all one million records have the same
> song/user/add_time and you need to do twenty comparisons involving
> four fields. That's gotta be worse than one hashing of five fields.

And if you have one million songs that are indistinguishable, you
have Pop Top-40 playlists.  And that is *definitely* worse than
hashes OR trees.  :-)

-tkc

[toc] | [prev] | [next] | [standalone]


#68161 — Re: Balanced trees

FromChris Angelico <rosuav@gmail.com>
Date2014-03-11 03:39 +1100
SubjectRe: Balanced trees
Message-ID<mailman.8011.1394469568.18130.python-list@python.org>
In reply to#68143
On Tue, Mar 11, 2014 at 3:33 AM, Tim Chase
<python.list@tim.thechases.com> wrote:
> On 2014-03-11 03:24, Chris Angelico wrote:
>> Imagine, worst case, all one million records have the same
>> song/user/add_time and you need to do twenty comparisons involving
>> four fields. That's gotta be worse than one hashing of five fields.
>
> And if you have one million songs that are indistinguishable, you
> have Pop Top-40 playlists.  And that is *definitely* worse than
> hashes OR trees.  :-)

LOL!

That's not a Comp Sci problem, I'm afraid. But there is a solution:

https://www.youtube.com/watch?v=jCZzvZEmq_A
https://www.youtube.com/watch?v=DEr_5C6JYB4

A different sort of music.

ChrisA

[toc] | [prev] | [next] | [standalone]


#68186 — Re: Balanced trees

FromDan Stromberg <drsalists@gmail.com>
Date2014-03-10 18:05 -0700
SubjectRe: Balanced trees
Message-ID<mailman.8031.1394499924.18130.python-list@python.org>
In reply to#68143
On Mon, Mar 10, 2014 at 6:59 AM, Roy Smith <roy@panix.com> wrote:
> On the other hand, log n, for n = 1 million, is just 20.  It's not hard
> to imagine a hash function which costs 20x what a node traversal does,
> in which case, the log n lookup is ahead for all n < 1 million.

FWIW, both the hash table and the tree will have constants.  So a tree
would be c*20 in its most significant term, and the hash table would
be d*1 in its.  The real-world performance depends quite a bit on
those constants at small values of n.  I don't really consider a
million all that big, but the meaning of "big" of course depends.

[toc] | [prev] | [next] | [standalone]


#68191 — Re: Balanced trees

FromRoy Smith <roy@panix.com>
Date2014-03-10 22:13 -0400
SubjectRe: Balanced trees
Message-ID<roy-19EAAB.22135410032014@news.panix.com>
In reply to#68186
In article <mailman.8031.1394499924.18130.python-list@python.org>,
 Dan Stromberg <drsalists@gmail.com> wrote:

> On Mon, Mar 10, 2014 at 6:59 AM, Roy Smith <roy@panix.com> wrote:
> > On the other hand, log n, for n = 1 million, is just 20.  It's not hard
> > to imagine a hash function which costs 20x what a node traversal does,
> > in which case, the log n lookup is ahead for all n < 1 million.
> 
> FWIW, both the hash table and the tree will have constants.  So a tree
> would be c*20 in its most significant term, and the hash table would
> be d*1 in its.  The real-world performance depends quite a bit on
> those constants at small values of n.  I don't really consider a
> million all that big, but the meaning of "big" of course depends.

Well, the largest disk volume I can configure in AWS is 1 TB.  So, I 
guess we can take that to be "big".

Assuming 1-character strings, and no overhead (both strange assumptions, 
but it makes the math easier), that's 10^12 nodes in our binary tree.  
That's still only 40 layers deep.  Log n is your friend.

[toc] | [prev] | [next] | [standalone]


#68183 — Re: Balanced trees

FromDennis Lee Bieber <wlfraed@ix.netcom.com>
Date2014-03-10 19:57 -0400
SubjectRe: Balanced trees
Message-ID<mailman.8028.1394495825.18130.python-list@python.org>
In reply to#68117
On 10 Mar 2014 03:24:08 GMT, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> declaimed the following:

>
>Proof: I create a hash table that accepts unsigned bytes as keys, where 
>the hash is the value of the byte. So byte 0x42 hashes to 0x42, or 
>decimal 66. I give the hash table 256 slots, numbered from 0 to 255. 
>Every hash takes constant time, there are no collisions at all, lookups, 
>insertions and deletions all take constant time regardless of how full 
>the table is. The best, worst and average cases are not just O(1) but 
>also ?(1).
>
	What you've defined, however, can be replaced with straight list
indexing <G> No hash function needed.

	If there is a finite (and small) number of possible keys, the list will
likely beat the hash.

	OTOH, when the keys exceed reasonable list storage, the hash becomes
useful -- but one now has to be concerned with collisions.

	Back in the day, I was somewhat pleased to discover that the Amiga
directory structure was a real world example of something I'd only seen as
a class assignment in my data structures class: a hashed-head
multiple-linked list.
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
    wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/

[toc] | [prev] | [next] | [standalone]


#68364 — Re: Balanced trees

FromJoshua Landau <joshua@landau.ws>
Date2014-03-15 01:13 +0000
SubjectRe: Balanced trees
Message-ID<mailman.8154.1394846042.18130.python-list@python.org>
In reply to#68026
On 8 March 2014 20:37, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
> I've found this link useful http://kmike.ru/python-data-structures/
>
> I also don't want all sorts of data structures added to the Python library.
> I believe that there are advantages to leaving specialist data structures on
> pypi or other sites, plus it means Python in a Nutshell can still fit in
> your pocket and not a 40 ton articulated lorry, unlike the Java equivalent.

The thing we really need is for the blist containers to become stdlib
(but not to replace the current list implementation). The rejected PEP
(http://legacy.python.org/dev/peps/pep-3128/) misses a few important
points, largely in how the "log(n)" has a really large base:
random.choice went from 1.2µs to 1.6µs from n=1 to n=10⁸, vs 1.2µs for
a standard list.

Further, it's worth considering a few advantages:

* copy is O(1), allowing code to avoid mutation by just copying its
input, which is good practice.

* FIFO is effectively O(1), as the time just about doubles from n=1 to
n=10⁸ so will never actually branch that much. There is still a speed
benefit of collections.deque, but it's much, much less significant.
This is very useful when considering usage as a multi-purpose data
structure, and removes demand for explicit linked lists (which have
foolishly been reimplemented loads of times).

* It reduces demand for trees:
    * There are efficient implementations of sortedlist, sortedset and
sorteddict.
    * Slicing, slice assignment and slice deletion are really fast.
    * Addition of lists is sublinear. Instead of
"list(itertools.chain(...))", one can add in a loop and end up
*faster*.

I think blist isn't very popular not because it isn't really good, but
because it isn't a specialised structure. It is, however, almost there
for almost every circumstance. This can help keep the standard library
clean, especially of tree data structures.

Here's what we kill:

* Linked lists and doubly-linked lists, which are scarily popular for
whatever reason. Sometimes people claim that collections.deque isn't
powerful enough for whatever they want, and blist will almost
definitely sate those cases.

* Balanced trees, with blist.sortedlist. This is actually needed right now.

* Poor performance in the cases where a lot of list merging and pruning happens.

* Most uses of bisect.

* Some instances where two data structures are used in parallel in
order to keep performance fast on disparate operations (like `x in y`
and `y[i]`).

Now, I understand there are downsides to blist. Particularly, I've
looked through the "benchmarks" and they seem untruthful. Further,
we'd need a maintainer. Finally, nobody jumps at blists because
they're rarely the obvious solution. Rather, they attempt to be a
different general solution. Hopefully, though, a stdlib inclusion
could make them a lot more obvious, and support in some current
libraries could make them feel more at home.

I don't know whether this is a good idea, but I do feel that it is
more promising and general than having a graph in the standard
library.

[toc] | [prev] | [next] | [standalone]


#68464 — Re: Balanced trees

FromMarko Rauhamaa <marko@pacujo.net>
Date2014-03-18 00:05 +0200
SubjectRe: Balanced trees
Message-ID<87mwgoqy4k.fsf@elektro.pacujo.net>
In reply to#68364
Joshua Landau <joshua@landau.ws>:

> The thing we really need is for the blist containers to become stdlib
> (but not to replace the current list implementation).

Very interesting. Downloaded blist but didn't compile it yet. It *could*
be the missing link.

I would *love* to see some comparative performance results between
blist.sorteddict and an AVL tree. A B+ tree could be nicer to the CPU
cache, but then again, the cache line is only a few pointers wide and
there appears to be a lot of shoving stuff left and right -- based on a
10-second "analysis" of the code:

        while (src >= stop)
                *dst-- = *src--;

Personally, I find it a bit odd to place lists at the center of the
proposal and have trees come out as a side effect. I would rather see it
the other way round: sorteddict -> sortedset et al.

> * It reduces demand for trees:

I would hope it would satisfy the demand for trees.

> nobody jumps at blists because they're rarely the obvious solution.

I haven't jumped at them because they were nowhere to be found on <URL:
http://docs.python.org/3/genindex-B.html>.


Marko

[toc] | [prev] | [next] | [standalone]


#68513 — Re: Balanced trees

FromDan Stromberg <drsalists@gmail.com>
Date2014-03-18 12:26 -0700
SubjectRe: Balanced trees
Message-ID<mailman.8257.1395170778.18130.python-list@python.org>
In reply to#68464
On Mon, Mar 17, 2014 at 3:05 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Joshua Landau <joshua@landau.ws>:
>
>> The thing we really need is for the blist containers to become stdlib
>> (but not to replace the current list implementation).
>
> Very interesting. Downloaded blist but didn't compile it yet. It *could*
> be the missing link.
>
> I would *love* to see some comparative performance results between
> blist.sorteddict and an AVL tree.

I added blist.sorteddict and removed (temporarily) Pypy and Jython.
The results are at
http://stromberg.dnsalias.org/~strombrg/python-tree-and-heap-comparison/2014-03/

In short, blist.sorteddict didn't do that well, despite being in C.
In the random workloads, blist.sorteddict was dead last.  In the
sequential workloads blist.sorteddict fell somewhere in the middle.

I excluded Pypy and Jython because blist.sorteddict probably doesn't
run on them.

HTH

[toc] | [prev] | [next] | [standalone]


#68518 — Re: Balanced trees

FromMarko Rauhamaa <marko@pacujo.net>
Date2014-03-18 22:55 +0200
SubjectRe: Balanced trees
Message-ID<8738ifqlaw.fsf@elektro.pacujo.net>
In reply to#68513
Dan Stromberg <drsalists@gmail.com>:

> The results are at
> http://stromberg.dnsalias.org/~strombrg/python-tree-and-heap-comparison/2014-03/

Unfortunately I'm having a hard time understanding the results.

The 50/50 get/set ratio is most interesting to me.

I'm seeing (under cpython-3.3):


    Size: 1048576, duration:  75.3, dictionary type: dict
    [...]
    Size:  262144, duration:  66.1, dictionary type: AVL_tree
    [...]
    Size:   65536, duration:  77.3, dictionary type: blist.sorteddict 


What does it mean?


Marko

[toc] | [prev] | [next] | [standalone]


#68520 — Re: Balanced trees

FromDan Stromberg <drsalists@gmail.com>
Date2014-03-18 14:45 -0700
SubjectRe: Balanced trees
Message-ID<mailman.8263.1395179161.18130.python-list@python.org>
In reply to#68518
On Tue, Mar 18, 2014 at 1:55 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Dan Stromberg <drsalists@gmail.com>:
>
>> The results are at
>> http://stromberg.dnsalias.org/~strombrg/python-tree-and-heap-comparison/2014-03/
>
> Unfortunately I'm having a hard time understanding the results.
>
> The 50/50 get/set ratio is most interesting to me.
>
> I'm seeing (under cpython-3.3):
>
>
>     Size: 1048576, duration:  75.3, dictionary type: dict
>     [...]
>     Size:  262144, duration:  66.1, dictionary type: AVL_tree
>     [...]
>     Size:   65536, duration:  77.3, dictionary type: blist.sorteddict
>
>
> What does it mean?

dict was able to do 1048576 operations on a dictionary before taking
more than 120 seconds to complete - it took 75.3 seconds to do 1048576
operations.

AVL_tree was able to do 262144 operations on a dictionary before
taking more than 120 seconds to complete - it took 66.1 seconds to do
262144 operations.

blist.sorteddict was able to do 65536 operations on a dictionary
before taking more than 120 seconds to complete  - it took 77.3
seconds to do 65536 operations.

For the 50/50 workload; the "operations" were half adding key, value
pairs; and half lookups of values by keys we know are in the
dictionary.

I used to run all the dictionaries for as long as it took to do 4
million operations, but for (EG) unbalanced binary trees, that would
take far too long in the ordered tests, so I changed the code to try a
given tree type until the time for an operation became prohibitive.

If you look at the graphs (I have to admit they've become a little
cluttered), you can see the slower trees "escaping" rapidly (exceeding
the 120 second threshold), while the better performing trees grow more
slowly and are allowed to continue proving themselves longer.
Inspecting these graphs may help in developing an intuition for how
the tests were conducted.

The code implementing this method of testing is in
http://stromberg.dnsalias.org/svn/python-tree-and-heap-comparison/trunk/tester

HTH

[toc] | [prev] | [next] | [standalone]


#68522 — Re: Balanced trees

FromMarko Rauhamaa <marko@pacujo.net>
Date2014-03-19 00:03 +0200
SubjectRe: Balanced trees
Message-ID<87wqfrp3jk.fsf@elektro.pacujo.net>
In reply to#68520
Dan Stromberg <drsalists@gmail.com>:

> dict was able to do 1048576 operations on a dictionary before taking
> more than 120 seconds to complete - it took 75.3 seconds to do 1048576
> operations.
>
> AVL_tree was able to do 262144 operations on a dictionary before
> taking more than 120 seconds to complete - it took 66.1 seconds to do
> 262144 operations.
>
> blist.sorteddict was able to do 65536 operations on a dictionary
> before taking more than 120 seconds to complete - it took 77.3 seconds
> to do 65536 operations.

For a proper comparison, I'd like a fixed, identical dataset and set of
operations run against each data structure.

How about this test program:

   generate random datasets of 100, 10000, 1000000 and 100000000 elements
   generate random testset of 1000000 elements
   for each data structure:
        for each dataset:
             initialize data structure with dataset
             head, tail = testset[:100], testset[100:]
             t0 = current timestamp
             for each element in head:
                  add element to data structure
             for each element in tail:
                  add element to data structure
                  append element to head
                  remove head.pop(0) from data structure
             for each element in head:
                  remove element from data structure
             t1 = current timestamp
             report data structure type, dataset size, t1 - t0


Marko

[toc] | [prev] | [next] | [standalone]


#68523 — Re: Balanced trees

FromDan Stromberg <drsalists@gmail.com>
Date2014-03-18 15:21 -0700
SubjectRe: Balanced trees
Message-ID<mailman.8264.1395181297.18130.python-list@python.org>
In reply to#68522
On Tue, Mar 18, 2014 at 3:03 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Dan Stromberg <drsalists@gmail.com>:
> For a proper comparison, I'd like a fixed, identical dataset and set of
> operations run against each data structure.
>
> How about this test program:

I used to do essentially this, but it was time-prohibitive and
produced harder-to-read graphs - harder to read because the enormous
values of the bad trees were dwarfing the values of the good trees.

Imagine doing 100000000 operation tests for the unbalanced binary
tree. For a series of random keys, it would do quite well (probably
second only to dict), but for a series of sequential keys it would
take longer than anyone would reasonably want to wait because it's
basically a storage-inefficient linked list.

Rather than throw out unbalanced binary tree altogether, it makes more
sense to run it until it gets "too slow".

The workload+interpreter pairs are all tested the same way, it's just
that the ones that are doing badly are thrown out before they're able
to get a lot worse. Studying the graphs will likely help develop an
intuition for what's happening.

[toc] | [prev] | [next] | [standalone]


#68525 — Re: Balanced trees

FromMarko Rauhamaa <marko@pacujo.net>
Date2014-03-19 01:11 +0200
SubjectRe: Balanced trees
Message-ID<87r45zf6fu.fsf@elektro.pacujo.net>
In reply to#68523
Dan Stromberg <drsalists@gmail.com>:

> On Tue, Mar 18, 2014 at 3:03 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
>> Dan Stromberg <drsalists@gmail.com>:
>> For a proper comparison, I'd like a fixed, identical dataset and set
>> of operations run against each data structure.
>>
>> How about this test program:
>
> I used to do essentially this, but it was time-prohibitive and
> produced harder-to-read graphs - harder to read because the enormous
> values of the bad trees were dwarfing the values of the good trees.
>
> Imagine doing 100000000 operation tests for the unbalanced binary
> tree. For a series of random keys, it would do quite well (probably
> second only to dict), but for a series of sequential keys it would
> take longer than anyone would reasonably want to wait because it's
> basically a storage-inefficient linked list.
>
> Rather than throw out unbalanced binary tree altogether, it makes more
> sense to run it until it gets "too slow".

I disagree strongly. You should throw out unbalanced binary trees and
linked lists and the like and concentrate on solid contenders.

But it's your test. You do as you like.

Anyway, even a well-thought-out test is subject to all kinds of
criticisms due to the CPU architecture, the distribution of the key
values, the quality of the data structure implementation etc etc.


Marko

[toc] | [prev] | [next] | [standalone]


#68530 — Re: Balanced trees

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2014-03-19 01:15 +0000
SubjectRe: Balanced trees
Message-ID<5328efab$0$29994$c3e8da3$5496439d@news.astraweb.com>
In reply to#68525
On Wed, 19 Mar 2014 01:11:33 +0200, Marko Rauhamaa wrote:

> Dan Stromberg <drsalists@gmail.com>:

>> Rather than throw out unbalanced binary tree altogether, it makes more
>> sense to run it until it gets "too slow".
> 
> I disagree strongly. You should throw out unbalanced binary trees and
> linked lists and the like and concentrate on solid contenders.

If you are in a position to randomize the data before storing it in the 
tree, an unbalanced binary tree is a solid contender. The overhead will 
likely be much less than any balanced tree, and the probability of 
degenerate behaviour negligible for any amount of data big enough to 
really matter.



-- 
Steven

[toc] | [prev] | [next] | [standalone]


#68539 — Re: Balanced trees

FromMarko Rauhamaa <marko@pacujo.net>
Date2014-03-19 10:49 +0200
SubjectRe: Balanced trees
Message-ID<87pplipo82.fsf@elektro.pacujo.net>
In reply to#68530
Steven D'Aprano <steve+comp.lang.python@pearwood.info>:

> If you are in a position to randomize the data before storing it in
> the tree, an unbalanced binary tree is a solid contender.

Applications that can assume randomly distributed data are exceedingly
rare and even then, you can't easily discount the possibility of
worst-case ordering.

In fact, Dan himself said unbalanced trees performed so badly he
couldn't test them satisfactorily.


Marko

[toc] | [prev] | [next] | [standalone]


#68551 — Re: Balanced trees

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2014-03-19 13:42 +0000
SubjectRe: Balanced trees
Message-ID<53299eac$0$29994$c3e8da3$5496439d@news.astraweb.com>
In reply to#68539
On Wed, 19 Mar 2014 10:49:33 +0200, Marko Rauhamaa wrote:

> Steven D'Aprano <steve+comp.lang.python@pearwood.info>:
> 
>> If you are in a position to randomize the data before storing it in the
>> tree, an unbalanced binary tree is a solid contender.
> 
> Applications that can assume randomly distributed data are exceedingly
> rare 

Please re-read what I wrote. I didn't say "if your data comes to you 
fully randomized". I said "if you are in a position to randomize the data 
before storing it". In other words, if you actively randomize the data 
yourself.


> and even then, you can't easily discount the possibility of
> worst-case ordering.

Of course you can. If you have a million items, then the odds that those 
million items happen to be sorted (the worst-case order) are 1 in a 
million factorial. That's a rather small number, small enough that we can 
discount it from ever happening, not in a million lifetimes of the 
Universe.

Of course, you would be right to point out that one would also like to 
avoid *nearly* worst-case ordering. Nevertheless, there are Vastly more 
ways that the data will be sufficiently randomized as to avoid degenerate 
performance than the worst-case poor performance.

> In fact, Dan himself said unbalanced trees performed so badly he
> couldn't test them satisfactorily.

You're misrepresenting what Dan said. What he actually said was that 
unbalanced trees perform second only to dicts with random data, only with 
sorted data do they perform badly. His exact words were:

"For a series of random keys, it would do quite well (probably second only
to dict), but for a series of sequential keys it would take longer than
anyone would reasonably want to wait"



-- 
Steven D'Aprano
http://import-that.dreamwidth.org/

[toc] | [prev] | [next] | [standalone]


Page 5 of 9 — ← Prev page 1 2 3 4 [5] 6 7 8 9  Next page →

Back to top | Article view | comp.lang.python


csiph-web