Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #67151 > unrolled thread

Tuples and immutability

Started byEric Jacoboni <eric.jacoboni@gmail.com>
First post2014-02-27 17:01 +0100
Last post2014-03-10 07:06 +1100
Articles 20 on this page of 161 — 33 participants

Back to article view | Back to comp.lang.python


Contents

  Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-02-27 17:01 +0100
    Re: Tuples and immutability Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-27 10:13 -0600
      Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-02-27 17:27 +0100
        Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-02-28 03:33 +1100
        Re: Tuples and immutability Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-27 10:47 -0600
        Re: Tuples and immutability Nick Timkovich <prometheus235@gmail.com> - 2014-02-27 15:47 -0600
        Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-02-28 08:52 +1100
        Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-27 15:18 -0700
          Re: Tuples and immutability Rick Johnson <rantingrickjohnson@gmail.com> - 2014-03-11 19:01 -0700
            Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-12 13:10 +1100
            Re: Tuples and immutability Terry Reedy <tjreedy@udel.edu> - 2014-03-11 23:25 -0400
              Re: Tuples and immutability Steven D'Aprano <steve@pearwood.info> - 2014-03-12 06:28 +0000
                Re: Tuples and immutability Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-12 10:39 +0100
                Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-12 03:40 -0600
                Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-12 03:51 -0600
                Re: Tuples and immutability Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-12 12:21 +0100
            Re: Tuples and immutability Ethan Furman <ethan@stoneleaf.us> - 2014-03-11 23:32 -0700
            Re: Tuples and immutability Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2014-03-12 10:20 +0000
        Re: Tuples and immutability Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2014-03-01 18:55 +0000
    Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-02-28 03:14 +1100
    Re: Tuples and immutability Marko Rauhamaa <marko@pacujo.net> - 2014-02-27 18:19 +0200
      Re: Tuples and immutability John O'Hagan <research@johnohagan.com> - 2014-02-28 16:17 +1100
        Re: Tuples and immutability Marko Rauhamaa <marko@pacujo.net> - 2014-02-28 09:54 +0200
    Re: Tuples and immutability Joshua Landau <joshua@landau.ws> - 2014-02-28 14:41 +0000
    Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-01 01:43 +1100
      Re: Tuples and immutability Duncan Booth <duncan.booth@invalid.invalid> - 2014-03-07 09:33 +0000
        Re: Tuples and immutability Ben Finney <ben+python@benfinney.id.au> - 2014-03-07 22:04 +1100
        Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-07 22:11 +1100
        Re: Tuples and immutability Peter Otten <__peter__@web.de> - 2014-03-07 12:38 +0100
        Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-07 22:45 +1100
        Re: Tuples and immutability Alister <alister.ware@ntlworld.com> - 2014-03-07 11:51 +0000
          Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-07 11:23 -0700
        Re: Tuples and immutability Roy Smith <roy@panix.com> - 2014-03-07 08:37 -0500
        Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-08 15:17 +1300
          Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-07 21:17 -0700
            Balanced trees (was: Re: Tuples and immutability) Marko Rauhamaa <marko@pacujo.net> - 2014-03-08 10:34 +0200
              Re: Balanced trees (was: Re: Tuples and immutability) Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-08 04:03 -0700
                Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-08 13:26 +0200
              Re: Balanced trees (was: Re: Tuples and immutability) Dan Stromberg <drsalists@gmail.com> - 2014-03-08 11:58 -0800
              Re: Balanced trees Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-08 20:37 +0000
                Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-08 23:21 +0200
                  Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-08 17:22 -0500
                    Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 11:17 +0200
                  Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-08 17:31 -0800
                    Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 11:27 +0200
                      Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-09 14:20 -0700
                        Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 23:32 +0200
                          Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-09 14:37 -0700
                            Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 23:43 +0200
                              Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-09 15:08 -0700
                                Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 00:24 +0200
                              Re: Balanced trees Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-09 18:04 -0600
                                Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 03:24 +0000
                          Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 03:24 +0000
                            Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 08:16 +0200
                              Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 08:53 +0000
                                Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 11:41 +0200
                                  Re: Balanced trees Ned Batchelder <ned@nedbatchelder.com> - 2014-03-10 06:57 -0400
                                    Re: Balanced trees Rustom Mody <rustompmody@gmail.com> - 2014-03-10 09:01 -0700
                                      Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-11 02:02 +0000
                                        Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 22:20 -0400
                                          Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 13:29 +1100
                                Re: Balanced trees Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-10 09:51 +0000
                              Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 09:59 -0400
                                Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 03:20 +1100
                                Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 03:24 +1100
                                  Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 19:08 +0200
                                    Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 04:17 +1100
                                      Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 19:34 +0200
                                        Re: Balanced trees Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-13 12:40 -0600
                                        Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-13 23:57 +0000
                                          Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-13 20:12 -0700
                                            Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-14 05:02 +0000
                                    Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 19:24 -0400
                                      Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 10:27 +1100
                                      Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-11 01:26 +0000
                                        Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 12:45 +1100
                                        Re: Balanced trees Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-10 21:38 -0600
                                        Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 15:28 +1100
                                          Re: Balanced trees "Rhodri James" <rhodri@wildebst.org.uk> - 2014-03-12 00:57 +0000
                                        Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-11 12:12 +0200
                                          Re: Balanced trees alex23 <wuwei23@gmail.com> - 2014-03-12 10:13 +1000
                                      Re: Balanced trees Alister <alister.ware@ntlworld.com> - 2014-03-11 09:25 +0000
                                      Re: Balanced trees Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-12 10:08 +0100
                                Re: Balanced trees Tim Chase <python.list@tim.thechases.com> - 2014-03-10 11:33 -0500
                                Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 03:39 +1100
                                Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-10 18:05 -0700
                                  Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 22:13 -0400
                            Re: Balanced trees Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2014-03-10 19:57 -0400
              Re: Balanced trees Joshua Landau <joshua@landau.ws> - 2014-03-15 01:13 +0000
                Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-18 00:05 +0200
                  Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-18 12:26 -0700
                    Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-18 22:55 +0200
                      Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-18 14:45 -0700
                        Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 00:03 +0200
                          Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-18 15:21 -0700
                            Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 01:11 +0200
                              Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-19 01:15 +0000
                                Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 10:49 +0200
                                  Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-19 13:42 +0000
                                    Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 15:54 +0200
                                    Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-19 10:06 -0400
                            Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-19 01:15 +0000
                              Re: Balanced trees Ethan Furman <ethan@stoneleaf.us> - 2014-03-19 08:15 -0700
                        Re: Balanced trees "Rhodri James" <rhodri@wildebst.org.uk> - 2014-03-20 02:16 +0000
                          Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-19 19:34 -0700
                      Re: Balanced trees Chris Kaynor <ckaynor@zindagigames.com> - 2014-03-18 18:02 -0700
                  Re: Balanced trees Daniel Stutzbach <stutzbach@google.com> - 2014-03-18 13:18 -0700
              blist in standard library  (was Re: Balanced trees) Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-15 12:31 +0000
              Re: Balanced trees Daniel Stutzbach <stutzbach@google.com> - 2014-03-17 14:16 -0700
              Re: Balanced trees Joshua Landau <joshua@landau.ws> - 2014-03-18 00:08 +0000
              Re: Balanced trees Daniel Stutzbach <stutzbach@google.com> - 2014-03-17 18:01 -0700
              Re: Balanced trees Joshua Landau <joshua@landau.ws> - 2014-03-18 07:46 +0000
            Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-09 13:40 +1300
              Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-08 19:39 -0700
                Re: Tuples and immutability Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 11:35 +0200
                Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-10 11:03 +1300
                  Re: Tuples and immutability Terry Reedy <tjreedy@udel.edu> - 2014-03-09 19:00 -0400
                  Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-09 17:42 -0600
                    Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 02:37 +0000
                      Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-10 02:35 -0600
                        Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 09:13 +0000
                        Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-11 18:15 +1300
                    Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-11 18:03 +1300
                      Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-11 04:39 -0600
                        Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-11 16:46 +0000
                          Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-12 10:23 +1300
                          Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-11 17:06 -0600
                            Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-12 23:20 +0000
                              Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-12 19:35 -0600
                              Re: Tuples and immutability Terry Reedy <tjreedy@udel.edu> - 2014-03-12 22:09 -0400
            Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-09 13:45 +1300
              Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-08 19:55 -0700
    Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 16:22 -0800
      Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-01 02:27 +0100
        Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 20:45 -0800
          Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-28 22:34 -0700
            Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 21:50 -0800
              Re: Tuples and immutability Ned Batchelder <ned@nedbatchelder.com> - 2014-03-01 12:56 -0500
        Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-28 22:26 -0700
      Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-01 12:39 +1100
      Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-28 22:16 -0700
        Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 22:16 -0800
          Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-01 17:29 +1100
          Re: Tuples and immutability Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-01 14:54 +0000
            Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 13:01 -0800
        Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 22:25 -0800
          Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-01 12:45 -0700
        Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 13:21 -0800
          Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-02 03:04 +0100
            Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 18:28 -0800
            Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-02 05:32 -0700
              Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-02 14:38 +0100
                Re: Tuples and immutability Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-02 14:05 +0000
                  Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-02 15:17 +0100
                    Re: Tuples and immutability "albert visser" <albert.visser@gmail.com> - 2014-03-02 15:37 +0100
          Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 18:15 -0800
    Re: Tuples and immutability Joshua Landau <joshua@landau.ws> - 2014-03-09 17:54 +0000
    Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-10 05:13 +1100
    Re: Tuples and immutability Joshua Landau <joshua@landau.ws> - 2014-03-09 19:57 +0000
    Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-10 07:06 +1100

Page 4 of 9 — ← Prev page 1 2 3 [4] 5 6 7 8 9  Next page →


#68192 — Re: Balanced trees

FromRoy Smith <roy@panix.com>
Date2014-03-10 22:20 -0400
SubjectRe: Balanced trees
Message-ID<roy-87AEA3.22201010032014@news.panix.com>
In reply to#68190
In article <531e6eca$0$29994$c3e8da3$5496439d@news.astraweb.com>,
 Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:

> There's only so much matter in the 
> universe, so talking about limits as the amount of data approaches 
> infinity is nonsense. Where would you store it?

Funny you should ask that.  I just finished watching 
https://en.wikipedia.org/wiki/11001001 a few minutes ago.

[toc] | [prev] | [next] | [standalone]


#68200 — Re: Balanced trees

FromChris Angelico <rosuav@gmail.com>
Date2014-03-11 13:29 +1100
SubjectRe: Balanced trees
Message-ID<mailman.8035.1394517383.18130.python-list@python.org>
In reply to#68192
On Tue, Mar 11, 2014 at 1:20 PM, Roy Smith <roy@panix.com> wrote:
> In article <531e6eca$0$29994$c3e8da3$5496439d@news.astraweb.com>,
>  Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>
>> There's only so much matter in the
>> universe, so talking about limits as the amount of data approaches
>> infinity is nonsense. Where would you store it?
>
> Funny you should ask that.  I just finished watching
> https://en.wikipedia.org/wiki/11001001 a few minutes ago.

Multiple universes and/or sub-atomic monkeys.

http://tools.ietf.org/html/rfc2795

ChrisA

[toc] | [prev] | [next] | [standalone]


#68136 — Re: Balanced trees

FromMark Lawrence <breamoreboy@yahoo.co.uk>
Date2014-03-10 09:51 +0000
SubjectRe: Balanced trees
Message-ID<mailman.7989.1394445097.18130.python-list@python.org>
In reply to#68131
On 10/03/2014 08:53, Steven D'Aprano wrote:
>
> In other words, I suspect you are trolling.
>

s/suspect/know/ , he didn't make captain of my dream team for nothing, 
you know :)

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

---
This email is free from viruses and malware because avast! Antivirus protection is active.
http://www.avast.com

[toc] | [prev] | [next] | [standalone]


#68143 — Re: Balanced trees

FromRoy Smith <roy@panix.com>
Date2014-03-10 09:59 -0400
SubjectRe: Balanced trees
Message-ID<roy-BE3C76.09590110032014@news.panix.com>
In reply to#68121
In article <87eh2atw6s.fsf@elektro.pacujo.net>,
 Marko Rauhamaa <marko@pacujo.net> wrote:

> Steven D'Aprano <steve+comp.lang.python@pearwood.info>:
> 
> > Proof: I create a hash table that accepts unsigned bytes as keys, where 
> 
> The O(f(n)) notation has no meaning when n is limited.
> 
> This thing is not just pedantry. The discussion was how a balanced tree
> fares in comparison with hash tables. A simplistic O(1) vs O(log n) was
> presented as a proof that balanced trees don't have a chance in practice
> or theory. I wasn't so easily convinced.
> 
> 
> Marko

On the other hand, log n, for n = 1 million, is just 20.  It's not hard 
to imagine a hash function which costs 20x what a node traversal does, 
in which case, the log n lookup is ahead for all n < 1 million.

Looking at the Songza source, I see we have one user-defined hash 
function:

    def __hash__(self):
        return hash((self.song,
                     self.user,
                     self.add_time,
                     self.delete_time,
                     self.play_first))

where those things are 2 bson.ObjectId's, 2 datetimes, and a boolean.  I 
wouldn't be surprised if that falls into the 20x node traversal bucket.  
In this case, convenience was more important than speed, so it doesn't 
matter.

The hash vs. tree argument can get very complicated.  For example, if 
your tree is not completely resident in memory, the cost of paging in a 
node will swamp everything else, and improving lookup speed will boil 
down to reducing the number of I/O operations you need.  An expensive 
hash plus a linear walk through a collision chain which was resident in 
a single memory block will beat traversing two nodes, each of which had 
to be paged in separately.

[toc] | [prev] | [next] | [standalone]


#68156 — Re: Balanced trees

FromChris Angelico <rosuav@gmail.com>
Date2014-03-11 03:20 +1100
SubjectRe: Balanced trees
Message-ID<mailman.8006.1394468411.18130.python-list@python.org>
In reply to#68143
On Tue, Mar 11, 2014 at 12:59 AM, Roy Smith <roy@panix.com> wrote:
> The hash vs. tree argument can get very complicated.  For example, if
> your tree is not completely resident in memory, the cost of paging in a
> node will swamp everything else, and improving lookup speed will boil
> down to reducing the number of I/O operations you need.  An expensive
> hash plus a linear walk through a collision chain which was resident in
> a single memory block will beat traversing two nodes, each of which had
> to be paged in separately.

Indeed, which is broadly an extension of the "cache locality" argument.

I've never actually yearned for any of the advanced operations a tree
can give (over a hash table). Usually, by the time I'm looking for
that sort of thing, I really want an on-disk database - that solves
all the problems of paging (the DBMS will make sure it reads in a
minimum of index and data pages), and a good SQL database can handle
multiple indexes in a space-efficient way. Plus, what you said about
log 1,000,000? By the time you're looking at a million records, you
probably (a) need to have them on disk somewhere anyway, and (b) don't
want to read them all into RAM before you begin.

ChrisA

[toc] | [prev] | [next] | [standalone]


#68157 — Re: Balanced trees

FromChris Angelico <rosuav@gmail.com>
Date2014-03-11 03:24 +1100
SubjectRe: Balanced trees
Message-ID<mailman.8007.1394468681.18130.python-list@python.org>
In reply to#68143
On Tue, Mar 11, 2014 at 12:59 AM, Roy Smith <roy@panix.com> wrote:
> Looking at the Songza source, I see we have one user-defined hash
> function:
>
>     def __hash__(self):
>         return hash((self.song,
>                      self.user,
>                      self.add_time,
>                      self.delete_time,
>                      self.play_first))
>
> where those things are 2 bson.ObjectId's, 2 datetimes, and a boolean.  I
> wouldn't be surprised if that falls into the 20x node traversal bucket.
> In this case, convenience was more important than speed, so it doesn't
> matter.

The only difference between a tree and a hash here is that the tree
might be able to short-cut the comparisons. But if there are a whole
bunch of songs with the same "song" and "user", then the tree has to
compare (song->song? same; user->user? same; add_time->add_time?
left/right) multiple times. So what you're saying is that the hash
table has consistency but the tree could do better or could do worse.
(Imagine, worst case, all one million records have the same
song/user/add_time and you need to do twenty comparisons involving
four fields. That's gotta be worse than one hashing of five fields.)

ChrisA

[toc] | [prev] | [next] | [standalone]


#68163 — Re: Balanced trees

FromMarko Rauhamaa <marko@pacujo.net>
Date2014-03-10 19:08 +0200
SubjectRe: Balanced trees
Message-ID<8761nmrnfk.fsf@elektro.pacujo.net>
In reply to#68157
Chris Angelico <rosuav@gmail.com>:

> The only difference between a tree and a hash here is that the tree
> might be able to short-cut the comparisons. But if there are a whole
> bunch of songs with the same "song" and "user", then the tree has to
> compare (song->song? same; user->user? same; add_time->add_time?
> left/right) multiple times. So what you're saying is that the hash
> table has consistency but the tree could do better or could do worse.
> (Imagine, worst case, all one million records have the same
> song/user/add_time and you need to do twenty comparisons involving
> four fields. That's gotta be worse than one hashing of five fields.)

You are right that in the end there probably is an equality test
involving all fields of the key. However, the intermediate comparisons
are often dealt with much more immediately. For example, to compare the
words "hello" and "world", you only need to look at the first letter.

Anyway, this whole debate is rather unnecessary since every developer is
supposed to have both weapons in their arsenal.


Marko

[toc] | [prev] | [next] | [standalone]


#68164 — Re: Balanced trees

FromChris Angelico <rosuav@gmail.com>
Date2014-03-11 04:17 +1100
SubjectRe: Balanced trees
Message-ID<mailman.8013.1394471845.18130.python-list@python.org>
In reply to#68163
On Tue, Mar 11, 2014 at 4:08 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Chris Angelico <rosuav@gmail.com>:
>
>> The only difference between a tree and a hash here is that the tree
>> might be able to short-cut the comparisons. But if there are a whole
>> bunch of songs with the same "song" and "user", then the tree has to
>> compare (song->song? same; user->user? same; add_time->add_time?
>> left/right) multiple times. So what you're saying is that the hash
>> table has consistency but the tree could do better or could do worse.
>> (Imagine, worst case, all one million records have the same
>> song/user/add_time and you need to do twenty comparisons involving
>> four fields. That's gotta be worse than one hashing of five fields.)
>
> You are right that in the end there probably is an equality test
> involving all fields of the key. However, the intermediate comparisons
> are often dealt with much more immediately. For example, to compare the
> words "hello" and "world", you only need to look at the first letter.

Right, which is what I meant about the consistency of a hash table.
You know, with a hash table, that you're going to need to calculate
the hash across everything. The tree lets you cheaply add
disambiguation fields, knowing they'll be used only if they're needed.

Oddly enough, the consistency in design is inverted by the
predictability under attack. If an attacker might be able to generate
hash collisions, the tree (assuming it's self-balancing) will have its
worst-case much closer to its average/typical cases, which means the
tree is more consistent. Strange how that goes, sometimes.

> Anyway, this whole debate is rather unnecessary since every developer is
> supposed to have both weapons in their arsenal.

Supposed to have? What does that mean, a language isn't ISO-compliant
unless it provides both? With a high level language like Python, using
the provided hash table will almost always cream any hand-built tree,
no matter how advantageous the data is to the tree.

ChrisA

[toc] | [prev] | [next] | [standalone]


#68166 — Re: Balanced trees

FromMarko Rauhamaa <marko@pacujo.net>
Date2014-03-10 19:34 +0200
SubjectRe: Balanced trees
Message-ID<87zjkyq7nr.fsf@elektro.pacujo.net>
In reply to#68164
Chris Angelico <rosuav@gmail.com>:

> Supposed to have? What does that mean, a language isn't ISO-compliant
> unless it provides both?

It's an ancient, fundamental data structure, right up there with dynamic
lists. There's no reason it shouldn't be available in every programming
environment.

> With a high level language like Python, using the provided hash table
> will almost always cream any hand-built tree, no matter how
> advantageous the data is to the tree.

The main thing is there are use cases where order is essential. That's
why I have had to implement the AVL tree in Python myself. No biggie,
but a C implementation would probably be much faster. Also, a standard
version would likely be reviewed and tested better and have all Pythonic
accessors in place.


Marko

[toc] | [prev] | [next] | [standalone]


#68339 — Re: Balanced trees

FromIan Kelly <ian.g.kelly@gmail.com>
Date2014-03-13 12:40 -0600
SubjectRe: Balanced trees
Message-ID<mailman.8133.1394736421.18130.python-list@python.org>
In reply to#68166
On Mon, Mar 10, 2014 at 11:34 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
> The main thing is there are use cases where order is essential. That's
> why I have had to implement the AVL tree in Python myself. No biggie,
> but a C implementation would probably be much faster. Also, a standard
> version would likely be reviewed and tested better and have all Pythonic
> accessors in place.

That last is actually quite easy to achieve, especially using the
built-in ABCs.  Just subclass from collections.MutableMapping and
implement __len__, __iter__, __getitem__, __setitem__ and __delitem__;
and default implementations of the rest will be mixed in for you.  Or
you can override those with optimized implementations.

[toc] | [prev] | [next] | [standalone]


#68344 — Re: Balanced trees

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2014-03-13 23:57 +0000
SubjectRe: Balanced trees
Message-ID<532245fa$0$29994$c3e8da3$5496439d@news.astraweb.com>
In reply to#68166
On Mon, 10 Mar 2014 19:34:48 +0200, Marko Rauhamaa wrote:

>> With a high level language like Python, using the provided hash table
>> will almost always cream any hand-built tree, no matter how
>> advantageous the data is to the tree.
> 
> The main thing is there are use cases where order is essential. That's
> why I have had to implement the AVL tree in Python myself.

from collections import OrderedDict

gives you a fast, ordered mapping. In this case, the order is that of 
insertion order. If you want sort order instead, for "small" amounts of 
data, say below a million or ten million items, you're likely to have 
acceptable if not superior results by just sorting them items when and as 
needed.

Otherwise, I expect that following the basic technique of OrderedDict, 
and building your data structure from a dict and an associated list, 
keeping the two in sync, will be faster than a pure Python implementation 
of an AVL tree. But of course only testing it will make that clear.

http://code.activestate.com/recipes/576693-ordered-dictionary-for-py24/

Modifying the above recipe to keep items in something other than 
insertion order is left as an exercise.




-- 
Steven D'Aprano
http://import-that.dreamwidth.org/

[toc] | [prev] | [next] | [standalone]


#68354 — Re: Balanced trees

FromDan Stromberg <drsalists@gmail.com>
Date2014-03-13 20:12 -0700
SubjectRe: Balanced trees
Message-ID<mailman.8142.1394766770.18130.python-list@python.org>
In reply to#68344
On Thu, Mar 13, 2014 at 4:57 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> On Mon, 10 Mar 2014 19:34:48 +0200, Marko Rauhamaa wrote:
>
>>> With a high level language like Python, using the provided hash table
>>> will almost always cream any hand-built tree, no matter how
>>> advantageous the data is to the tree.
>>
>> The main thing is there are use cases where order is essential. That's
>> why I have had to implement the AVL tree in Python myself.
>
> from collections import OrderedDict
>
> gives you a fast, ordered mapping. In this case, the order is that of
> insertion order. If you want sort order instead, for "small" amounts of
> data, say below a million or ten million items, you're likely to have
> acceptable if not superior results by just sorting them items when and as
> needed.

This is one of my pet things.

Sorting the same (slightly tweaked) data inside of a tight loop is
rarely a good idea - despite the fact that the "sort" itself tends to
be O(n) with Python's rather awesome builtin sorting algorithm.  This
is because sorting inside a loop tends to yield O(n^2) algorithms, or
even O((n^2)*logn).

But if you're sorting once at the end of whatever other processing,
sorting is great.  It's (still) O(nlogn), but it's got a terrific
constant.

[toc] | [prev] | [next] | [standalone]


#68356 — Re: Balanced trees

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2014-03-14 05:02 +0000
SubjectRe: Balanced trees
Message-ID<53228d5d$0$29994$c3e8da3$5496439d@news.astraweb.com>
In reply to#68354
On Thu, 13 Mar 2014 20:12:41 -0700, Dan Stromberg wrote:

> Sorting the same (slightly tweaked) data inside of a tight loop is
> rarely a good idea - despite the fact that the "sort" itself tends to be
> O(n) with Python's rather awesome builtin sorting algorithm.  This is
> because sorting inside a loop tends to yield O(n^2) algorithms, or even
> O((n^2)*logn).

I agree with you except for the word "rarely". It depends on how much 
data you have, and in my experience most people think that N = 100 is a 
lot :-)

Remember that Big Oh doesn't say anything about the constant of 
proportionality. If you have one function that behaves like O(N) but with 
a constant factor of 1000, and another function that behaves like 
O(N**2) with a constant factor of 0.001, the second, theoretically 
slower, function will actually be faster for N < one million.

So *in practice*, a solution that involves a theoretically worse Big Oh 
function but a significantly lower constant factor may turn out to be 
better for all the values you care about. Given the difference in speed 
between Python's built-ins, and the code you can write yourself in pure 
Python, it is normally better to push as much work as you can onto the 
built-ins.

And of course, we should not make assumptions as to what is faster and 
what is slower. After 15+ years, I'm still surprised about what ends up 
being faster in Python.


> But if you're sorting once at the end of whatever other processing,
> sorting is great.  It's (still) O(nlogn), but it's got a terrific
> constant.

Exactly!



-- 
Steven

[toc] | [prev] | [next] | [standalone]


#68179 — Re: Balanced trees

FromRoy Smith <roy@panix.com>
Date2014-03-10 19:24 -0400
SubjectRe: Balanced trees
Message-ID<roy-F30152.19240710032014@news.panix.com>
In reply to#68163
In article <8761nmrnfk.fsf@elektro.pacujo.net>,
 Marko Rauhamaa <marko@pacujo.net> wrote:

> Anyway, this whole debate is rather unnecessary since every developer is
> supposed to have both weapons in their arsenal.

The problem with having a choice is that it opens up the possibility of 
making the wrong one :-)

As this discussion has shown, figuring out whether a hash table or a 
tree is better for a given problem is non-trivial.  My guess is that if 
you gave 1000 typical developers both data structures and let them pick 
freely, the number of cases where it really mattered and the developer 
picked the right one would be approximately equal to the number of cases 
where they picked the wrong one.

[toc] | [prev] | [next] | [standalone]


#68180 — Re: Balanced trees

FromChris Angelico <rosuav@gmail.com>
Date2014-03-11 10:27 +1100
SubjectRe: Balanced trees
Message-ID<mailman.8025.1394494024.18130.python-list@python.org>
In reply to#68179
On Tue, Mar 11, 2014 at 10:24 AM, Roy Smith <roy@panix.com> wrote:
> As this discussion has shown, figuring out whether a hash table or a
> tree is better for a given problem is non-trivial.  My guess is that if
> you gave 1000 typical developers both data structures and let them pick
> freely, the number of cases where it really mattered and the developer
> picked the right one would be approximately equal to the number of cases
> where they picked the wrong one.

And both would be utterly dwarfed by the cases where it doesn't
matter, and everyone's paying a cost (having to choose) for no
benefit.

ChrisA

[toc] | [prev] | [next] | [standalone]


#68187 — Re: Balanced trees

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2014-03-11 01:26 +0000
SubjectRe: Balanced trees
Message-ID<531e6643$0$29994$c3e8da3$5496439d@news.astraweb.com>
In reply to#68179
On Mon, 10 Mar 2014 19:24:07 -0400, Roy Smith wrote:

> In article <8761nmrnfk.fsf@elektro.pacujo.net>,
>  Marko Rauhamaa <marko@pacujo.net> wrote:
> 
>> Anyway, this whole debate is rather unnecessary since every developer
>> is supposed to have both weapons in their arsenal.
> 
> The problem with having a choice is that it opens up the possibility of
> making the wrong one :-)
> 
> As this discussion has shown, figuring out whether a hash table or a
> tree is better for a given problem is non-trivial.  My guess is that if
> you gave 1000 typical developers both data structures and let them pick
> freely, the number of cases where it really mattered and the developer
> picked the right one would be approximately equal to the number of cases
> where they picked the wrong one.

You're very optimistic.

In my experience, the average developer has an amazing talent for 
pessimising code when they think they are optimising it.




-- 
Steven D'Aprano
http://import-that.dreamwidth.org/

[toc] | [prev] | [next] | [standalone]


#68189 — Re: Balanced trees

FromChris Angelico <rosuav@gmail.com>
Date2014-03-11 12:45 +1100
SubjectRe: Balanced trees
Message-ID<mailman.8032.1394502844.18130.python-list@python.org>
In reply to#68187
On Tue, Mar 11, 2014 at 12:26 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> In my experience, the average developer has an amazing talent for
> pessimising code when they think they are optimising it.

I remember a number of incidents from personal experience when I was a
*very* average developer. One time, writing C++ code, I looked at the
disassembly and decided the compiler was doing a terrible job. No no,
I could make this so much better by using the 80x86 "REP MOVSW"
command (or commands, depending on your point of view). That would be
so much better than all those separate operations the silly compiler
was doing! Roughly an hour of fiddling later, making sure it all still
worked correctly, I discover that... hmm, it's not actually any
faster. Turns out the 80x86 string opcodes are really inefficient;
they're short (a one-byte command that says "read a
byte/word/doubleword from DS:SI, write it to ES:DI, and increment or
decrement SI and DI"), but not fast. In my defense, I at least did
measure before-and-after, and learned that I should back out that
change :)

ChrisA

[toc] | [prev] | [next] | [standalone]


#68194 — Re: Balanced trees

FromIan Kelly <ian.g.kelly@gmail.com>
Date2014-03-10 21:38 -0600
SubjectRe: Balanced trees
Message-ID<mailman.8033.1394509126.18130.python-list@python.org>
In reply to#68187
On Mon, Mar 10, 2014 at 7:45 PM, Chris Angelico <rosuav@gmail.com> wrote:
> On Tue, Mar 11, 2014 at 12:26 PM, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:
>> In my experience, the average developer has an amazing talent for
>> pessimising code when they think they are optimising it.
>
> I remember a number of incidents from personal experience when I was a
> *very* average developer. One time, writing C++ code, I looked at the
> disassembly and decided the compiler was doing a terrible job. No no,
> I could make this so much better by using the 80x86 "REP MOVSW"
> command (or commands, depending on your point of view). That would be
> so much better than all those separate operations the silly compiler
> was doing! Roughly an hour of fiddling later, making sure it all still
> worked correctly, I discover that... hmm, it's not actually any
> faster. Turns out the 80x86 string opcodes are really inefficient;
> they're short (a one-byte command that says "read a
> byte/word/doubleword from DS:SI, write it to ES:DI, and increment or
> decrement SI and DI"), but not fast. In my defense, I at least did
> measure before-and-after, and learned that I should back out that
> change :)

Better to have tried and failed though than to have simply accepted
what the compiler was doing with no verification at all.

[toc] | [prev] | [next] | [standalone]


#68195 — Re: Balanced trees

FromChris Angelico <rosuav@gmail.com>
Date2014-03-11 15:28 +1100
SubjectRe: Balanced trees
Message-ID<mailman.8034.1394512113.18130.python-list@python.org>
In reply to#68187
On Tue, Mar 11, 2014 at 2:38 PM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> On Mon, Mar 10, 2014 at 7:45 PM, Chris Angelico <rosuav@gmail.com> wrote:
>> No no,
>> I could make this so much better by using the 80x86 "REP MOVSW"
>> command (or commands, depending on your point of view). That would be
>> so much better than all those separate operations the silly compiler
>> was doing! Roughly an hour of fiddling later, making sure it all still
>> worked correctly, I discover that... hmm, it's not actually any
>> faster.
>
> Better to have tried and failed though than to have simply accepted
> what the compiler was doing with no verification at all.

Maybe. But I've learned now that one guy who used to do assembly
language programming on an 8086 is unlikely to discover something that
the many authors of a C compiler haven't noticed. Yes, it's possible
there'll be something specific to my code, like if I'm doing a
strcpy-like operation that isn't *actually* strcpy (the function will
be optimized heavily, but a C-level loop might not be recognized), but
it's more likely the compiler knows better than I do.

That, by the way, was before I realized that *interpreter* writers are
more expert than I am, too, and therefore that I can trust a
heavily-optimized high level language to run my code faster than I
could write equivalent C.

ChrisA

[toc] | [prev] | [next] | [standalone]


#68254 — Re: Balanced trees

From"Rhodri James" <rhodri@wildebst.org.uk>
Date2014-03-12 00:57 +0000
SubjectRe: Balanced trees
Message-ID<op.xck3mxcw5079vu@gnudebeest>
In reply to#68195
On Tue, 11 Mar 2014 04:28:25 -0000, Chris Angelico <rosuav@gmail.com>  
wrote:

> On Tue, Mar 11, 2014 at 2:38 PM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
>> On Mon, Mar 10, 2014 at 7:45 PM, Chris Angelico <rosuav@gmail.com>  
>> wrote:
>>> No no,
>>> I could make this so much better by using the 80x86 "REP MOVSW"
>>> command (or commands, depending on your point of view). That would be
>>> so much better than all those separate operations the silly compiler
>>> was doing! Roughly an hour of fiddling later, making sure it all still
>>> worked correctly, I discover that... hmm, it's not actually any
>>> faster.
>>
>> Better to have tried and failed though than to have simply accepted
>> what the compiler was doing with no verification at all.
>
> Maybe. But I've learned now that one guy who used to do assembly
> language programming on an 8086 is unlikely to discover something that
> the many authors of a C compiler haven't noticed. Yes, it's possible
> there'll be something specific to my code, like if I'm doing a
> strcpy-like operation that isn't *actually* strcpy (the function will
> be optimized heavily, but a C-level loop might not be recognized), but
> it's more likely the compiler knows better than I do.

That might be true for x86, but it isn't true for ARM for example.   
Apparently it's algorithmically hard to generate ARM code that makes  
efficient use of conditional instructions, and I can almost always write  
tighter, faster code than the compiler generates in consequence.  It's not  
always worth the extra coding time (longer now that I'm not in practise),  
but that's a separate matter.


-- 
Rhodri James *-* Wildebeest Herder to the Masses

[toc] | [prev] | [next] | [standalone]


Page 4 of 9 — ← Prev page 1 2 3 [4] 5 6 7 8 9  Next page →

Back to top | Article view | comp.lang.python


csiph-web