Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #67151 > unrolled thread
| Started by | Eric Jacoboni <eric.jacoboni@gmail.com> |
|---|---|
| First post | 2014-02-27 17:01 +0100 |
| Last post | 2014-03-10 07:06 +1100 |
| Articles | 20 on this page of 161 — 33 participants |
Back to article view | Back to comp.lang.python
Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-02-27 17:01 +0100
Re: Tuples and immutability Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-27 10:13 -0600
Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-02-27 17:27 +0100
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-02-28 03:33 +1100
Re: Tuples and immutability Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-27 10:47 -0600
Re: Tuples and immutability Nick Timkovich <prometheus235@gmail.com> - 2014-02-27 15:47 -0600
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-02-28 08:52 +1100
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-27 15:18 -0700
Re: Tuples and immutability Rick Johnson <rantingrickjohnson@gmail.com> - 2014-03-11 19:01 -0700
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-12 13:10 +1100
Re: Tuples and immutability Terry Reedy <tjreedy@udel.edu> - 2014-03-11 23:25 -0400
Re: Tuples and immutability Steven D'Aprano <steve@pearwood.info> - 2014-03-12 06:28 +0000
Re: Tuples and immutability Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-12 10:39 +0100
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-12 03:40 -0600
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-12 03:51 -0600
Re: Tuples and immutability Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-12 12:21 +0100
Re: Tuples and immutability Ethan Furman <ethan@stoneleaf.us> - 2014-03-11 23:32 -0700
Re: Tuples and immutability Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2014-03-12 10:20 +0000
Re: Tuples and immutability Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2014-03-01 18:55 +0000
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-02-28 03:14 +1100
Re: Tuples and immutability Marko Rauhamaa <marko@pacujo.net> - 2014-02-27 18:19 +0200
Re: Tuples and immutability John O'Hagan <research@johnohagan.com> - 2014-02-28 16:17 +1100
Re: Tuples and immutability Marko Rauhamaa <marko@pacujo.net> - 2014-02-28 09:54 +0200
Re: Tuples and immutability Joshua Landau <joshua@landau.ws> - 2014-02-28 14:41 +0000
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-01 01:43 +1100
Re: Tuples and immutability Duncan Booth <duncan.booth@invalid.invalid> - 2014-03-07 09:33 +0000
Re: Tuples and immutability Ben Finney <ben+python@benfinney.id.au> - 2014-03-07 22:04 +1100
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-07 22:11 +1100
Re: Tuples and immutability Peter Otten <__peter__@web.de> - 2014-03-07 12:38 +0100
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-07 22:45 +1100
Re: Tuples and immutability Alister <alister.ware@ntlworld.com> - 2014-03-07 11:51 +0000
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-07 11:23 -0700
Re: Tuples and immutability Roy Smith <roy@panix.com> - 2014-03-07 08:37 -0500
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-08 15:17 +1300
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-07 21:17 -0700
Balanced trees (was: Re: Tuples and immutability) Marko Rauhamaa <marko@pacujo.net> - 2014-03-08 10:34 +0200
Re: Balanced trees (was: Re: Tuples and immutability) Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-08 04:03 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-08 13:26 +0200
Re: Balanced trees (was: Re: Tuples and immutability) Dan Stromberg <drsalists@gmail.com> - 2014-03-08 11:58 -0800
Re: Balanced trees Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-08 20:37 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-08 23:21 +0200
Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-08 17:22 -0500
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 11:17 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-08 17:31 -0800
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 11:27 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-09 14:20 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 23:32 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-09 14:37 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 23:43 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-09 15:08 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 00:24 +0200
Re: Balanced trees Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-09 18:04 -0600
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 03:24 +0000
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 03:24 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 08:16 +0200
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 08:53 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 11:41 +0200
Re: Balanced trees Ned Batchelder <ned@nedbatchelder.com> - 2014-03-10 06:57 -0400
Re: Balanced trees Rustom Mody <rustompmody@gmail.com> - 2014-03-10 09:01 -0700
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-11 02:02 +0000
Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 22:20 -0400
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 13:29 +1100
Re: Balanced trees Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-10 09:51 +0000
Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 09:59 -0400
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 03:20 +1100
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 03:24 +1100
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 19:08 +0200
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 04:17 +1100
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 19:34 +0200
Re: Balanced trees Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-13 12:40 -0600
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-13 23:57 +0000
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-13 20:12 -0700
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-14 05:02 +0000
Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 19:24 -0400
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 10:27 +1100
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-11 01:26 +0000
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 12:45 +1100
Re: Balanced trees Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-10 21:38 -0600
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 15:28 +1100
Re: Balanced trees "Rhodri James" <rhodri@wildebst.org.uk> - 2014-03-12 00:57 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-11 12:12 +0200
Re: Balanced trees alex23 <wuwei23@gmail.com> - 2014-03-12 10:13 +1000
Re: Balanced trees Alister <alister.ware@ntlworld.com> - 2014-03-11 09:25 +0000
Re: Balanced trees Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-12 10:08 +0100
Re: Balanced trees Tim Chase <python.list@tim.thechases.com> - 2014-03-10 11:33 -0500
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 03:39 +1100
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-10 18:05 -0700
Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 22:13 -0400
Re: Balanced trees Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2014-03-10 19:57 -0400
Re: Balanced trees Joshua Landau <joshua@landau.ws> - 2014-03-15 01:13 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-18 00:05 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-18 12:26 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-18 22:55 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-18 14:45 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 00:03 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-18 15:21 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 01:11 +0200
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-19 01:15 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 10:49 +0200
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-19 13:42 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 15:54 +0200
Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-19 10:06 -0400
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-19 01:15 +0000
Re: Balanced trees Ethan Furman <ethan@stoneleaf.us> - 2014-03-19 08:15 -0700
Re: Balanced trees "Rhodri James" <rhodri@wildebst.org.uk> - 2014-03-20 02:16 +0000
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-19 19:34 -0700
Re: Balanced trees Chris Kaynor <ckaynor@zindagigames.com> - 2014-03-18 18:02 -0700
Re: Balanced trees Daniel Stutzbach <stutzbach@google.com> - 2014-03-18 13:18 -0700
blist in standard library (was Re: Balanced trees) Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-15 12:31 +0000
Re: Balanced trees Daniel Stutzbach <stutzbach@google.com> - 2014-03-17 14:16 -0700
Re: Balanced trees Joshua Landau <joshua@landau.ws> - 2014-03-18 00:08 +0000
Re: Balanced trees Daniel Stutzbach <stutzbach@google.com> - 2014-03-17 18:01 -0700
Re: Balanced trees Joshua Landau <joshua@landau.ws> - 2014-03-18 07:46 +0000
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-09 13:40 +1300
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-08 19:39 -0700
Re: Tuples and immutability Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 11:35 +0200
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-10 11:03 +1300
Re: Tuples and immutability Terry Reedy <tjreedy@udel.edu> - 2014-03-09 19:00 -0400
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-09 17:42 -0600
Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 02:37 +0000
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-10 02:35 -0600
Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 09:13 +0000
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-11 18:15 +1300
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-11 18:03 +1300
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-11 04:39 -0600
Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-11 16:46 +0000
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-12 10:23 +1300
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-11 17:06 -0600
Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-12 23:20 +0000
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-12 19:35 -0600
Re: Tuples and immutability Terry Reedy <tjreedy@udel.edu> - 2014-03-12 22:09 -0400
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-09 13:45 +1300
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-08 19:55 -0700
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 16:22 -0800
Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-01 02:27 +0100
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 20:45 -0800
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-28 22:34 -0700
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 21:50 -0800
Re: Tuples and immutability Ned Batchelder <ned@nedbatchelder.com> - 2014-03-01 12:56 -0500
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-28 22:26 -0700
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-01 12:39 +1100
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-28 22:16 -0700
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 22:16 -0800
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-01 17:29 +1100
Re: Tuples and immutability Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-01 14:54 +0000
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 13:01 -0800
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 22:25 -0800
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-01 12:45 -0700
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 13:21 -0800
Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-02 03:04 +0100
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 18:28 -0800
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-02 05:32 -0700
Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-02 14:38 +0100
Re: Tuples and immutability Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-02 14:05 +0000
Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-02 15:17 +0100
Re: Tuples and immutability "albert visser" <albert.visser@gmail.com> - 2014-03-02 15:37 +0100
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 18:15 -0800
Re: Tuples and immutability Joshua Landau <joshua@landau.ws> - 2014-03-09 17:54 +0000
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-10 05:13 +1100
Re: Tuples and immutability Joshua Landau <joshua@landau.ws> - 2014-03-09 19:57 +0000
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-10 07:06 +1100
Page 5 of 9 — ← Prev page 1 2 3 4 [5] 6 7 8 9 Next page →
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-03-11 12:12 +0200 |
| Subject | Re: Balanced trees |
| Message-ID | <87wqg1oxh4.fsf@elektro.pacujo.net> |
| In reply to | #68187 |
Steven D'Aprano <steve+comp.lang.python@pearwood.info>: > On Mon, 10 Mar 2014 19:24:07 -0400, Roy Smith wrote: > >> In article <8761nmrnfk.fsf@elektro.pacujo.net>, >> Marko Rauhamaa <marko@pacujo.net> wrote: >> >>> Anyway, this whole debate is rather unnecessary since every developer >>> is supposed to have both weapons in their arsenal. >> >> The problem with having a choice is that it opens up the possibility of >> making the wrong one :-) > > [...] > > In my experience, the average developer has an amazing talent for > pessimising code when they think they are optimising it. Java is the straitjacket that prevents the "average developer" from shooting himself in the foot. Python should let skilled professionals do their work. Thankfully, for the most part, it does. Marko
[toc] | [prev] | [next] | [standalone]
| From | alex23 <wuwei23@gmail.com> |
|---|---|
| Date | 2014-03-12 10:13 +1000 |
| Subject | Re: Balanced trees |
| Message-ID | <lfo8qf$9sm$1@dont-email.me> |
| In reply to | #68206 |
On 11/03/2014 8:12 PM, Marko Rauhamaa wrote: > Python should let skilled professionals do their work. Thankfully, for > the most part, it does. Skilled professionals don't solely rely on the standard library, either. If you know you need a balanced tree, you'll also know where to find an implementation of one.
[toc] | [prev] | [next] | [standalone]
| From | Alister <alister.ware@ntlworld.com> |
|---|---|
| Date | 2014-03-11 09:25 +0000 |
| Subject | Re: Balanced trees |
| Message-ID | <ODATu.44616$a74.40039@fx20.am4> |
| In reply to | #68179 |
On Mon, 10 Mar 2014 19:24:07 -0400, Roy Smith wrote: > In article <8761nmrnfk.fsf@elektro.pacujo.net>, > Marko Rauhamaa <marko@pacujo.net> wrote: > >> Anyway, this whole debate is rather unnecessary since every developer >> is supposed to have both weapons in their arsenal. > > The problem with having a choice is that it opens up the possibility of > making the wrong one :-) > > As this discussion has shown, figuring out whether a hash table or a > tree is better for a given problem is non-trivial. My guess is that if > you gave 1000 typical developers both data structures and let them pick > freely, the number of cases where it really mattered and the developer > picked the right one would be approximately equal to the number of cases > where they picked the wrong one. Perhaps you are not familiar with the 50/50/90 rule? Given any 50/50 choice you are 90% certain to pick the wrong one :-) -- I hold it, that a little rebellion, now and then, is a good thing... -- Thomas Jefferson
[toc] | [prev] | [next] | [standalone]
| From | Antoon Pardon <antoon.pardon@rece.vub.ac.be> |
|---|---|
| Date | 2014-03-12 10:08 +0100 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8075.1394615335.18130.python-list@python.org> |
| In reply to | #68179 |
Op 11-03-14 00:24, Roy Smith schreef: > In article <8761nmrnfk.fsf@elektro.pacujo.net>, > Marko Rauhamaa <marko@pacujo.net> wrote: > >> Anyway, this whole debate is rather unnecessary since every developer is >> supposed to have both weapons in their arsenal. > The problem with having a choice is that it opens up the possibility of > making the wrong one :-) This is just a standard defense for the status quo. Introducing the decimal module also added a choice. > As this discussion has shown, figuring out whether a hash table or a > tree is better for a given problem is non-trivial. My guess is that if > you gave 1000 typical developers both data structures and let them pick > freely, the number of cases where it really mattered and the developer > picked the right one would be approximately equal to the number of cases > where they picked the wrong one. You are only illustrating one part. How about all those cases now where the wrong choice is more or less forced on the developer for lack of the alternative? -- Antoon Pardon
[toc] | [prev] | [next] | [standalone]
| From | Tim Chase <python.list@tim.thechases.com> |
|---|---|
| Date | 2014-03-10 11:33 -0500 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8009.1394469217.18130.python-list@python.org> |
| In reply to | #68143 |
On 2014-03-11 03:24, Chris Angelico wrote: > Imagine, worst case, all one million records have the same > song/user/add_time and you need to do twenty comparisons involving > four fields. That's gotta be worse than one hashing of five fields. And if you have one million songs that are indistinguishable, you have Pop Top-40 playlists. And that is *definitely* worse than hashes OR trees. :-) -tkc
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-11 03:39 +1100 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8011.1394469568.18130.python-list@python.org> |
| In reply to | #68143 |
On Tue, Mar 11, 2014 at 3:33 AM, Tim Chase <python.list@tim.thechases.com> wrote: > On 2014-03-11 03:24, Chris Angelico wrote: >> Imagine, worst case, all one million records have the same >> song/user/add_time and you need to do twenty comparisons involving >> four fields. That's gotta be worse than one hashing of five fields. > > And if you have one million songs that are indistinguishable, you > have Pop Top-40 playlists. And that is *definitely* worse than > hashes OR trees. :-) LOL! That's not a Comp Sci problem, I'm afraid. But there is a solution: https://www.youtube.com/watch?v=jCZzvZEmq_A https://www.youtube.com/watch?v=DEr_5C6JYB4 A different sort of music. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Dan Stromberg <drsalists@gmail.com> |
|---|---|
| Date | 2014-03-10 18:05 -0700 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8031.1394499924.18130.python-list@python.org> |
| In reply to | #68143 |
On Mon, Mar 10, 2014 at 6:59 AM, Roy Smith <roy@panix.com> wrote: > On the other hand, log n, for n = 1 million, is just 20. It's not hard > to imagine a hash function which costs 20x what a node traversal does, > in which case, the log n lookup is ahead for all n < 1 million. FWIW, both the hash table and the tree will have constants. So a tree would be c*20 in its most significant term, and the hash table would be d*1 in its. The real-world performance depends quite a bit on those constants at small values of n. I don't really consider a million all that big, but the meaning of "big" of course depends.
[toc] | [prev] | [next] | [standalone]
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2014-03-10 22:13 -0400 |
| Subject | Re: Balanced trees |
| Message-ID | <roy-19EAAB.22135410032014@news.panix.com> |
| In reply to | #68186 |
In article <mailman.8031.1394499924.18130.python-list@python.org>, Dan Stromberg <drsalists@gmail.com> wrote: > On Mon, Mar 10, 2014 at 6:59 AM, Roy Smith <roy@panix.com> wrote: > > On the other hand, log n, for n = 1 million, is just 20. It's not hard > > to imagine a hash function which costs 20x what a node traversal does, > > in which case, the log n lookup is ahead for all n < 1 million. > > FWIW, both the hash table and the tree will have constants. So a tree > would be c*20 in its most significant term, and the hash table would > be d*1 in its. The real-world performance depends quite a bit on > those constants at small values of n. I don't really consider a > million all that big, but the meaning of "big" of course depends. Well, the largest disk volume I can configure in AWS is 1 TB. So, I guess we can take that to be "big". Assuming 1-character strings, and no overhead (both strange assumptions, but it makes the math easier), that's 10^12 nodes in our binary tree. That's still only 40 layers deep. Log n is your friend.
[toc] | [prev] | [next] | [standalone]
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2014-03-10 19:57 -0400 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8028.1394495825.18130.python-list@python.org> |
| In reply to | #68117 |
On 10 Mar 2014 03:24:08 GMT, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> declaimed the following:
>
>Proof: I create a hash table that accepts unsigned bytes as keys, where
>the hash is the value of the byte. So byte 0x42 hashes to 0x42, or
>decimal 66. I give the hash table 256 slots, numbered from 0 to 255.
>Every hash takes constant time, there are no collisions at all, lookups,
>insertions and deletions all take constant time regardless of how full
>the table is. The best, worst and average cases are not just O(1) but
>also ?(1).
>
What you've defined, however, can be replaced with straight list
indexing <G> No hash function needed.
If there is a finite (and small) number of possible keys, the list will
likely beat the hash.
OTOH, when the keys exceed reasonable list storage, the hash becomes
useful -- but one now has to be concerned with collisions.
Back in the day, I was somewhat pleased to discover that the Amiga
directory structure was a real world example of something I'd only seen as
a class assignment in my data structures class: a hashed-head
multiple-linked list.
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/
[toc] | [prev] | [next] | [standalone]
| From | Joshua Landau <joshua@landau.ws> |
|---|---|
| Date | 2014-03-15 01:13 +0000 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8154.1394846042.18130.python-list@python.org> |
| In reply to | #68026 |
On 8 March 2014 20:37, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
> I've found this link useful http://kmike.ru/python-data-structures/
>
> I also don't want all sorts of data structures added to the Python library.
> I believe that there are advantages to leaving specialist data structures on
> pypi or other sites, plus it means Python in a Nutshell can still fit in
> your pocket and not a 40 ton articulated lorry, unlike the Java equivalent.
The thing we really need is for the blist containers to become stdlib
(but not to replace the current list implementation). The rejected PEP
(http://legacy.python.org/dev/peps/pep-3128/) misses a few important
points, largely in how the "log(n)" has a really large base:
random.choice went from 1.2µs to 1.6µs from n=1 to n=10⁸, vs 1.2µs for
a standard list.
Further, it's worth considering a few advantages:
* copy is O(1), allowing code to avoid mutation by just copying its
input, which is good practice.
* FIFO is effectively O(1), as the time just about doubles from n=1 to
n=10⁸ so will never actually branch that much. There is still a speed
benefit of collections.deque, but it's much, much less significant.
This is very useful when considering usage as a multi-purpose data
structure, and removes demand for explicit linked lists (which have
foolishly been reimplemented loads of times).
* It reduces demand for trees:
* There are efficient implementations of sortedlist, sortedset and
sorteddict.
* Slicing, slice assignment and slice deletion are really fast.
* Addition of lists is sublinear. Instead of
"list(itertools.chain(...))", one can add in a loop and end up
*faster*.
I think blist isn't very popular not because it isn't really good, but
because it isn't a specialised structure. It is, however, almost there
for almost every circumstance. This can help keep the standard library
clean, especially of tree data structures.
Here's what we kill:
* Linked lists and doubly-linked lists, which are scarily popular for
whatever reason. Sometimes people claim that collections.deque isn't
powerful enough for whatever they want, and blist will almost
definitely sate those cases.
* Balanced trees, with blist.sortedlist. This is actually needed right now.
* Poor performance in the cases where a lot of list merging and pruning happens.
* Most uses of bisect.
* Some instances where two data structures are used in parallel in
order to keep performance fast on disparate operations (like `x in y`
and `y[i]`).
Now, I understand there are downsides to blist. Particularly, I've
looked through the "benchmarks" and they seem untruthful. Further,
we'd need a maintainer. Finally, nobody jumps at blists because
they're rarely the obvious solution. Rather, they attempt to be a
different general solution. Hopefully, though, a stdlib inclusion
could make them a lot more obvious, and support in some current
libraries could make them feel more at home.
I don't know whether this is a good idea, but I do feel that it is
more promising and general than having a graph in the standard
library.
[toc] | [prev] | [next] | [standalone]
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-03-18 00:05 +0200 |
| Subject | Re: Balanced trees |
| Message-ID | <87mwgoqy4k.fsf@elektro.pacujo.net> |
| In reply to | #68364 |
Joshua Landau <joshua@landau.ws>:
> The thing we really need is for the blist containers to become stdlib
> (but not to replace the current list implementation).
Very interesting. Downloaded blist but didn't compile it yet. It *could*
be the missing link.
I would *love* to see some comparative performance results between
blist.sorteddict and an AVL tree. A B+ tree could be nicer to the CPU
cache, but then again, the cache line is only a few pointers wide and
there appears to be a lot of shoving stuff left and right -- based on a
10-second "analysis" of the code:
while (src >= stop)
*dst-- = *src--;
Personally, I find it a bit odd to place lists at the center of the
proposal and have trees come out as a side effect. I would rather see it
the other way round: sorteddict -> sortedset et al.
> * It reduces demand for trees:
I would hope it would satisfy the demand for trees.
> nobody jumps at blists because they're rarely the obvious solution.
I haven't jumped at them because they were nowhere to be found on <URL:
http://docs.python.org/3/genindex-B.html>.
Marko
[toc] | [prev] | [next] | [standalone]
| From | Dan Stromberg <drsalists@gmail.com> |
|---|---|
| Date | 2014-03-18 12:26 -0700 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8257.1395170778.18130.python-list@python.org> |
| In reply to | #68464 |
On Mon, Mar 17, 2014 at 3:05 PM, Marko Rauhamaa <marko@pacujo.net> wrote: > Joshua Landau <joshua@landau.ws>: > >> The thing we really need is for the blist containers to become stdlib >> (but not to replace the current list implementation). > > Very interesting. Downloaded blist but didn't compile it yet. It *could* > be the missing link. > > I would *love* to see some comparative performance results between > blist.sorteddict and an AVL tree. I added blist.sorteddict and removed (temporarily) Pypy and Jython. The results are at http://stromberg.dnsalias.org/~strombrg/python-tree-and-heap-comparison/2014-03/ In short, blist.sorteddict didn't do that well, despite being in C. In the random workloads, blist.sorteddict was dead last. In the sequential workloads blist.sorteddict fell somewhere in the middle. I excluded Pypy and Jython because blist.sorteddict probably doesn't run on them. HTH
[toc] | [prev] | [next] | [standalone]
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-03-18 22:55 +0200 |
| Subject | Re: Balanced trees |
| Message-ID | <8738ifqlaw.fsf@elektro.pacujo.net> |
| In reply to | #68513 |
Dan Stromberg <drsalists@gmail.com>:
> The results are at
> http://stromberg.dnsalias.org/~strombrg/python-tree-and-heap-comparison/2014-03/
Unfortunately I'm having a hard time understanding the results.
The 50/50 get/set ratio is most interesting to me.
I'm seeing (under cpython-3.3):
Size: 1048576, duration: 75.3, dictionary type: dict
[...]
Size: 262144, duration: 66.1, dictionary type: AVL_tree
[...]
Size: 65536, duration: 77.3, dictionary type: blist.sorteddict
What does it mean?
Marko
[toc] | [prev] | [next] | [standalone]
| From | Dan Stromberg <drsalists@gmail.com> |
|---|---|
| Date | 2014-03-18 14:45 -0700 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8263.1395179161.18130.python-list@python.org> |
| In reply to | #68518 |
On Tue, Mar 18, 2014 at 1:55 PM, Marko Rauhamaa <marko@pacujo.net> wrote: > Dan Stromberg <drsalists@gmail.com>: > >> The results are at >> http://stromberg.dnsalias.org/~strombrg/python-tree-and-heap-comparison/2014-03/ > > Unfortunately I'm having a hard time understanding the results. > > The 50/50 get/set ratio is most interesting to me. > > I'm seeing (under cpython-3.3): > > > Size: 1048576, duration: 75.3, dictionary type: dict > [...] > Size: 262144, duration: 66.1, dictionary type: AVL_tree > [...] > Size: 65536, duration: 77.3, dictionary type: blist.sorteddict > > > What does it mean? dict was able to do 1048576 operations on a dictionary before taking more than 120 seconds to complete - it took 75.3 seconds to do 1048576 operations. AVL_tree was able to do 262144 operations on a dictionary before taking more than 120 seconds to complete - it took 66.1 seconds to do 262144 operations. blist.sorteddict was able to do 65536 operations on a dictionary before taking more than 120 seconds to complete - it took 77.3 seconds to do 65536 operations. For the 50/50 workload; the "operations" were half adding key, value pairs; and half lookups of values by keys we know are in the dictionary. I used to run all the dictionaries for as long as it took to do 4 million operations, but for (EG) unbalanced binary trees, that would take far too long in the ordered tests, so I changed the code to try a given tree type until the time for an operation became prohibitive. If you look at the graphs (I have to admit they've become a little cluttered), you can see the slower trees "escaping" rapidly (exceeding the 120 second threshold), while the better performing trees grow more slowly and are allowed to continue proving themselves longer. Inspecting these graphs may help in developing an intuition for how the tests were conducted. The code implementing this method of testing is in http://stromberg.dnsalias.org/svn/python-tree-and-heap-comparison/trunk/tester HTH
[toc] | [prev] | [next] | [standalone]
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-03-19 00:03 +0200 |
| Subject | Re: Balanced trees |
| Message-ID | <87wqfrp3jk.fsf@elektro.pacujo.net> |
| In reply to | #68520 |
Dan Stromberg <drsalists@gmail.com>:
> dict was able to do 1048576 operations on a dictionary before taking
> more than 120 seconds to complete - it took 75.3 seconds to do 1048576
> operations.
>
> AVL_tree was able to do 262144 operations on a dictionary before
> taking more than 120 seconds to complete - it took 66.1 seconds to do
> 262144 operations.
>
> blist.sorteddict was able to do 65536 operations on a dictionary
> before taking more than 120 seconds to complete - it took 77.3 seconds
> to do 65536 operations.
For a proper comparison, I'd like a fixed, identical dataset and set of
operations run against each data structure.
How about this test program:
generate random datasets of 100, 10000, 1000000 and 100000000 elements
generate random testset of 1000000 elements
for each data structure:
for each dataset:
initialize data structure with dataset
head, tail = testset[:100], testset[100:]
t0 = current timestamp
for each element in head:
add element to data structure
for each element in tail:
add element to data structure
append element to head
remove head.pop(0) from data structure
for each element in head:
remove element from data structure
t1 = current timestamp
report data structure type, dataset size, t1 - t0
Marko
[toc] | [prev] | [next] | [standalone]
| From | Dan Stromberg <drsalists@gmail.com> |
|---|---|
| Date | 2014-03-18 15:21 -0700 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8264.1395181297.18130.python-list@python.org> |
| In reply to | #68522 |
On Tue, Mar 18, 2014 at 3:03 PM, Marko Rauhamaa <marko@pacujo.net> wrote: > Dan Stromberg <drsalists@gmail.com>: > For a proper comparison, I'd like a fixed, identical dataset and set of > operations run against each data structure. > > How about this test program: I used to do essentially this, but it was time-prohibitive and produced harder-to-read graphs - harder to read because the enormous values of the bad trees were dwarfing the values of the good trees. Imagine doing 100000000 operation tests for the unbalanced binary tree. For a series of random keys, it would do quite well (probably second only to dict), but for a series of sequential keys it would take longer than anyone would reasonably want to wait because it's basically a storage-inefficient linked list. Rather than throw out unbalanced binary tree altogether, it makes more sense to run it until it gets "too slow". The workload+interpreter pairs are all tested the same way, it's just that the ones that are doing badly are thrown out before they're able to get a lot worse. Studying the graphs will likely help develop an intuition for what's happening.
[toc] | [prev] | [next] | [standalone]
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-03-19 01:11 +0200 |
| Subject | Re: Balanced trees |
| Message-ID | <87r45zf6fu.fsf@elektro.pacujo.net> |
| In reply to | #68523 |
Dan Stromberg <drsalists@gmail.com>: > On Tue, Mar 18, 2014 at 3:03 PM, Marko Rauhamaa <marko@pacujo.net> wrote: >> Dan Stromberg <drsalists@gmail.com>: >> For a proper comparison, I'd like a fixed, identical dataset and set >> of operations run against each data structure. >> >> How about this test program: > > I used to do essentially this, but it was time-prohibitive and > produced harder-to-read graphs - harder to read because the enormous > values of the bad trees were dwarfing the values of the good trees. > > Imagine doing 100000000 operation tests for the unbalanced binary > tree. For a series of random keys, it would do quite well (probably > second only to dict), but for a series of sequential keys it would > take longer than anyone would reasonably want to wait because it's > basically a storage-inefficient linked list. > > Rather than throw out unbalanced binary tree altogether, it makes more > sense to run it until it gets "too slow". I disagree strongly. You should throw out unbalanced binary trees and linked lists and the like and concentrate on solid contenders. But it's your test. You do as you like. Anyway, even a well-thought-out test is subject to all kinds of criticisms due to the CPU architecture, the distribution of the key values, the quality of the data structure implementation etc etc. Marko
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-03-19 01:15 +0000 |
| Subject | Re: Balanced trees |
| Message-ID | <5328efab$0$29994$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #68525 |
On Wed, 19 Mar 2014 01:11:33 +0200, Marko Rauhamaa wrote: > Dan Stromberg <drsalists@gmail.com>: >> Rather than throw out unbalanced binary tree altogether, it makes more >> sense to run it until it gets "too slow". > > I disagree strongly. You should throw out unbalanced binary trees and > linked lists and the like and concentrate on solid contenders. If you are in a position to randomize the data before storing it in the tree, an unbalanced binary tree is a solid contender. The overhead will likely be much less than any balanced tree, and the probability of degenerate behaviour negligible for any amount of data big enough to really matter. -- Steven
[toc] | [prev] | [next] | [standalone]
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-03-19 10:49 +0200 |
| Subject | Re: Balanced trees |
| Message-ID | <87pplipo82.fsf@elektro.pacujo.net> |
| In reply to | #68530 |
Steven D'Aprano <steve+comp.lang.python@pearwood.info>: > If you are in a position to randomize the data before storing it in > the tree, an unbalanced binary tree is a solid contender. Applications that can assume randomly distributed data are exceedingly rare and even then, you can't easily discount the possibility of worst-case ordering. In fact, Dan himself said unbalanced trees performed so badly he couldn't test them satisfactorily. Marko
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-03-19 13:42 +0000 |
| Subject | Re: Balanced trees |
| Message-ID | <53299eac$0$29994$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #68539 |
On Wed, 19 Mar 2014 10:49:33 +0200, Marko Rauhamaa wrote: > Steven D'Aprano <steve+comp.lang.python@pearwood.info>: > >> If you are in a position to randomize the data before storing it in the >> tree, an unbalanced binary tree is a solid contender. > > Applications that can assume randomly distributed data are exceedingly > rare Please re-read what I wrote. I didn't say "if your data comes to you fully randomized". I said "if you are in a position to randomize the data before storing it". In other words, if you actively randomize the data yourself. > and even then, you can't easily discount the possibility of > worst-case ordering. Of course you can. If you have a million items, then the odds that those million items happen to be sorted (the worst-case order) are 1 in a million factorial. That's a rather small number, small enough that we can discount it from ever happening, not in a million lifetimes of the Universe. Of course, you would be right to point out that one would also like to avoid *nearly* worst-case ordering. Nevertheless, there are Vastly more ways that the data will be sufficiently randomized as to avoid degenerate performance than the worst-case poor performance. > In fact, Dan himself said unbalanced trees performed so badly he > couldn't test them satisfactorily. You're misrepresenting what Dan said. What he actually said was that unbalanced trees perform second only to dicts with random data, only with sorted data do they perform badly. His exact words were: "For a series of random keys, it would do quite well (probably second only to dict), but for a series of sequential keys it would take longer than anyone would reasonably want to wait" -- Steven D'Aprano http://import-that.dreamwidth.org/
[toc] | [prev] | [next] | [standalone]
Page 5 of 9 — ← Prev page 1 2 3 4 [5] 6 7 8 9 Next page →
Back to top | Article view | comp.lang.python
csiph-web