Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #67151 > unrolled thread
| Started by | Eric Jacoboni <eric.jacoboni@gmail.com> |
|---|---|
| First post | 2014-02-27 17:01 +0100 |
| Last post | 2014-03-10 07:06 +1100 |
| Articles | 20 on this page of 161 — 33 participants |
Back to article view | Back to comp.lang.python
Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-02-27 17:01 +0100
Re: Tuples and immutability Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-27 10:13 -0600
Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-02-27 17:27 +0100
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-02-28 03:33 +1100
Re: Tuples and immutability Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-27 10:47 -0600
Re: Tuples and immutability Nick Timkovich <prometheus235@gmail.com> - 2014-02-27 15:47 -0600
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-02-28 08:52 +1100
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-27 15:18 -0700
Re: Tuples and immutability Rick Johnson <rantingrickjohnson@gmail.com> - 2014-03-11 19:01 -0700
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-12 13:10 +1100
Re: Tuples and immutability Terry Reedy <tjreedy@udel.edu> - 2014-03-11 23:25 -0400
Re: Tuples and immutability Steven D'Aprano <steve@pearwood.info> - 2014-03-12 06:28 +0000
Re: Tuples and immutability Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-12 10:39 +0100
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-12 03:40 -0600
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-12 03:51 -0600
Re: Tuples and immutability Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-12 12:21 +0100
Re: Tuples and immutability Ethan Furman <ethan@stoneleaf.us> - 2014-03-11 23:32 -0700
Re: Tuples and immutability Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2014-03-12 10:20 +0000
Re: Tuples and immutability Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2014-03-01 18:55 +0000
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-02-28 03:14 +1100
Re: Tuples and immutability Marko Rauhamaa <marko@pacujo.net> - 2014-02-27 18:19 +0200
Re: Tuples and immutability John O'Hagan <research@johnohagan.com> - 2014-02-28 16:17 +1100
Re: Tuples and immutability Marko Rauhamaa <marko@pacujo.net> - 2014-02-28 09:54 +0200
Re: Tuples and immutability Joshua Landau <joshua@landau.ws> - 2014-02-28 14:41 +0000
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-01 01:43 +1100
Re: Tuples and immutability Duncan Booth <duncan.booth@invalid.invalid> - 2014-03-07 09:33 +0000
Re: Tuples and immutability Ben Finney <ben+python@benfinney.id.au> - 2014-03-07 22:04 +1100
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-07 22:11 +1100
Re: Tuples and immutability Peter Otten <__peter__@web.de> - 2014-03-07 12:38 +0100
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-07 22:45 +1100
Re: Tuples and immutability Alister <alister.ware@ntlworld.com> - 2014-03-07 11:51 +0000
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-07 11:23 -0700
Re: Tuples and immutability Roy Smith <roy@panix.com> - 2014-03-07 08:37 -0500
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-08 15:17 +1300
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-07 21:17 -0700
Balanced trees (was: Re: Tuples and immutability) Marko Rauhamaa <marko@pacujo.net> - 2014-03-08 10:34 +0200
Re: Balanced trees (was: Re: Tuples and immutability) Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-08 04:03 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-08 13:26 +0200
Re: Balanced trees (was: Re: Tuples and immutability) Dan Stromberg <drsalists@gmail.com> - 2014-03-08 11:58 -0800
Re: Balanced trees Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-08 20:37 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-08 23:21 +0200
Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-08 17:22 -0500
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 11:17 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-08 17:31 -0800
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 11:27 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-09 14:20 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 23:32 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-09 14:37 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 23:43 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-09 15:08 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 00:24 +0200
Re: Balanced trees Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-09 18:04 -0600
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 03:24 +0000
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 03:24 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 08:16 +0200
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 08:53 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 11:41 +0200
Re: Balanced trees Ned Batchelder <ned@nedbatchelder.com> - 2014-03-10 06:57 -0400
Re: Balanced trees Rustom Mody <rustompmody@gmail.com> - 2014-03-10 09:01 -0700
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-11 02:02 +0000
Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 22:20 -0400
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 13:29 +1100
Re: Balanced trees Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-10 09:51 +0000
Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 09:59 -0400
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 03:20 +1100
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 03:24 +1100
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 19:08 +0200
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 04:17 +1100
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-10 19:34 +0200
Re: Balanced trees Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-13 12:40 -0600
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-13 23:57 +0000
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-13 20:12 -0700
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-14 05:02 +0000
Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 19:24 -0400
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 10:27 +1100
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-11 01:26 +0000
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 12:45 +1100
Re: Balanced trees Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-10 21:38 -0600
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 15:28 +1100
Re: Balanced trees "Rhodri James" <rhodri@wildebst.org.uk> - 2014-03-12 00:57 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-11 12:12 +0200
Re: Balanced trees alex23 <wuwei23@gmail.com> - 2014-03-12 10:13 +1000
Re: Balanced trees Alister <alister.ware@ntlworld.com> - 2014-03-11 09:25 +0000
Re: Balanced trees Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2014-03-12 10:08 +0100
Re: Balanced trees Tim Chase <python.list@tim.thechases.com> - 2014-03-10 11:33 -0500
Re: Balanced trees Chris Angelico <rosuav@gmail.com> - 2014-03-11 03:39 +1100
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-10 18:05 -0700
Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-10 22:13 -0400
Re: Balanced trees Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2014-03-10 19:57 -0400
Re: Balanced trees Joshua Landau <joshua@landau.ws> - 2014-03-15 01:13 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-18 00:05 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-18 12:26 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-18 22:55 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-18 14:45 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 00:03 +0200
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-18 15:21 -0700
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 01:11 +0200
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-19 01:15 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 10:49 +0200
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-19 13:42 +0000
Re: Balanced trees Marko Rauhamaa <marko@pacujo.net> - 2014-03-19 15:54 +0200
Re: Balanced trees Roy Smith <roy@panix.com> - 2014-03-19 10:06 -0400
Re: Balanced trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-19 01:15 +0000
Re: Balanced trees Ethan Furman <ethan@stoneleaf.us> - 2014-03-19 08:15 -0700
Re: Balanced trees "Rhodri James" <rhodri@wildebst.org.uk> - 2014-03-20 02:16 +0000
Re: Balanced trees Dan Stromberg <drsalists@gmail.com> - 2014-03-19 19:34 -0700
Re: Balanced trees Chris Kaynor <ckaynor@zindagigames.com> - 2014-03-18 18:02 -0700
Re: Balanced trees Daniel Stutzbach <stutzbach@google.com> - 2014-03-18 13:18 -0700
blist in standard library (was Re: Balanced trees) Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-15 12:31 +0000
Re: Balanced trees Daniel Stutzbach <stutzbach@google.com> - 2014-03-17 14:16 -0700
Re: Balanced trees Joshua Landau <joshua@landau.ws> - 2014-03-18 00:08 +0000
Re: Balanced trees Daniel Stutzbach <stutzbach@google.com> - 2014-03-17 18:01 -0700
Re: Balanced trees Joshua Landau <joshua@landau.ws> - 2014-03-18 07:46 +0000
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-09 13:40 +1300
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-08 19:39 -0700
Re: Tuples and immutability Marko Rauhamaa <marko@pacujo.net> - 2014-03-09 11:35 +0200
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-10 11:03 +1300
Re: Tuples and immutability Terry Reedy <tjreedy@udel.edu> - 2014-03-09 19:00 -0400
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-09 17:42 -0600
Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 02:37 +0000
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-10 02:35 -0600
Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-10 09:13 +0000
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-11 18:15 +1300
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-11 18:03 +1300
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-11 04:39 -0600
Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-11 16:46 +0000
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-12 10:23 +1300
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-11 17:06 -0600
Re: Tuples and immutability Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-12 23:20 +0000
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-12 19:35 -0600
Re: Tuples and immutability Terry Reedy <tjreedy@udel.edu> - 2014-03-12 22:09 -0400
Re: Tuples and immutability Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-03-09 13:45 +1300
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-08 19:55 -0700
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 16:22 -0800
Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-01 02:27 +0100
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 20:45 -0800
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-28 22:34 -0700
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 21:50 -0800
Re: Tuples and immutability Ned Batchelder <ned@nedbatchelder.com> - 2014-03-01 12:56 -0500
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-28 22:26 -0700
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-01 12:39 +1100
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-02-28 22:16 -0700
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 22:16 -0800
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-01 17:29 +1100
Re: Tuples and immutability Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-01 14:54 +0000
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 13:01 -0800
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-02-28 22:25 -0800
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-01 12:45 -0700
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 13:21 -0800
Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-02 03:04 +0100
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 18:28 -0800
Re: Tuples and immutability Ian Kelly <ian.g.kelly@gmail.com> - 2014-03-02 05:32 -0700
Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-02 14:38 +0100
Re: Tuples and immutability Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-02 14:05 +0000
Re: Tuples and immutability Eric Jacoboni <eric.jacoboni@gmail.com> - 2014-03-02 15:17 +0100
Re: Tuples and immutability "albert visser" <albert.visser@gmail.com> - 2014-03-02 15:37 +0100
Re: Tuples and immutability "Mark H. Harris" <harrismh777@gmail.com> - 2014-03-01 18:15 -0800
Re: Tuples and immutability Joshua Landau <joshua@landau.ws> - 2014-03-09 17:54 +0000
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-10 05:13 +1100
Re: Tuples and immutability Joshua Landau <joshua@landau.ws> - 2014-03-09 19:57 +0000
Re: Tuples and immutability Chris Angelico <rosuav@gmail.com> - 2014-03-10 07:06 +1100
Page 4 of 9 — ← Prev page 1 2 3 [4] 5 6 7 8 9 Next page →
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2014-03-10 22:20 -0400 |
| Subject | Re: Balanced trees |
| Message-ID | <roy-87AEA3.22201010032014@news.panix.com> |
| In reply to | #68190 |
In article <531e6eca$0$29994$c3e8da3$5496439d@news.astraweb.com>, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote: > There's only so much matter in the > universe, so talking about limits as the amount of data approaches > infinity is nonsense. Where would you store it? Funny you should ask that. I just finished watching https://en.wikipedia.org/wiki/11001001 a few minutes ago.
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-11 13:29 +1100 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8035.1394517383.18130.python-list@python.org> |
| In reply to | #68192 |
On Tue, Mar 11, 2014 at 1:20 PM, Roy Smith <roy@panix.com> wrote: > In article <531e6eca$0$29994$c3e8da3$5496439d@news.astraweb.com>, > Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote: > >> There's only so much matter in the >> universe, so talking about limits as the amount of data approaches >> infinity is nonsense. Where would you store it? > > Funny you should ask that. I just finished watching > https://en.wikipedia.org/wiki/11001001 a few minutes ago. Multiple universes and/or sub-atomic monkeys. http://tools.ietf.org/html/rfc2795 ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2014-03-10 09:51 +0000 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.7989.1394445097.18130.python-list@python.org> |
| In reply to | #68131 |
On 10/03/2014 08:53, Steven D'Aprano wrote: > > In other words, I suspect you are trolling. > s/suspect/know/ , he didn't make captain of my dream team for nothing, you know :) -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com
[toc] | [prev] | [next] | [standalone]
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2014-03-10 09:59 -0400 |
| Subject | Re: Balanced trees |
| Message-ID | <roy-BE3C76.09590110032014@news.panix.com> |
| In reply to | #68121 |
In article <87eh2atw6s.fsf@elektro.pacujo.net>,
Marko Rauhamaa <marko@pacujo.net> wrote:
> Steven D'Aprano <steve+comp.lang.python@pearwood.info>:
>
> > Proof: I create a hash table that accepts unsigned bytes as keys, where
>
> The O(f(n)) notation has no meaning when n is limited.
>
> This thing is not just pedantry. The discussion was how a balanced tree
> fares in comparison with hash tables. A simplistic O(1) vs O(log n) was
> presented as a proof that balanced trees don't have a chance in practice
> or theory. I wasn't so easily convinced.
>
>
> Marko
On the other hand, log n, for n = 1 million, is just 20. It's not hard
to imagine a hash function which costs 20x what a node traversal does,
in which case, the log n lookup is ahead for all n < 1 million.
Looking at the Songza source, I see we have one user-defined hash
function:
def __hash__(self):
return hash((self.song,
self.user,
self.add_time,
self.delete_time,
self.play_first))
where those things are 2 bson.ObjectId's, 2 datetimes, and a boolean. I
wouldn't be surprised if that falls into the 20x node traversal bucket.
In this case, convenience was more important than speed, so it doesn't
matter.
The hash vs. tree argument can get very complicated. For example, if
your tree is not completely resident in memory, the cost of paging in a
node will swamp everything else, and improving lookup speed will boil
down to reducing the number of I/O operations you need. An expensive
hash plus a linear walk through a collision chain which was resident in
a single memory block will beat traversing two nodes, each of which had
to be paged in separately.
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-11 03:20 +1100 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8006.1394468411.18130.python-list@python.org> |
| In reply to | #68143 |
On Tue, Mar 11, 2014 at 12:59 AM, Roy Smith <roy@panix.com> wrote: > The hash vs. tree argument can get very complicated. For example, if > your tree is not completely resident in memory, the cost of paging in a > node will swamp everything else, and improving lookup speed will boil > down to reducing the number of I/O operations you need. An expensive > hash plus a linear walk through a collision chain which was resident in > a single memory block will beat traversing two nodes, each of which had > to be paged in separately. Indeed, which is broadly an extension of the "cache locality" argument. I've never actually yearned for any of the advanced operations a tree can give (over a hash table). Usually, by the time I'm looking for that sort of thing, I really want an on-disk database - that solves all the problems of paging (the DBMS will make sure it reads in a minimum of index and data pages), and a good SQL database can handle multiple indexes in a space-efficient way. Plus, what you said about log 1,000,000? By the time you're looking at a million records, you probably (a) need to have them on disk somewhere anyway, and (b) don't want to read them all into RAM before you begin. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-11 03:24 +1100 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8007.1394468681.18130.python-list@python.org> |
| In reply to | #68143 |
On Tue, Mar 11, 2014 at 12:59 AM, Roy Smith <roy@panix.com> wrote: > Looking at the Songza source, I see we have one user-defined hash > function: > > def __hash__(self): > return hash((self.song, > self.user, > self.add_time, > self.delete_time, > self.play_first)) > > where those things are 2 bson.ObjectId's, 2 datetimes, and a boolean. I > wouldn't be surprised if that falls into the 20x node traversal bucket. > In this case, convenience was more important than speed, so it doesn't > matter. The only difference between a tree and a hash here is that the tree might be able to short-cut the comparisons. But if there are a whole bunch of songs with the same "song" and "user", then the tree has to compare (song->song? same; user->user? same; add_time->add_time? left/right) multiple times. So what you're saying is that the hash table has consistency but the tree could do better or could do worse. (Imagine, worst case, all one million records have the same song/user/add_time and you need to do twenty comparisons involving four fields. That's gotta be worse than one hashing of five fields.) ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-03-10 19:08 +0200 |
| Subject | Re: Balanced trees |
| Message-ID | <8761nmrnfk.fsf@elektro.pacujo.net> |
| In reply to | #68157 |
Chris Angelico <rosuav@gmail.com>: > The only difference between a tree and a hash here is that the tree > might be able to short-cut the comparisons. But if there are a whole > bunch of songs with the same "song" and "user", then the tree has to > compare (song->song? same; user->user? same; add_time->add_time? > left/right) multiple times. So what you're saying is that the hash > table has consistency but the tree could do better or could do worse. > (Imagine, worst case, all one million records have the same > song/user/add_time and you need to do twenty comparisons involving > four fields. That's gotta be worse than one hashing of five fields.) You are right that in the end there probably is an equality test involving all fields of the key. However, the intermediate comparisons are often dealt with much more immediately. For example, to compare the words "hello" and "world", you only need to look at the first letter. Anyway, this whole debate is rather unnecessary since every developer is supposed to have both weapons in their arsenal. Marko
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-11 04:17 +1100 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8013.1394471845.18130.python-list@python.org> |
| In reply to | #68163 |
On Tue, Mar 11, 2014 at 4:08 AM, Marko Rauhamaa <marko@pacujo.net> wrote: > Chris Angelico <rosuav@gmail.com>: > >> The only difference between a tree and a hash here is that the tree >> might be able to short-cut the comparisons. But if there are a whole >> bunch of songs with the same "song" and "user", then the tree has to >> compare (song->song? same; user->user? same; add_time->add_time? >> left/right) multiple times. So what you're saying is that the hash >> table has consistency but the tree could do better or could do worse. >> (Imagine, worst case, all one million records have the same >> song/user/add_time and you need to do twenty comparisons involving >> four fields. That's gotta be worse than one hashing of five fields.) > > You are right that in the end there probably is an equality test > involving all fields of the key. However, the intermediate comparisons > are often dealt with much more immediately. For example, to compare the > words "hello" and "world", you only need to look at the first letter. Right, which is what I meant about the consistency of a hash table. You know, with a hash table, that you're going to need to calculate the hash across everything. The tree lets you cheaply add disambiguation fields, knowing they'll be used only if they're needed. Oddly enough, the consistency in design is inverted by the predictability under attack. If an attacker might be able to generate hash collisions, the tree (assuming it's self-balancing) will have its worst-case much closer to its average/typical cases, which means the tree is more consistent. Strange how that goes, sometimes. > Anyway, this whole debate is rather unnecessary since every developer is > supposed to have both weapons in their arsenal. Supposed to have? What does that mean, a language isn't ISO-compliant unless it provides both? With a high level language like Python, using the provided hash table will almost always cream any hand-built tree, no matter how advantageous the data is to the tree. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-03-10 19:34 +0200 |
| Subject | Re: Balanced trees |
| Message-ID | <87zjkyq7nr.fsf@elektro.pacujo.net> |
| In reply to | #68164 |
Chris Angelico <rosuav@gmail.com>: > Supposed to have? What does that mean, a language isn't ISO-compliant > unless it provides both? It's an ancient, fundamental data structure, right up there with dynamic lists. There's no reason it shouldn't be available in every programming environment. > With a high level language like Python, using the provided hash table > will almost always cream any hand-built tree, no matter how > advantageous the data is to the tree. The main thing is there are use cases where order is essential. That's why I have had to implement the AVL tree in Python myself. No biggie, but a C implementation would probably be much faster. Also, a standard version would likely be reviewed and tested better and have all Pythonic accessors in place. Marko
[toc] | [prev] | [next] | [standalone]
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2014-03-13 12:40 -0600 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8133.1394736421.18130.python-list@python.org> |
| In reply to | #68166 |
On Mon, Mar 10, 2014 at 11:34 AM, Marko Rauhamaa <marko@pacujo.net> wrote: > The main thing is there are use cases where order is essential. That's > why I have had to implement the AVL tree in Python myself. No biggie, > but a C implementation would probably be much faster. Also, a standard > version would likely be reviewed and tested better and have all Pythonic > accessors in place. That last is actually quite easy to achieve, especially using the built-in ABCs. Just subclass from collections.MutableMapping and implement __len__, __iter__, __getitem__, __setitem__ and __delitem__; and default implementations of the rest will be mixed in for you. Or you can override those with optimized implementations.
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-03-13 23:57 +0000 |
| Subject | Re: Balanced trees |
| Message-ID | <532245fa$0$29994$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #68166 |
On Mon, 10 Mar 2014 19:34:48 +0200, Marko Rauhamaa wrote: >> With a high level language like Python, using the provided hash table >> will almost always cream any hand-built tree, no matter how >> advantageous the data is to the tree. > > The main thing is there are use cases where order is essential. That's > why I have had to implement the AVL tree in Python myself. from collections import OrderedDict gives you a fast, ordered mapping. In this case, the order is that of insertion order. If you want sort order instead, for "small" amounts of data, say below a million or ten million items, you're likely to have acceptable if not superior results by just sorting them items when and as needed. Otherwise, I expect that following the basic technique of OrderedDict, and building your data structure from a dict and an associated list, keeping the two in sync, will be faster than a pure Python implementation of an AVL tree. But of course only testing it will make that clear. http://code.activestate.com/recipes/576693-ordered-dictionary-for-py24/ Modifying the above recipe to keep items in something other than insertion order is left as an exercise. -- Steven D'Aprano http://import-that.dreamwidth.org/
[toc] | [prev] | [next] | [standalone]
| From | Dan Stromberg <drsalists@gmail.com> |
|---|---|
| Date | 2014-03-13 20:12 -0700 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8142.1394766770.18130.python-list@python.org> |
| In reply to | #68344 |
On Thu, Mar 13, 2014 at 4:57 PM, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote: > On Mon, 10 Mar 2014 19:34:48 +0200, Marko Rauhamaa wrote: > >>> With a high level language like Python, using the provided hash table >>> will almost always cream any hand-built tree, no matter how >>> advantageous the data is to the tree. >> >> The main thing is there are use cases where order is essential. That's >> why I have had to implement the AVL tree in Python myself. > > from collections import OrderedDict > > gives you a fast, ordered mapping. In this case, the order is that of > insertion order. If you want sort order instead, for "small" amounts of > data, say below a million or ten million items, you're likely to have > acceptable if not superior results by just sorting them items when and as > needed. This is one of my pet things. Sorting the same (slightly tweaked) data inside of a tight loop is rarely a good idea - despite the fact that the "sort" itself tends to be O(n) with Python's rather awesome builtin sorting algorithm. This is because sorting inside a loop tends to yield O(n^2) algorithms, or even O((n^2)*logn). But if you're sorting once at the end of whatever other processing, sorting is great. It's (still) O(nlogn), but it's got a terrific constant.
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-03-14 05:02 +0000 |
| Subject | Re: Balanced trees |
| Message-ID | <53228d5d$0$29994$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #68354 |
On Thu, 13 Mar 2014 20:12:41 -0700, Dan Stromberg wrote: > Sorting the same (slightly tweaked) data inside of a tight loop is > rarely a good idea - despite the fact that the "sort" itself tends to be > O(n) with Python's rather awesome builtin sorting algorithm. This is > because sorting inside a loop tends to yield O(n^2) algorithms, or even > O((n^2)*logn). I agree with you except for the word "rarely". It depends on how much data you have, and in my experience most people think that N = 100 is a lot :-) Remember that Big Oh doesn't say anything about the constant of proportionality. If you have one function that behaves like O(N) but with a constant factor of 1000, and another function that behaves like O(N**2) with a constant factor of 0.001, the second, theoretically slower, function will actually be faster for N < one million. So *in practice*, a solution that involves a theoretically worse Big Oh function but a significantly lower constant factor may turn out to be better for all the values you care about. Given the difference in speed between Python's built-ins, and the code you can write yourself in pure Python, it is normally better to push as much work as you can onto the built-ins. And of course, we should not make assumptions as to what is faster and what is slower. After 15+ years, I'm still surprised about what ends up being faster in Python. > But if you're sorting once at the end of whatever other processing, > sorting is great. It's (still) O(nlogn), but it's got a terrific > constant. Exactly! -- Steven
[toc] | [prev] | [next] | [standalone]
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2014-03-10 19:24 -0400 |
| Subject | Re: Balanced trees |
| Message-ID | <roy-F30152.19240710032014@news.panix.com> |
| In reply to | #68163 |
In article <8761nmrnfk.fsf@elektro.pacujo.net>, Marko Rauhamaa <marko@pacujo.net> wrote: > Anyway, this whole debate is rather unnecessary since every developer is > supposed to have both weapons in their arsenal. The problem with having a choice is that it opens up the possibility of making the wrong one :-) As this discussion has shown, figuring out whether a hash table or a tree is better for a given problem is non-trivial. My guess is that if you gave 1000 typical developers both data structures and let them pick freely, the number of cases where it really mattered and the developer picked the right one would be approximately equal to the number of cases where they picked the wrong one.
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-11 10:27 +1100 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8025.1394494024.18130.python-list@python.org> |
| In reply to | #68179 |
On Tue, Mar 11, 2014 at 10:24 AM, Roy Smith <roy@panix.com> wrote: > As this discussion has shown, figuring out whether a hash table or a > tree is better for a given problem is non-trivial. My guess is that if > you gave 1000 typical developers both data structures and let them pick > freely, the number of cases where it really mattered and the developer > picked the right one would be approximately equal to the number of cases > where they picked the wrong one. And both would be utterly dwarfed by the cases where it doesn't matter, and everyone's paying a cost (having to choose) for no benefit. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-03-11 01:26 +0000 |
| Subject | Re: Balanced trees |
| Message-ID | <531e6643$0$29994$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #68179 |
On Mon, 10 Mar 2014 19:24:07 -0400, Roy Smith wrote: > In article <8761nmrnfk.fsf@elektro.pacujo.net>, > Marko Rauhamaa <marko@pacujo.net> wrote: > >> Anyway, this whole debate is rather unnecessary since every developer >> is supposed to have both weapons in their arsenal. > > The problem with having a choice is that it opens up the possibility of > making the wrong one :-) > > As this discussion has shown, figuring out whether a hash table or a > tree is better for a given problem is non-trivial. My guess is that if > you gave 1000 typical developers both data structures and let them pick > freely, the number of cases where it really mattered and the developer > picked the right one would be approximately equal to the number of cases > where they picked the wrong one. You're very optimistic. In my experience, the average developer has an amazing talent for pessimising code when they think they are optimising it. -- Steven D'Aprano http://import-that.dreamwidth.org/
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-11 12:45 +1100 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8032.1394502844.18130.python-list@python.org> |
| In reply to | #68187 |
On Tue, Mar 11, 2014 at 12:26 PM, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote: > In my experience, the average developer has an amazing talent for > pessimising code when they think they are optimising it. I remember a number of incidents from personal experience when I was a *very* average developer. One time, writing C++ code, I looked at the disassembly and decided the compiler was doing a terrible job. No no, I could make this so much better by using the 80x86 "REP MOVSW" command (or commands, depending on your point of view). That would be so much better than all those separate operations the silly compiler was doing! Roughly an hour of fiddling later, making sure it all still worked correctly, I discover that... hmm, it's not actually any faster. Turns out the 80x86 string opcodes are really inefficient; they're short (a one-byte command that says "read a byte/word/doubleword from DS:SI, write it to ES:DI, and increment or decrement SI and DI"), but not fast. In my defense, I at least did measure before-and-after, and learned that I should back out that change :) ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2014-03-10 21:38 -0600 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8033.1394509126.18130.python-list@python.org> |
| In reply to | #68187 |
On Mon, Mar 10, 2014 at 7:45 PM, Chris Angelico <rosuav@gmail.com> wrote: > On Tue, Mar 11, 2014 at 12:26 PM, Steven D'Aprano > <steve+comp.lang.python@pearwood.info> wrote: >> In my experience, the average developer has an amazing talent for >> pessimising code when they think they are optimising it. > > I remember a number of incidents from personal experience when I was a > *very* average developer. One time, writing C++ code, I looked at the > disassembly and decided the compiler was doing a terrible job. No no, > I could make this so much better by using the 80x86 "REP MOVSW" > command (or commands, depending on your point of view). That would be > so much better than all those separate operations the silly compiler > was doing! Roughly an hour of fiddling later, making sure it all still > worked correctly, I discover that... hmm, it's not actually any > faster. Turns out the 80x86 string opcodes are really inefficient; > they're short (a one-byte command that says "read a > byte/word/doubleword from DS:SI, write it to ES:DI, and increment or > decrement SI and DI"), but not fast. In my defense, I at least did > measure before-and-after, and learned that I should back out that > change :) Better to have tried and failed though than to have simply accepted what the compiler was doing with no verification at all.
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-11 15:28 +1100 |
| Subject | Re: Balanced trees |
| Message-ID | <mailman.8034.1394512113.18130.python-list@python.org> |
| In reply to | #68187 |
On Tue, Mar 11, 2014 at 2:38 PM, Ian Kelly <ian.g.kelly@gmail.com> wrote: > On Mon, Mar 10, 2014 at 7:45 PM, Chris Angelico <rosuav@gmail.com> wrote: >> No no, >> I could make this so much better by using the 80x86 "REP MOVSW" >> command (or commands, depending on your point of view). That would be >> so much better than all those separate operations the silly compiler >> was doing! Roughly an hour of fiddling later, making sure it all still >> worked correctly, I discover that... hmm, it's not actually any >> faster. > > Better to have tried and failed though than to have simply accepted > what the compiler was doing with no verification at all. Maybe. But I've learned now that one guy who used to do assembly language programming on an 8086 is unlikely to discover something that the many authors of a C compiler haven't noticed. Yes, it's possible there'll be something specific to my code, like if I'm doing a strcpy-like operation that isn't *actually* strcpy (the function will be optimized heavily, but a C-level loop might not be recognized), but it's more likely the compiler knows better than I do. That, by the way, was before I realized that *interpreter* writers are more expert than I am, too, and therefore that I can trust a heavily-optimized high level language to run my code faster than I could write equivalent C. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | "Rhodri James" <rhodri@wildebst.org.uk> |
|---|---|
| Date | 2014-03-12 00:57 +0000 |
| Subject | Re: Balanced trees |
| Message-ID | <op.xck3mxcw5079vu@gnudebeest> |
| In reply to | #68195 |
On Tue, 11 Mar 2014 04:28:25 -0000, Chris Angelico <rosuav@gmail.com> wrote: > On Tue, Mar 11, 2014 at 2:38 PM, Ian Kelly <ian.g.kelly@gmail.com> wrote: >> On Mon, Mar 10, 2014 at 7:45 PM, Chris Angelico <rosuav@gmail.com> >> wrote: >>> No no, >>> I could make this so much better by using the 80x86 "REP MOVSW" >>> command (or commands, depending on your point of view). That would be >>> so much better than all those separate operations the silly compiler >>> was doing! Roughly an hour of fiddling later, making sure it all still >>> worked correctly, I discover that... hmm, it's not actually any >>> faster. >> >> Better to have tried and failed though than to have simply accepted >> what the compiler was doing with no verification at all. > > Maybe. But I've learned now that one guy who used to do assembly > language programming on an 8086 is unlikely to discover something that > the many authors of a C compiler haven't noticed. Yes, it's possible > there'll be something specific to my code, like if I'm doing a > strcpy-like operation that isn't *actually* strcpy (the function will > be optimized heavily, but a C-level loop might not be recognized), but > it's more likely the compiler knows better than I do. That might be true for x86, but it isn't true for ARM for example. Apparently it's algorithmically hard to generate ARM code that makes efficient use of conditional instructions, and I can almost always write tighter, faster code than the compiler generates in consequence. It's not always worth the extra coding time (longer now that I'm not in practise), but that's a separate matter. -- Rhodri James *-* Wildebeest Herder to the Masses
[toc] | [prev] | [next] | [standalone]
Page 4 of 9 — ← Prev page 1 2 3 [4] 5 6 7 8 9 Next page →
Back to top | Article view | comp.lang.python
csiph-web