Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #84027 > unrolled thread

Trees

Started byZachary Gilmartin <zacharygilmartin@gmail.com>
First post2015-01-19 17:06 -0500
Last post2015-01-20 10:02 -0800
Articles 7 — 7 participants

Back to article view | Back to comp.lang.python


Contents

  Trees Zachary Gilmartin <zacharygilmartin@gmail.com> - 2015-01-19 17:06 -0500
    Re: Trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-01-20 10:08 +1100
      Re: Trees Michael Torrie <torriem@gmail.com> - 2015-01-19 16:19 -0700
      Re: Trees Devin Jeanpierre <jeanpierreda@gmail.com> - 2015-01-19 15:52 -0800
      Re: Trees Tim Chase <python.list@tim.thechases.com> - 2015-01-19 18:00 -0600
      Re: Trees Nicholas Cole <nicholas.cole@gmail.com> - 2015-01-20 09:23 +0000
      Re: Trees Paul Rubin <no.email@nospam.invalid> - 2015-01-20 10:02 -0800

#84027 — Trees

FromZachary Gilmartin <zacharygilmartin@gmail.com>
Date2015-01-19 17:06 -0500
SubjectTrees
Message-ID<mailman.17862.1421705173.18130.python-list@python.org>

[Multipart message — attachments visible in raw view] — view raw

Why aren't there trees in the python standard library?

[toc] | [next] | [standalone]


#84031

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2015-01-20 10:08 +1100
Message-ID<54bd8e6a$0$13009$c3e8da3$5496439d@news.astraweb.com>
In reply to#84027
Zachary Gilmartin wrote:

> Why aren't there trees in the python standard library?

Possibly because they aren't needed? Under what circumstances would you use
a tree instead of a list or a dict or combination of both?

That's not a rhetorical question. I am genuinely curious, what task do you
have that you think must be solved by a tree?

Also, what sort of tree? Binary tree? Binary search tree? Red/black tree?
AVL tree? Splay tree? B-tree? T-tree? Scapegoat tree? General n-ary tree?
Every possible type of tree yet invented?



-- 
Steven

[toc] | [prev] | [next] | [standalone]


#84032

FromMichael Torrie <torriem@gmail.com>
Date2015-01-19 16:19 -0700
Message-ID<mailman.17866.1421709608.18130.python-list@python.org>
In reply to#84031
On 01/19/2015 04:08 PM, Steven D'Aprano wrote:
> Zachary Gilmartin wrote:
> 
>> Why aren't there trees in the python standard library?
> 
> Possibly because they aren't needed? Under what circumstances would you use
> a tree instead of a list or a dict or combination of both?
> 
> That's not a rhetorical question. I am genuinely curious, what task do you
> have that you think must be solved by a tree?
> 
> Also, what sort of tree? Binary tree? Binary search tree? Red/black tree?
> AVL tree? Splay tree? B-tree? T-tree? Scapegoat tree? General n-ary tree?
> Every possible type of tree yet invented?

Don't forget left-child,right-sibling trees.

As I go through your list of trees, I can't find any tree type that
cannot be easily and efficiently constructed with lists, possibly with
dicts.

[toc] | [prev] | [next] | [standalone]


#84033

FromDevin Jeanpierre <jeanpierreda@gmail.com>
Date2015-01-19 15:52 -0800
Message-ID<mailman.17867.1421711603.18130.python-list@python.org>
In reply to#84031
On Mon, Jan 19, 2015 at 3:08 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> Zachary Gilmartin wrote:
>
>> Why aren't there trees in the python standard library?
>
> Possibly because they aren't needed? Under what circumstances would you use
> a tree instead of a list or a dict or combination of both?
>
> That's not a rhetorical question. I am genuinely curious, what task do you
> have that you think must be solved by a tree?

In general, any time you want to maintain a sorted list or mapping,
balanced search tree structures come in handy.

Here's an example task: suppose you want to represent a calendar,
where timeslots can be reserved for something. Calendar events are not
allowed to intersect.

The most important query is: What events are there that intersect with
the timespan between datetimes d1 and d2? (To draw a daily agenda,
figure out if you should display an alert to the user that an event is
ongoing or imminent, etc.)

You also want to be able to add a new event to the calendar, that
takes place between d1 and d2, and to remove a event.

I leave it to the reader to implement this using a sorted map. (hint:
sort by start.)

This maybe seems contrived, but I've used this exact datatype, or a
remarkably similar one, in a few different circumstances: sequenced
actions of characters in a strategy game, animation, motion
planning...

There are a few possible implementations using Python data structures.
You can do it using a linear scan, which gets a little slow pretty
quickly. You can make insertion slow (usually OK) by sorting on
insertion, but if you ever forget to resort your list you will get a
subtle bug you might not notice for a while. And so on. It's better in
every way to use the third-party blist module, so why bother?

-- Devin

[toc] | [prev] | [next] | [standalone]


#84034

FromTim Chase <python.list@tim.thechases.com>
Date2015-01-19 18:00 -0600
Message-ID<mailman.17868.1421712397.18130.python-list@python.org>
In reply to#84031
On 2015-01-19 16:19, Michael Torrie wrote:
> On 01/19/2015 04:08 PM, Steven D'Aprano wrote:
> > Zachary Gilmartin wrote:
> >> Why aren't there trees in the python standard library?
> > 
> > Possibly because they aren't needed? Under what circumstances
> > would you use a tree instead of a list or a dict or combination
> > of both?

While not an abundance of times, I've had a plurality of occasions
when I've wanted characteristics of a red/black tree where I wanted
O(log n) insertion/deletion/initial-access and O(1) access to
neighbors.

> > Also, what sort of tree? Binary tree? Binary search tree?
> > Red/black tree? AVL tree? Splay tree? B-tree? T-tree? Scapegoat
> > tree? General n-ary tree? Every possible type of tree yet
> > invented?
> 
> Don't forget left-child,right-sibling trees.
> 
> As I go through your list of trees, I can't find any tree type that
> cannot be easily and efficiently constructed with lists, possibly
> with dicts.

While the data-structures can easily be composed out of lists/dicts,
the algorithms that go along with them (such as red/black trees with
their insertion/deletion gymnastics) would be nice to have in the
stdlib so they wouldn't have to be reimplemented (or sought out and
downloaded, and added to the configuration/distribution) every time
someone wanted that particular tree's characteristics.   Much of that
could be addressed with a library of algorithms much like heapq
operates on a list.

-tkc

[toc] | [prev] | [next] | [standalone]


#84062

FromNicholas Cole <nicholas.cole@gmail.com>
Date2015-01-20 09:23 +0000
Message-ID<mailman.17887.1421745832.18130.python-list@python.org>
In reply to#84031
On Mon, Jan 19, 2015 at 11:52 PM, Devin Jeanpierre
<jeanpierreda@gmail.com> wrote:
> On Mon, Jan 19, 2015 at 3:08 PM, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:
>> Zachary Gilmartin wrote:
>>
>>> Why aren't there trees in the python standard library?
>>
>> Possibly because they aren't needed? Under what circumstances would you use
>> a tree instead of a list or a dict or combination of both?
>>
>> That's not a rhetorical question. I am genuinely curious, what task do you
>> have that you think must be solved by a tree?
>
> In general, any time you want to maintain a sorted list or mapping,
> balanced search tree structures come in handy.
>
> Here's an example task: suppose you want to represent a calendar,
> where timeslots can be reserved for something. Calendar events are not
> allowed to intersect.

Maybe because I'm not a computer scientist, I can't immediately see
why this is a "Tree" problem and not a "Database" problem.  Genuinely
interested.

[toc] | [prev] | [next] | [standalone]


#84077

FromPaul Rubin <no.email@nospam.invalid>
Date2015-01-20 10:02 -0800
Message-ID<87bnltbaae.fsf@jester.gateway.sonic.net>
In reply to#84031
Steven D'Aprano <steve+comp.lang.python@pearwood.info> writes:
> Possibly because they aren't needed? Under what circumstances would
> you use a tree instead of a list or a dict or combination of both?

I've sometimes wanted a functional tree in the sense of functional
programming.  That means the tree structure is immutable and you insert
or delete nodes in O(log n) operations, by creating new trees that share
almost all their data with the old tree.  Once you release the old root,
garbage collection frees any nodes that were in the old tree but not the
new one.

Use case inspired by Haskell's Happstack-store (formerly called MACID):
you want a Prevayler-style in-memory key-value database.  First idea:
all read queries are served by pure in-memory lookups in a Python dict.
Write queries update the dict and also append the command to a serial
log on disk (one disk write, no seeks).  If the system crashes, restart
it empty, and play back the log from the beginning to restore the
in-memory data.

Problem with first idea: the database might be a few thousand entries
but take millions of updates, if entries are modified frequently.  You
don't want to reload the million updates from the beginning.  Next idea:
dump a snapshot of the dictionary now and then, and then just reload
updates starting from the last snapshot.  Trouble here is that the
dictionary is big enough that writing out the snapshot takes time, and
you don't want to lock the whole dict against more updates while the
snapshot is writing.  If you do this the Java way, welcome to
concurrency hell as you make finer and finer grained locks and deal with
resulting deadlocks, race conditions, etc.  And the dictionary still has
to be synchronized to avoid reads simultaneous with updates.

MACID solution: use a functional red-black tree instead of a dict.  To
add or update an entry, make a new tree that replaces the old tree by
updating a single pointer.  That allows unlimited read concurrency
(reading means getting the current tree pointer and navigating the tree
that you get: since that tree is immutable you can keep using it while
other threads make new trees).  Updates are managed by a single thread
that can take its time making the new tree and writing the log, and then
atomically updating the in-memory root pointer that's shared with other
threads.  Finally, to take a snapshot, you can just get the current root
and write out its contents: since it's immutable you can do that
concurrently with anything else (including updates) that might be
happening.  You just have a small interaction with the update thread to
remember where in the log to start reading from in case of a crash and
reload.

Finding out about this was one of the "wow" moments that made me realize
I had to learn Haskell.

[toc] | [prev] | [standalone]


Back to top | Article view | comp.lang.python


csiph-web