Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #84060 > unrolled thread

Re: Trees

Started byTerry Reedy <tjreedy@udel.edu>
First post2015-01-20 01:08 -0500
Last post2015-01-21 14:05 +0000
Articles 20 on this page of 37 — 11 participants

Back to article view | Back to comp.lang.python

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: Trees Terry Reedy <tjreedy@udel.edu> - 2015-01-20 01:08 -0500
    Re: Trees Marko Rauhamaa <marko@pacujo.net> - 2015-01-20 11:45 +0200
      Re: Trees Paul Rubin <no.email@nospam.invalid> - 2015-01-20 10:14 -0800
        Re: Trees Marko Rauhamaa <marko@pacujo.net> - 2015-01-20 22:26 +0200
      Re: Trees Stephen Hansen <me+python@ixokai.io> - 2015-01-20 23:56 -0800
        Re: Trees Marko Rauhamaa <marko@pacujo.net> - 2015-01-21 10:35 +0200
        Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-21 04:09 -0800
          Re: Trees Chris Angelico <rosuav@gmail.com> - 2015-01-21 23:35 +1100
            Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-21 07:24 -0800
          Re: Trees Tim Chase <python.list@tim.thechases.com> - 2015-01-21 06:55 -0600
          Re: Trees Chris Angelico <rosuav@gmail.com> - 2015-01-22 00:01 +1100
          Re: Trees Tim Chase <python.list@tim.thechases.com> - 2015-01-21 08:26 -0600
          Re: Trees Chris Angelico <rosuav@gmail.com> - 2015-01-22 01:31 +1100
          Re: Trees Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-01-22 01:47 +1100
            Re: Trees Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-21 09:15 -0700
            Re: Trees Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-21 10:27 -0700
    Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 05:33 -0800
      Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 05:51 -0800
      Re: Trees Marko Rauhamaa <marko@pacujo.net> - 2015-01-20 16:15 +0200
        Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 06:35 -0800
      Re: Trees Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-20 10:19 -0700
        Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 10:15 -0800
          Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 10:35 -0800
      Re: Trees Mario <marfig@gmail.com> - 2015-01-20 22:47 +0100
        Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 17:23 -0800
          Re: Trees Paul Rubin <no.email@nospam.invalid> - 2015-01-20 17:49 -0800
            Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-20 18:03 -0800
              Re: Trees Paul Rubin <no.email@nospam.invalid> - 2015-01-21 14:27 -0800
                Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-21 21:17 -0800
          Re: Trees Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-21 15:54 -0700
            Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-21 21:20 -0800
              Re: Trees Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-22 00:01 -0700
              Re: Trees Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-21 23:56 -0700
            Re: Trees Paul Rubin <no.email@nospam.invalid> - 2015-01-21 23:16 -0800
              Re: Trees Rustom Mody <rustompmody@gmail.com> - 2015-01-22 08:54 -0800
        Re: Trees Terry Reedy <tjreedy@udel.edu> - 2015-01-20 21:19 -0500
          Re: Trees Mario Figueiredo <marfig@gmail.com> - 2015-01-21 14:05 +0000

Page 1 of 2  [1] 2  Next page →


#84060 — Re: Trees

FromTerry Reedy <tjreedy@udel.edu>
Date2015-01-20 01:08 -0500
SubjectRe: Trees
Message-ID<mailman.17884.1421734095.18130.python-list@python.org>
On 1/19/2015 5:06 PM, Zachary Gilmartin wrote:
> Why aren't there trees in the python standard library?

Sequences nested withing sequences can be regarded as trees, and Python 
has these.  I regard Lisp as a tree processing languages, as it must be 
to manipulate, for example, code with nested structures.

Nested structures in general represent trees or more general graph 
forms.  The ast module uses Nodes with lists of subnodes to represent 
Abstract Syntax Trees.

Priority queues are specialized trees implemented on top of lists.  The 
heapq module is imported into 6 other modules.

Others have answered as to why other special-purpose 
constrained-structure trees have not been added to the stdlib.

It might possibly be useful to add a general Tree base class with pre-, 
in-, and post-order generators, and maybe interior and leaf node 
counters and max depth method.  Finding uses for such a thing in the 
stdlib (ast?) would be a plus in favor.

-- 
Terry Jan Reedy

[toc] | [next] | [standalone]


#84063

FromMarko Rauhamaa <marko@pacujo.net>
Date2015-01-20 11:45 +0200
Message-ID<87ppa9944t.fsf@elektro.pacujo.net>
In reply to#84060
Terry Reedy <tjreedy@udel.edu>:

> Others have answered as to why other special-purpose
> constrained-structure trees have not been added to the stdlib.

Ordered O(log n) mappings are not special-purpose data structures. I'd
say strings and floats are much more special-purpose than ordered
mappings, and yet Python has direct support for those.

As I said, I use ordered mappings to implement timers. GvR uses the
heapq module for the purpose. The downside of heapq is that canceled
timers often flood the heapq structure; in networking you can easily
start a thousand timers a second but virtually no timer ever expires.
Python's asyncio ignores the problem, but GvR mentioned a periodic
"garbage collection" as a potentially effective solution. I tested the
garbage collection trick and it did seem very effective.


Marko

[toc] | [prev] | [next] | [standalone]


#84080

FromPaul Rubin <no.email@nospam.invalid>
Date2015-01-20 10:14 -0800
Message-ID<877fwhb9pk.fsf@jester.gateway.sonic.net>
In reply to#84063
Marko Rauhamaa <marko@pacujo.net> writes:
> As I said, I use ordered mappings to implement timers... The downside
> of heapq is that canceled timers often flood the heapq structure...,
> GvR mentioned a periodic "garbage collection" as a potentially
> effective solution. 

You could look up the "timer wheel" approach used by the Linux kernel
and by Erlang.  It's less general than an ordered map, but probably
faster in practice.

  https://lkml.org/lkml/2005/10/19/46 

Has some info.  I think the kernel uses a different method now though.

Fwiw, I believe that the Haskell i/o manager uses an ordered map, and it
gets monstrous performance:

http://haskell.cs.yale.edu/wp-content/uploads/2013/08/hask035-voellmy.pdf

Some comments:
http://ezyang.tumblr.com/post/62152700458/haskell-mio-a-high-performance-multicore-io

[toc] | [prev] | [next] | [standalone]


#84089

FromMarko Rauhamaa <marko@pacujo.net>
Date2015-01-20 22:26 +0200
Message-ID<8761c1i4fu.fsf@elektro.pacujo.net>
In reply to#84080
Paul Rubin <no.email@nospam.invalid>:

> You could look up the "timer wheel" approach used by the Linux kernel
> and by Erlang.  It's less general than an ordered map, but probably
> faster in practice.
>
>   https://lkml.org/lkml/2005/10/19/46 
>
> Has some info.  I think the kernel uses a different method now though.

I haven't followed it closely, but I believe the realtime timers use a
red-black tree.


Marko

[toc] | [prev] | [next] | [standalone]


#84107

FromStephen Hansen <me+python@ixokai.io>
Date2015-01-20 23:56 -0800
Message-ID<mailman.17906.1421827045.18130.python-list@python.org>
In reply to#84063

[Multipart message — attachments visible in raw view] — view raw

On Tue, Jan 20, 2015 at 1:45 AM, Marko Rauhamaa <marko@pacujo.net> wrote:

> Terry Reedy <tjreedy@udel.edu>:
>
> > Others have answered as to why other special-purpose
> > constrained-structure trees have not been added to the stdlib.
>
> Ordered O(log n) mappings are not special-purpose data structures. I'd
> say strings and floats are much more special-purpose than ordered
> mappings, and yet Python has direct support for those.
>

Your anecdote is strong, sir.

However, I use strings thousands of times, floats maybe a hundred of times,
and order mappings a few times.

My anecdote counters yours.

A tree structure is special purpose because there is a lot of options with
different characteristics that make certain implementations ideal in some
cases and not in others.

A float is a float, there's a standard (IEEE 754?), its not special at all.

A string, I suppose, could be special, but that's a pretty nonsense view of
the world since what most people use strings commonly.

I'm not arguing against including a tree, but I have no advice on which
one, and the one-- one!-- time I've needed a tree I got one off pypi. Not
everything needs to be in the stdlib.

But to call strings and floats special purpose is really a silly argument
to make.

[toc] | [prev] | [next] | [standalone]


#84109

FromMarko Rauhamaa <marko@pacujo.net>
Date2015-01-21 10:35 +0200
Message-ID<87bnls8raa.fsf@elektro.pacujo.net>
In reply to#84107
Stephen Hansen <me+python@ixokai.io>:

> On Tue, Jan 20, 2015 at 1:45 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
>> Terry Reedy <tjreedy@udel.edu>:
>> > Others have answered as to why other special-purpose
>> > constrained-structure trees have not been added to the stdlib.
>>
>> Ordered O(log n) mappings are not special-purpose data structures. I'd
>> say strings and floats are much more special-purpose than ordered
>> mappings, and yet Python has direct support for those.
>
> [...]
>
> A string, I suppose, could be special, but that's a pretty nonsense view of
> the world since what most people use strings commonly.

I was reacting to the specific argument that has been made often in
Python circles: ordered mappings are special-purpose. That argument has
nothing to do with how often you need the data structure. It has to do
with how wide an abtract niche the data structure takes in the idea
space.

Now, you could make an argument that there's barely ever a practical
need for ordered mappings (whether right or wrong), and that would
elicit a different kind of response from me.

These and other arguments factor into whether the Python standard
library should contain a feature, ultimately it's GvR's arbitrary call.
He said you should first let it ripen in pypi and gain widespread
traction.


Marko

[toc] | [prev] | [next] | [standalone]


#84112

FromRustom Mody <rustompmody@gmail.com>
Date2015-01-21 04:09 -0800
Message-ID<b37db482-1c2d-4b17-9b0b-0e37537482ef@googlegroups.com>
In reply to#84107
On Wednesday, January 21, 2015 at 1:27:39 PM UTC+5:30, Stephen Hansen wrote:
> On Tue, Jan 20, 2015 at 1:45 AM, Marko Rauhamaa <ma...@pacujo.net> wrote:
> Terry Reedy <tjr...@udel.edu>:
> 
> 
> 
> > Others have answered as to why other special-purpose
> 
> > constrained-structure trees have not been added to the stdlib.
> 
> 
> 
> Ordered O(log n) mappings are not special-purpose data structures. I'd
> 
> say strings and floats are much more special-purpose than ordered
> 
> mappings, and yet Python has direct support for those.
> 
> 
> 
> Your anecdote is strong, sir.
> 
> 
> However, I use strings thousands of times, floats maybe a hundred of times, and order mappings a few times.
> 
> 
> My anecdote counters yours.
> 
> 
> A tree structure is special purpose because there is a lot of options with different characteristics that make certain implementations ideal in some cases and not in others.
> 
> 
> A float is a float, there's a standard (IEEE 754?), its not special at all.
> 
> 
> A string, I suppose, could be special, but that's a pretty nonsense view of the world since what most people use strings commonly. 
> 
> 
> I'm not arguing against including a tree, but I have no advice on which one, and the one-- one!-- time I've needed a tree I got one off pypi. Not everything needs to be in the stdlib.
> 
> 
> But to call strings and floats special purpose is really a silly argument to make. 

Since we are trading anecdotes…

Among my teachers of CS, there were two – both brilliant — one taught me
Numerical Analysis, the other taught me programming.

[As it happens my initial practical programming was the numerical kind
because the Fortran compiler was more serviceable than the Pascal one.
But thats a different point]

The point is that the two of them disagreed strongly on what programming was
about.

The numerical analyst could see no earthly reason to use anything other than
Fortran.

The programmer had horror stories to tell about how Fortran damages the brain:
eg programs that inherently need a while-loop (eg binary search) seem to be
distinctly out of the reach of the Fortran programmer.  Recursion...
And much else.

The numerical analyst of course had no use for this philosophizing.

The view that strings and floats are more fundamental than containers and maps
reminds me of this view.

For me python is neat because I can write: [1,2,3]
when I want a list.

But it does not go nearly far enough.

I would like a set to be {1,2,3} or at worst ⦃1,2,3⦄
and a bag to be ⟅1,2,3⟆

Apart from the unicode niceness that Ive described here
http://blog.languager.org/2014/04/unicoded-python.html

Its a bit of a nuisance that we have to write set([1,2,3]) for the first

More irksome that for the second we've to preface with

from collections import Counter

And still more a PITA that a straightforward standard name like bag (or multiset)
is called by such an ungoogleable misleading name as counter.

[toc] | [prev] | [next] | [standalone]


#84113

FromChris Angelico <rosuav@gmail.com>
Date2015-01-21 23:35 +1100
Message-ID<mailman.17910.1421843755.18130.python-list@python.org>
In reply to#84112
On Wed, Jan 21, 2015 at 11:09 PM, Rustom Mody <rustompmody@gmail.com> wrote:
> I would like a set to be {1,2,3} or at worst ⦃1,2,3⦄
> and a bag to be ⟅1,2,3⟆
>
> Apart from the unicode niceness that Ive described here
> http://blog.languager.org/2014/04/unicoded-python.html
>
> Its a bit of a nuisance that we have to write set([1,2,3]) for the first

Wait, what?

rosuav@sikorsky:~$ python
Python 2.7.3 (default, Mar 13 2014, 11:03:55)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> type({1,2,3})
<type 'set'>
>>>
rosuav@sikorsky:~$ python3
Python 3.5.0a0 (default:4709290253e3, Jan 20 2015, 21:48:07)
[GCC 4.7.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> type({1,2,3})
<class 'set'>
>>>

Looks like {1,2,3} works for me.

ChrisA

[toc] | [prev] | [next] | [standalone]


#84127

FromRustom Mody <rustompmody@gmail.com>
Date2015-01-21 07:24 -0800
Message-ID<1cf6ca5d-67f9-451a-b1da-62b3b8a3f0cf@googlegroups.com>
In reply to#84113
On Wednesday, January 21, 2015 at 6:06:06 PM UTC+5:30, Chris Angelico wrote:
> On Wed, Jan 21, 2015 at 11:09 PM, Rustom Mody  wrote:
> > I would like a set to be {1,2,3} or at worst ⦃1,2,3⦄
> > and a bag to be ⟅1,2,3⟆
> >
> > Apart from the unicode niceness that Ive described here
> > http://blog.languager.org/2014/04/unicoded-python.html
> >
> > Its a bit of a nuisance that we have to write set([1,2,3]) for the first
> 
> Wait, what?
> 
> rosuav@sikorsky:~$ python
> Python 2.7.3 (default, Mar 13 2014, 11:03:55)
> [GCC 4.7.2] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> type({1,2,3})
> <type 'set'>
> >>>
> rosuav@sikorsky:~$ python3
> Python 3.5.0a0 (default:4709290253e3, Jan 20 2015, 21:48:07)
> [GCC 4.7.2] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> type({1,2,3})
> <class 'set'>
> >>>
> 
> Looks like {1,2,3} works for me.
> 
> ChrisA

Ah thank you -- forgot -- mea culpa.

[toc] | [prev] | [next] | [standalone]


#84114

FromTim Chase <python.list@tim.thechases.com>
Date2015-01-21 06:55 -0600
Message-ID<mailman.17911.1421844798.18130.python-list@python.org>
In reply to#84112
On 2015-01-21 23:35, Chris Angelico wrote:
> On Wed, Jan 21, 2015 at 11:09 PM, Rustom Mody wrote
> > Its a bit of a nuisance that we have to write set([1,2,3]) for
> > the first
> 
> Wait, what?
> 
> rosuav@sikorsky:~$ python
> Python 2.7.3 (default, Mar 13 2014, 11:03:55)
> [GCC 4.7.2] on linux2
> Type "help", "copyright", "credits" or "license" for more
> information.
> >>> type({1,2,3})
> <type 'set'>
> >>>
> rosuav@sikorsky:~$ python3
> Python 3.5.0a0 (default:4709290253e3, Jan 20 2015, 21:48:07)
> [GCC 4.7.2] on linux
> Type "help", "copyright", "credits" or "license" for more
> information.
> >>> type({1,2,3})
> <class 'set'>
> >>>
> 
> Looks like {1,2,3} works for me.

That hasn't always worked:

tim@bigbox:~$ python2.5
Python 2.5.5 (r255:77872, Nov 28 2010, 16:43:48) 
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> type({1,2,3})
  File "<stdin>", line 1
    type({1,2,3})
           ^
SyntaxError: invalid syntax

tim@bigbox:~$ python2.6
Python 2.6.8 (unknown, Jan 26 2013, 14:35:25) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> type({1,2,3})
  File "<stdin>", line 1
    type({1,2,3})
           ^
SyntaxError: invalid syntax

And, prior to 2.4, you had to write

  from sets import Set as set

to even get sets.

  https://docs.python.org/2/whatsnew/2.4.html#pep-218-built-in-set-objects

-tkc











[toc] | [prev] | [next] | [standalone]


#84115

FromChris Angelico <rosuav@gmail.com>
Date2015-01-22 00:01 +1100
Message-ID<mailman.17912.1421845273.18130.python-list@python.org>
In reply to#84112
On Wed, Jan 21, 2015 at 11:55 PM, Tim Chase
<python.list@tim.thechases.com> wrote:
> On 2015-01-21 23:35, Chris Angelico wrote:
>> On Wed, Jan 21, 2015 at 11:09 PM, Rustom Mody wrote
>> > Its a bit of a nuisance that we have to write set([1,2,3]) for
>> > the first
>>
>> Looks like {1,2,3} works for me.
>
> That hasn't always worked:

Sure, but "we have to write" implies current status; Python 2.7 was
released in 2010, so I would expect that anyone who's complaining
about set notation should have access to it. Even if it didn't work in
2.7 and you had to use 3.x, the argument's still fairly weak when it's
alongside a pipe-dream desire to use specific mathematical Unicode
characters in source code, because that's clearly a 3.x-only feature
(where source code is Unicode text rather than ASCII).

Nobody's going to moan "It's silly that we have to use 1 and 0 instead
of nice keywords True and False" on the basis that True and False
didn't exist in Python 2.0. At very least, use 2.7 before you
complain; preferably, use 3.4 (or 3.5).

ChrisA

[toc] | [prev] | [next] | [standalone]


#84119

FromTim Chase <python.list@tim.thechases.com>
Date2015-01-21 08:26 -0600
Message-ID<mailman.17914.1421850254.18130.python-list@python.org>
In reply to#84112
On 2015-01-22 00:01, Chris Angelico wrote:
> On Wed, Jan 21, 2015 at 11:55 PM, Tim Chase
>>> Looks like {1,2,3} works for me.
>>
>> That hasn't always worked:
> 
> the argument's still fairly weak when it's alongside a pipe-dream
> desire to use specific mathematical Unicode characters in source
> code, because that's clearly a 3.x-only feature (where source code
> is Unicode text rather than ASCII).

I'm 100% in agreement that Unicode characters are a pipe-dream.  If I
wanted that, I'd use APL ;-)

> Nobody's going to moan "It's silly that we have to use 1 and 0
> instead of nice keywords True and False" on the basis that True and
> False didn't exist in Python 2.0. At very least, use 2.7 before you
> complain; preferably, use 3.4 (or 3.5).

While 2.0 is certainly antiquated, Red Hat Enterprise Linux (RHEL) is
often considered the best definition of what's considered "oldest
supported production environment".  RHEL v4 ships with Py2.3 and one
can still obtain extended support for this environment.  RHEL v5 is
actively supported (i.e., without the need for an extended-support
contract) and ships with Py2.4 so I generally try to at least support
2.4 when I'm writing code that could possibly end deploy on a server
such as RHEL5.   Some of us are stuck supporting code in such
antediluvian environments. :-/  Then again, if you're like me and
working in such environments, you already know to use set() instead
of {...} and to avoid the "with" statement, and the like. :)

Sources:
http://en.wikipedia.org/wiki/Red_Hat_Enterprise_Linux#Version_history

http://turbogears.org/1.0/docs/Install/Nix.html#centos-rhel RHEL=2.3

https://wiki.archlinux.org/index.php/python#Old_versions RHEL5=2.4

-tkc


[toc] | [prev] | [next] | [standalone]


#84120

FromChris Angelico <rosuav@gmail.com>
Date2015-01-22 01:31 +1100
Message-ID<mailman.17915.1421850697.18130.python-list@python.org>
In reply to#84112
On Thu, Jan 22, 2015 at 1:26 AM, Tim Chase
<python.list@tim.thechases.com> wrote:
> While 2.0 is certainly antiquated, Red Hat Enterprise Linux (RHEL) is
> often considered the best definition of what's considered "oldest
> supported production environment".  RHEL v4 ships with Py2.3 and one
> can still obtain extended support for this environment.  RHEL v5 is
> actively supported (i.e., without the need for an extended-support
> contract) and ships with Py2.4 so I generally try to at least support
> 2.4 when I'm writing code that could possibly end deploy on a server
> such as RHEL5.   Some of us are stuck supporting code in such
> antediluvian environments. :-/  Then again, if you're like me and
> working in such environments, you already know to use set() instead
> of {...} and to avoid the "with" statement, and the like. :)

I'm aware that there are reasons for wanting to support older versions
of things. I do it myself in several places (though not specifically
with Python pre-2.7). But there's still a difference between "Moan
moan, we have to use set([1,2,3]) when {1,2,3} would make *so* much
more sense" and "Sadly, I have to ensure that my code works on Python
2.4, so I can't take advantage of all the newer features". One is a
complaint about the language; the other is an acknowledgement of the
personal pain of having to support multiple versions (and is going to
get easier; as years go by, the oldest Python on a supported RHEL will
become newer).

ChrisA

[toc] | [prev] | [next] | [standalone]


#84122

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2015-01-22 01:47 +1100
Message-ID<54bfbc1d$0$13007$c3e8da3$5496439d@news.astraweb.com>
In reply to#84112
Rustom Mody wrote:

> On Wednesday, January 21, 2015 at 1:27:39 PM UTC+5:30, Stephen Hansen
> wrote:
[...]
> Among my teachers of CS, there were two – both brilliant — one taught me
> Numerical Analysis, the other taught me programming.

I wonder just how brilliant the Numerical Analysis guy really was.


> The point is that the two of them disagreed strongly on what programming
> was about.
> 
> The numerical analyst could see no earthly reason to use anything other
> than Fortran.

Would you consider a cook who boiled everything "brilliant"? Somebody who
insisted that there was no earthly reason to have an oven, a frying pan, a
wok, or a steamer?

Or somebody who insisted that the *one and only* use for chemistry was to
set fire to wood and boil water?

Computers are used for vastly more than just numerical analysis.


> The programmer had horror stories to tell about how Fortran damages the
> brain: eg programs that inherently need a while-loop (eg binary search)
> seem to be
> distinctly out of the reach of the Fortran programmer.  Recursion...
> And much else.
> 
> The numerical analyst of course had no use for this philosophizing.

And how did this brilliant numerical analyst perform fast, efficient
searches of sorted data?


> The view that strings and floats are more fundamental than containers and
> maps reminds me of this view.

A strange comment to make.

I can insert a string or a float inside a container or map, but I cannot
insert a container or map inside a string or float.

Consider the computer hardware we use. Certain data structures are
fundamental to the architecture we use: bytes, pointers, ints, arrays are
very low-level. Floats are a little more complex -- they have a more
complex internal structure than ints. Hash tables are more complex still.

 
> For me python is neat because I can write: [1,2,3]
> when I want a list.

> But it does not go nearly far enough.
> 
> I would like a set to be {1,2,3} or at worst ⦃1,2,3⦄
> and a bag to be ⟅1,2,3⟆

In Python, you can write sets {1, 2, 3}.

Out of curiosity, what input method did you use to get those Unicode
characters? How many keystrokes or mouse clicks did it take to get this?

\N{LEFT S-SHAPED BAG DELIMITER} 1, 2, 3 \N{RIGHT S-SHAPED BAG DELIMITER}


"bag([1, 2, 3])" requires only 16 keypresses, and it is visible on virtually
every computer, while ⟅ ⟆ look like small boxes to me and probably most
people.

[...]
> More irksome that for the second we've to preface with
> 
> from collections import Counter
> 
> And still more a PITA that a straightforward standard name like bag (or
> multiset) is called by such an ungoogleable misleading name as counter.

It is not misleading. collections.Counter is a class for counting (i.e. a
counter), not a bag. It is merely *similar* to a bag, but the API is not
the same as the usual bag API because the motivating design is not the same
as for bags.

As for being "ungoogleable", that is just simply false. Perhaps you forgot
to push the enter key after typing in your search terms?

https://www.google.com.au/search?q=python+counter

The top three links are:

https://docs.python.org/2/library/collections.html‎
https://docs.python.org/3.3/library/collections.html‎
http://pymotw.com/2/collections/counter.html

and seven out of the ten links on the first page of results are about
Counter.

Your results may differ from mine, since Google bubbles your searches. But
Duck Duck Go doesn't:

https://duckduckgo.com/?q=python%20counter

Its results are not quite as good as Google's, but it finds the backported
Counter class on PyPI (which Google didn't) and it too links to the Python
Module Of The Week page. Adding "collections" to the search terms brings
the results up to Google quality.

So as you can see, it simply is not true that "Counter" is ungoogleable.



-- 
Steven

[toc] | [prev] | [next] | [standalone]


#84129

FromIan Kelly <ian.g.kelly@gmail.com>
Date2015-01-21 09:15 -0700
Message-ID<mailman.17917.1421857401.18130.python-list@python.org>
In reply to#84122
On Wed, Jan 21, 2015 at 7:47 AM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
>> More irksome that for the second we've to preface with
>>
>> from collections import Counter
>>
>> And still more a PITA that a straightforward standard name like bag (or
>> multiset) is called by such an ungoogleable misleading name as counter.
>
> It is not misleading. collections.Counter is a class for counting (i.e. a
> counter), not a bag. It is merely *similar* to a bag, but the API is not
> the same as the usual bag API because the motivating design is not the same
> as for bags.

To expand on this, Counter does provide a few multiset operations like
union and intersection. But there are a number of cases where it falls
short or does not perform as expected. To name a few:

* There are no subset or superset comparison operations provided.
* len(counter) returns the number of keys, not the number of elements.
* Unlike a multiset, Counters can have negative or zero counts; one
curious consequence of this is that an "empty" Counter can have a
non-zero length and evaluate as true in boolean contexts.

A while back I fiddled around with creating a true MultiSet class
using a Counter as the underlying implementation. Here's what I came
up with:


from collections import Counter
from collections.abc import MutableSet, Set


__all__ = ['MultiSet']


class MultiSet(MutableSet):

  """Multiset class containing hashable items."""

  # Class invariants:
  #   * self._counter is a Counter s/t all values are positive ints denoting
  #     multiplicity.
  #   * set(self) == self._counter.keys()

  def __init__(self, iterable_or_mapping=()):
    """Create a new, empty MultiSet. If passed an iterable argument, initialize
    the MultiSet with items from the iterable. If passed a mapping argument,
    initialize the MultiSet with the keys of the mapping, each repeated a number
    of times equal to the associated value.
    """
    self._counter = +Counter(iterable_or_mapping)

  @classmethod
  def _from_counter(cls, counter):
    self = cls.__new__(cls)
    self._counter = counter
    return self

  def __contains__(self, item):
    return item in self._counter

  def __hash__(self):
    raise TypeError('unhashable type: %s' % type(self))

  def __iter__(self):
    return self._counter.elements()

  def __len__(self):
    # TODO: consider caching the length.
    return sum(self._counter.values())

  def __repr__(self):
    if self:
      return 'MultiSet(%r)' % list(self)
    return 'MultiSet()'

  def add(self, item):
    """Add an element to the MultiSet."""
    self._counter[item] += 1

  def clear(self):
    """Remove all elements from the MultiSet, leaving it empty."""
    self._counter.clear()

  def count(self, item):
    """Return the number of occurrences of the element."""
    return self._counter[item]

  def discard(self, item):
    """Remove an element from the MultiSet.

    If there are multiple occurrences, remove only one such occurrence. If the
    element is not a member, do nothing.
    """
    if item not in self._counter:
      return
    if self._counter[item] == 1:
      del self._counter[item]
    else:
      self._counter[item] -= 1

  def __or__(self, other):
    """Return the union of the MultiSet and another set."""
    if not isinstance(other, Set):
      return NotImplemented
    return self._from_counter(self._counter | _get_counter(other))

  __ror__ = __or__

  def __ior__(self, other):
    """Perform the in-place union of the MultiSet and another set."""
    if not isinstance(other, Set):
      return NotImplemented
    self._counter |= _get_counter(other)
    return self

  def __and__(self, other):
    """Return the intersection of the MultiSet and another set."""
    if not isinstance(other, Set):
      return NotImplemented
    return self._from_counter(self._counter & _get_counter(other))

  __rand__ = __and__

  def __iand__(self, other):
    """Perform the in-place intersection of the MultiSet and another set."""
    if not isinstance(other, Set):
      return NotImplemented
    self._counter &= _get_counter(other)
    return self

  def __sub__(self, other):
    """Return the difference of the MultiSet and another set."""
    if not isinstance(other, Set):
      return NotImplemented
    return self._from_counter(self._counter - _get_counter(other))

  def __rsub__(self, other):
    """Return the difference of another set and the MultiSet."""
    if not isinstance(other, Set):
      return NotImplemented
    return self._from_counter(_get_counter(other) - self._counter)

  def __isub__(self, other):
    """Perform the in-place set difference of the MultiSet and another set."""
    if not isinstance(other, Set):
      return NotImplemented
    self._counter -= _get_counter(other)
    return self

  def __xor__(self, other):
    """Return the symmetric difference of the MultiSet and another set."""
    if not isinstance(other, Set):
      return NotImplemented
    # X ^ Y == (X - Y) | (Y - X)
    other_counter = _get_counter(other)
    counter = (self._counter - other_counter) | (other_counter - self._counter)
    return self._from_counter(counter)

  __rxor__ = __xor__

  def __ixor__(self, other):
    "Perform the in-place symmetric difference of the MultiSet and another set."
    if not isinstance(other, Set):
      return NotImplemented
    # X ^ Y == (X - Y) | (Y - X)
    other_counter = _get_counter(other)
    other_diff = other_counter - self._counter
    self._counter -= other_counter
    self._counter |= other_diff
    return self

  def __eq__(self, other):
    """Return whether the MultiSet and the other set have the same contents."""
    if not isinstance(other, Set):
      return NotImplemented
    return self._counter == _get_counter(other)

  def __ne__(self, other):
    """Return whether the MultiSet and the other set have different contents."""
    if not isinstance(other, Set):
      return NotImplemented
    return self._counter != _get_counter(other)

  def __le__(self, other):
    """Return whether the MultiSet is a subset of the other set."""
    if not isinstance(other, Set):
      return NotImplemented
    other_counter = _get_counter(other)
    return all(self._counter[x] <= other_counter[x] for x in self._counter)

  def __lt__(self, other):
    """Return whether the MultiSet is a proper subset of the other set."""
    if not isinstance(other, Set):
      return NotImplemented
    return self.__le__(other) and not self.__eq__(other)

  def __ge__(self, other):
    """Return whether the MultiSet is a superset of the other set."""
    if not isinstance(other, Set):
      return NotImplemented
    other_counter = _get_counter(other)
    return all(self._counter[x] >= other_counter[x] for x in other_counter)

  def __gt__(self, other):
    """Return whether the MultiSet is a proper superset of the other set."""
    if not isinstance(other, Set):
      return NotImplemented
    return self.__ge__(other) and not self.__eq__(other)


def _get_counter(maybe_multiset):
  if isinstance(maybe_multiset, MultiSet):
    return maybe_multiset._counter
  else:
    return Counter(maybe_multiset)

[toc] | [prev] | [next] | [standalone]


#84131

FromIan Kelly <ian.g.kelly@gmail.com>
Date2015-01-21 10:27 -0700
Message-ID<mailman.17918.1421861290.18130.python-list@python.org>
In reply to#84122
On Wed, Jan 21, 2015 at 9:15 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> class MultiSet(MutableSet):

In retrospect this probably shouldn't derive from MutableSet, since
that carries the expectation that all elements are unique (much like
how bool shouldn't be subclassed). For instance, collections.Set
includes some operators that aren't compatible with the second operand
being a multiset.

[toc] | [prev] | [next] | [standalone]


#84064

FromRustom Mody <rustompmody@gmail.com>
Date2015-01-20 05:33 -0800
Message-ID<d34dbfbe-fe82-47dc-8bc3-c8773e2b70dd@googlegroups.com>
In reply to#84060
On Tuesday, January 20, 2015 at 11:38:27 AM UTC+5:30, Terry Reedy wrote:
> On 1/19/2015 5:06 PM, Zachary Gilmartin wrote:
> > Why aren't there trees in the python standard library?
> 
> Sequences nested withing sequences can be regarded as trees, and Python 
> has these.  I regard Lisp as a tree processing languages, as it must be 
> to manipulate, for example, code with nested structures.

Yeah python has trees alright.

Heres' some simple tree-code

from enum import Enum
class TreeTag(Enum):
    I = 0  # An internal node
    L = 1  # A leaf node
    def __repr__(self):  return self.name
    
I = TreeTag.I
L = TreeTag.L


def dfs(t):
    if t[0] == L:
        return [t[1]]
    else:
        return [t[1]] + dfs(t[2]) + dfs(t[3])

# Converting to generators is trivial
=====================

# Example tree
# From http://www-math.ucdenver.edu/~wcherowi/courses/m4408/gtln8.html

t1 = [L, 1]
t4 = [I, 4, [L, 3],[L,5]]
t2 = [I, 2, t1, t4]
t8 = [I, 8, [L, 7], [L, 9]]

# Top level tree
t = [I, 6, t2, t8]


======================
An REPL session

>>> t
[I, 6, [I, 2, [L, 1], [I, 4, [L, 3], [L, 5]]], [I, 8, [L, 7], [L, 9]]]
>>> dfs(t)
[6, 2, 1, 4, 3, 5, 8, 7, 9]
>>> t8
[I, 8, [L, 7], [L, 9]]
>>> dfs(t8)
[8, 7, 9]
>>> 

[toc] | [prev] | [next] | [standalone]


#84065

FromRustom Mody <rustompmody@gmail.com>
Date2015-01-20 05:51 -0800
Message-ID<fb48919c-f1e9-466e-a9b9-4adc9b65e2ab@googlegroups.com>
In reply to#84064
On Tuesday, January 20, 2015 at 7:03:56 PM UTC+5:30, Rustom Mody wrote:
> On Tuesday, January 20, 2015 at 11:38:27 AM UTC+5:30, Terry Reedy wrote:
> > On 1/19/2015 5:06 PM, Zachary Gilmartin wrote:
> > > Why aren't there trees in the python standard library?
> > 
> > Sequences nested withing sequences can be regarded as trees, and Python 
> > has these.  I regard Lisp as a tree processing languages, as it must be 
> > to manipulate, for example, code with nested structures.
> 
> Yeah python has trees alright.

It may be best to read my example above in this order:
1. See the picture at http://www-math.ucdenver.edu/~wcherowi/courses/m4408/gtln8.html

2. Take the python representation of that

>>> t
[I, 6, [I, 2, [L, 1], [I, 4, [L, 3], [L, 5]]], [I, 8, [L, 7], [L, 9]]] 

Indent that a bit:

[I, 6, 
   [I, 2, 
       [L, 1], 
       [I, 4, 
           [L, 3], 
           [L, 5]]], 
   [I, 8, 
       [L, 7], 
       [L, 9]]] 

Now compare with the picture above
Only catch is you must see it 'lying down'
Shouldn't be too hard given that we've all got used to seeing :-) as a smiley :-)

[toc] | [prev] | [next] | [standalone]


#84067

FromMarko Rauhamaa <marko@pacujo.net>
Date2015-01-20 16:15 +0200
Message-ID<87fvb58rmz.fsf@elektro.pacujo.net>
In reply to#84064
Rustom Mody <rustompmody@gmail.com>:

> Yeah python has trees alright.

Does Python balance them for you?


Marko

[toc] | [prev] | [next] | [standalone]


#84069

FromRustom Mody <rustompmody@gmail.com>
Date2015-01-20 06:35 -0800
Message-ID<915e5162-ae50-4af0-a2e3-8a81e2d0ba79@googlegroups.com>
In reply to#84067
On Tuesday, January 20, 2015 at 7:46:02 PM UTC+5:30, Marko Rauhamaa wrote:
> Rustom Mody :
> 
> > Yeah python has trees alright.
> 
> Does Python balance them for you?

No

Does python support a menagerie of lists like C
- singly linked, doubly linked, with header, without header etc?

Or access to machine registers like assembly?

And are these differences from C/Assembly in favor or against python?

[toc] | [prev] | [next] | [standalone]


Page 1 of 2  [1] 2  Next page →

Back to top | Article view | comp.lang.python


csiph-web