

Groups > comp.lang.python > #17366 > unrolled thread

Making the case for "typed" lists/iterators in python

Started by: Nathan Rice <nathan.alexander.rice@gmail.com>
First post: 2011-12-16 12:48 -0500
Last post: 2011-12-19 10:56 -0500
Articles: 10 (8 participants)



Contents

  Making the case for "typed" lists/iterators in python Nathan Rice <nathan.alexander.rice@gmail.com> - 2011-12-16 12:48 -0500
    Re: Making the case for "typed" lists/iterators in python Roy Smith <roy@panix.com> - 2011-12-16 13:05 -0500
      Re: Making the case for "typed" lists/iterators in python Chris Angelico <rosuav@gmail.com> - 2011-12-17 05:25 +1100
      Re: Making the case for "typed" lists/iterators in python Arnaud Delobelle <arnodel@gmail.com> - 2011-12-16 18:38 +0000
      Re: Making the case for "typed" lists/iterators in python Chris Angelico <rosuav@gmail.com> - 2011-12-17 05:43 +1100
        Re: Making the case for "typed" lists/iterators in python Mel Wilson <mwilson@the-wire.com> - 2011-12-16 15:14 -0500
      Re: Making the case for "typed" lists/iterators in python Ben Finney <ben+python@benfinney.id.au> - 2011-12-17 07:17 +1100
      Re: Making the case for "typed" lists/iterators in python Terry Reedy <tjreedy@udel.edu> - 2011-12-16 17:21 -0500
    Re: Making the case for "typed" lists/iterators in python Ulrich Eckhardt <ulrich.eckhardt@dominolaser.com> - 2011-12-19 14:16 +0100
      Re: Making the case for "typed" lists/iterators in python Nathan Rice <nathan.alexander.rice@gmail.com> - 2011-12-19 10:56 -0500

#17366 — Making the case for "typed" lists/iterators in python

From: Nathan Rice <nathan.alexander.rice@gmail.com>
Date: 2011-12-16 12:48 -0500
Subject: Making the case for "typed" lists/iterators in python
Message-ID: <mailman.3739.1324057724.27778.python-list@python.org>

I realize this has been discussed in the past, but I hope that I am
presenting a slightly different take on the subject that will prove
interesting.  This is primarily motivated by my annoyance with using
comprehensions in certain circumstances.

Currently, if you want to perform successive transformations on the
elements of a list, there are a couple of options:

1. Successive comprehensions:

L2 = [X(e) for e in L1]
L3 = [Y(e) for e in L2]
L4 = [Z(e) for e in L3]
or
L2 = [e.X() for e in L1]

This gets the job done and gives you access to all the intermediate
values, but isn't very succinct, particularly if you are in the habit
of using informative identifiers.

2. One comprehension:

L2 = [Z(Y(X(e))) for e in L1]
or
L2 = [e.X().Y().Z() for e in L1]

This gets the job done, but doesn't give you access to all the
intermediate values, and tends to be pretty awful to read.

Having "typed" lists lets you take preexisting string/int/etc methods
and expose them in a vectorized context and provides an easy way for
developers to support both vectors and scalars in a single function
(you could easily "fix" other people's functions dynamically to
support both).  Additionally, "typed" lists/iterators will allow
improved code analysis and optimization.  The PyPy people have already
stated that they are working on implementing different strategies for
lists composed of a single type, so clearly there is already community
movement in this direction.

Just compare the above examples to their type-aware counterparts:

L2 = X(L1)
L2 = L1.X()

L2 = Z(Y(X(L1)))
L2 = L1.X().Y().Z()

Also, this would provide a way to clean up stuff like:

"\n".join(l.capitalize() for l in my_string.split("\n"))

into:

my_string.split("\n").capitalize().join_this("\n")
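
A minimal sketch of how that chained form could be tried out today, assuming a hypothetical StringList subclass (neither StringList nor join_this exists in Python, and str.split returns a plain list, so the wrapping has to be explicit here):

```python
class StringList(list):
    """Hypothetical list subclass that forwards str methods to every element."""

    def __getattr__(self, name):
        # Only called when normal lookup fails, so list methods are unaffected.
        if not hasattr(str, name):
            raise AttributeError(name)

        def vectorized(*args, **kwargs):
            # Apply the str method to each element and keep the result typed.
            return StringList(getattr(s, name)(*args, **kwargs) for s in self)

        return vectorized

    def join_this(self, separator):
        # Name taken from the example above; joins the elements with separator.
        return separator.join(self)


my_string = "first line\nsecond line"
result = StringList(my_string.split("\n")).capitalize().join_this("\n")
print(result)
```

Names that exist on both list and str (count and index, for example) resolve to the list version in this sketch, which is one of the ambiguities a real design would have to settle.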

Before anyone gets up in arms at the idea of statically typed python,
what I am suggesting here would be looser than that.  Basically, I
believe it would be a good idea in instances where it is known that a
list of single type is going to be returned, to return a list subclass
(for example, StringList, IntegerList, etc).  To avoid handcuffing
people with types, the standard list modification methods could be
hooked so that if an object of an incorrect type is placed in the
list, a warning is raised and the list converts to a generic object
list.  The only stumbling block is that you can't use __class__ to
convert from static (built-in) types to heap types in CPython.  My workaround for
this would be to have a factory that creates generic "List" classes,
modifying the bases to produce the correct behavior.  Then, converting
from a typed list to a generic object list would just be a matter of
removing a member from the bases for a class.  This of course
basically kills the ability to perform type specific list optimization
in CPython, but that isn't necessarily true for other implementations.
 The additional type information would be preserved for code analysis
in any case.  The case would be even simpler for generators and other
iterators, as you don't have to worry about mutation.
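
The demote-with-a-warning behaviour can be sketched roughly as follows. This is not the class-factory approach described above; it swaps in a simpler stand-in that tracks the element type in an attribute, and the TypedList name and its methods are hypothetical. Only append and single-item assignment are hooked, and the initial items are not checked.

```python
import warnings


class TypedList(list):
    """Hypothetical list that remembers one element type until it is violated."""

    def __init__(self, element_type, items=()):
        super().__init__(items)
        self.element_type = element_type  # None means "generic object list"

    def _check(self, item):
        if self.element_type is not None and not isinstance(item, self.element_type):
            warnings.warn(
                "demoting to a generic list: got %s, expected %s"
                % (type(item).__name__, self.element_type.__name__),
                RuntimeWarning,
                stacklevel=3,
            )
            self.element_type = None  # demote: stop tracking a type

    def append(self, item):
        self._check(item)
        super().append(item)

    def __setitem__(self, index, item):
        # Slice assignment is not type-checked in this sketch.
        if not isinstance(index, slice):
            self._check(item)
        super().__setitem__(index, item)


names = TypedList(str, ["a", "b"])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    names.append(3)  # wrong type: warns once and demotes the list
    names.append(4)  # already generic: no further warning
```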

I'd like to hear people's thoughts on the subject.  Currently we are
throwing away useful information in many cases that could be used for
code analysis, optimization and simpler interfaces.  I believe that
"typed" lists that get "demoted" to normal lists with a warning on out
of type operations preserve this information while providing complete
backwards compatibility and freedom.

Nathan



#17370

From: Roy Smith <roy@panix.com>
Date: 2011-12-16 13:05 -0500
Message-ID: <roy-CE17C0.13053616122011@news.panix.com>
In reply to: #17366

In article <mailman.3739.1324057724.27778.python-list@python.org>,
 Nathan Rice <nathan.alexander.rice@gmail.com> wrote:

> I'd like to hear people's thoughts on the subject.  Currently we are
> throwing away useful information in many cases that could be used for
> code analysis, optimization and simpler interfaces. 

Most of this was TL:DNR, but I will admit I often wish for a better way 
to log intermediate values.  For example, a common pattern in the code 
I'm working with now is functions that end in:

   return [Foo(x) for x in bunch_of_x_thingies]

When something goes amiss and I want to debug the problem, I often 
transform that into:

    temp = [Foo(x) for x in bunch_of_x_thingies]
    logger.debug(temp)
    return temp

It would be convenient to be able to get at and log the intermediate 
value without having to pull it out to an explicit temporary.



#17373

From: Chris Angelico <rosuav@gmail.com>
Date: 2011-12-17 05:25 +1100
Message-ID: <mailman.3745.1324059914.27778.python-list@python.org>
In reply to: #17370

On Sat, Dec 17, 2011 at 5:05 AM, Roy Smith <roy@panix.com> wrote:
> Most of this was TL:DNR, but I will admit I often wish for a better way
> to log intermediate values.  For example, a common pattern in the code
> I'm working with now is functions that end in:
>
>   return [Foo(x) for x in bunch_of_x_thingies]
>
> When something goes amiss and I want to debug the problem, I often
> transform that into:
>
>    temp = [Foo(x) for x in bunch_of_x_thingies]
>    logger.debug(temp)
>    return temp
>
> It would be convenient to be able to get at and log the intermediate
> value without having to pull it out to an explicit temporary.

tee = lambda func,arg: (func(arg),arg)[1]

return tee(logger.debug,[Foo(x) for x in bunch_of_x_thingies])

ChrisA



#17375

From: Arnaud Delobelle <arnodel@gmail.com>
Date: 2011-12-16 18:38 +0000
Message-ID: <mailman.3746.1324060688.27778.python-list@python.org>
In reply to: #17370

On 16 December 2011 18:25, Chris Angelico <rosuav@gmail.com> wrote:

> tee = lambda func,arg: (func(arg),arg)[1]

What a strange way to spell it!

def tee(func, arg):
    func(arg)
    return arg

-- 
Arnaud



#17376

From: Chris Angelico <rosuav@gmail.com>
Date: 2011-12-17 05:43 +1100
Message-ID: <mailman.3747.1324061000.27778.python-list@python.org>
In reply to: #17370

On Sat, Dec 17, 2011 at 5:38 AM, Arnaud Delobelle <arnodel@gmail.com> wrote:
> On 16 December 2011 18:25, Chris Angelico <rosuav@gmail.com> wrote:
>
>> tee = lambda func,arg: (func(arg),arg)[1]
>
> What a strange way to spell it!
>
> def tee(func, arg):
>    func(arg)
>    return arg

I started with that version and moved to the lambda for compactness.
But either way works.

It's no more strange than the way some people omit the u from colour. :)

ChrisA



#17381

From: Mel Wilson <mwilson@the-wire.com>
Date: 2011-12-16 15:14 -0500
Message-ID: <jcg8rm$mhd$1@speranza.aioe.org>
In reply to: #17376

Chris Angelico wrote:

> It's no more strange than the way some people omit the u from colour. :)

Bonum Petronio Arbiteri, bonum mihi. [Good enough for Petronius Arbiter, good enough for me.]

	Mel.



#17382

From: Ben Finney <ben+python@benfinney.id.au>
Date: 2011-12-17 07:17 +1100
Message-ID: <87hb10z2yc.fsf@benfinney.id.au>
In reply to: #17370

Roy Smith <roy@panix.com> writes:

> When something goes amiss and I want to debug the problem, I often 
> transform that into:
>
>     temp = [Foo(x) for x in bunch_of_x_thingies]
>     logger.debug(temp)
>     return temp
>
> It would be convenient to be able to get at and log the intermediate 
> value without having to pull it out to an explicit temporary.

What's wrong with that though?

You're not “pulling it out to” anything; you're binding a name to the
value in order to do several things with it. Exactly what you say you
want to do. It's explicit and clear.

-- 
 \      “What we usually pray to God is not that His will be done, but |
  `\                       that He approve ours.” —Helga Bergold Gross |
_o__)                                                                  |
Ben Finney



#17390

From: Terry Reedy <tjreedy@udel.edu>
Date: 2011-12-16 17:21 -0500
Message-ID: <mailman.3755.1324074096.27778.python-list@python.org>
In reply to: #17370

On 12/16/2011 1:05 PM, Roy Smith wrote:

> I'm working with now is functions that end in:
>
>     return [Foo(x) for x in bunch_of_x_thingies]
>
> When something goes amiss and I want to debug the problem, I often
> transform that into:
>
>      temp = [Foo(x) for x in bunch_of_x_thingies]
>      logger.debug(temp)
>      return temp
>
> It would be convenient to be able to get at and log the intermediate
> value without having to pull it out to an explicit temporary.

Decorate the function with @logreturn and you do not have to touch the 
function code. If the logging module does not now have such a decorator 
predefined (I simply do not know), perhaps it should.
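
A decorator along these lines is short to write by hand. The sketch below is only an assumption about what such a helper might look like: logreturn is a hypothetical name rather than something the logging module is known to provide, and str stands in for Foo, which is not defined in this thread.

```python
import functools
import logging

logger = logging.getLogger(__name__)


def logreturn(func):
    """Hypothetical decorator: log the return value of func at DEBUG level."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        logger.debug("%s returned %r", func.__name__, result)
        return result

    return wrapper


@logreturn
def make_foos(bunch_of_x_thingies):
    # str stands in for the thread's Foo, which is not defined here.
    return [str(x) for x in bunch_of_x_thingies]
```

The function body keeps its single return statement; removing the decorator line turns the logging off again.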

-- 
Terry Jan Reedy



#17510

From: Ulrich Eckhardt <ulrich.eckhardt@dominolaser.com>
Date: 2011-12-19 14:16 +0100
Message-ID: <q2q3s8-8fu.ln1@satorlaser.homedns.org>
In reply to: #17366

On 16.12.2011 18:48, Nathan Rice wrote:
> I realize this has been discussed in the past, but I hope that I am
> presenting a slightly different take on the subject that will prove
> interesting.  This is primarily motivated by my annoyance with using
> comprehensions in certain circumstances.
[...]
> Having "typed" lists lets you take preexisting string/int/etc methods
> and expose them in a vectorized context and provides an easy way for
> developers to support both vectors and scalars in a single function
> (you could easily "fix" other people's functions dynamically to
> support both).  Additionally, "typed" lists/iterators will allow
> improved code analysis and optimization.  The PyPy people have
> already stated that they are working on implementing different
> strategies for lists composed of a single type, so clearly there is
> already community movement in this direction.
>
> Just compare the above examples to their type-aware counterparts:
>
> L2 = X(L1)
> L2 = L1.X()
>
> L2 = Z(Y(X(L1)))
> L2 = L1.X().Y().Z()
>
> Also, this would provide a way to clean up stuff like:
>
> "\n".join(l.capitalize() for l in my_string.split("\n"))
>
> into:
>
> my_string.split("\n").capitalize().join_this("\n")
>
> Before anyone gets up in arms at the idea of statically typed
> python, what I am suggesting here would be looser than that.
> Basically, I believe it would be a good idea in instances where it is
> known that a list of single type is going to be returned, to return a
> list subclass (for example, StringList, IntegerList, etc).  To avoid
> handcuffing people with types, the standard list modification methods
> could be hooked so that if an object of an incorrect type is placed
> in the list, a warning is raised and the list converts to a generic
> object list.  The only stumbling block is that you can't use
> __class__ to convert from static (built-in) types to heap types in CPython.  My
> workaround for this would be to have a factory that creates generic
> "List" classes, modifying the bases to produce the correct behavior.
> Then, converting from a typed list to a generic object list would
> just be a matter of removing a member from the bases for a class.
> This of course basically kills the ability to perform type specific
> list optimization in CPython, but that isn't necessarily true for
> other implementations. The additional type information would be
> preserved for code analysis in any case.  The case would be even
> simpler for generators and other iterators, as you don't have to
> worry about mutation.
>
> I'd like to hear people's thoughts on the subject.  Currently we are
> throwing away useful information in many cases that could be used
> for code analysis, optimization and simpler interfaces.

I think there are two aspects to your idea:
1. collections that share a single type
2. accessing multiple elements via a common interface

Both are things that should be considered and I think both are useful in 
some contexts. The former would provide additional guarantees, for 
example, you could safely look up an attribute of the type only once 
while iterating over the sequence and use it for all elements. Also, I 
believe you could save some storage.
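
As a small illustration of the single-lookup point, assuming every element is known to be a str (the variable names here are made up for the example):

```python
words = ["alpha", "beta", "gamma"]

# Look the method up once on the type instead of once per element.
upper = str.upper
shouted = [upper(w) for w in words]
print(shouted)
```

This is only safe because the elements share one type; with a mixed list, str.upper would raise a TypeError on the first non-string element.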

The second aspect would mean that you have a single function call that 
targets multiple objects, which is syntactic sugar, but that's a good 
thing. To some extent, this looks like a C++ valarray, see e.g. [1] and 
[2] (note that I don't trust [1] and that [2] is perhaps a bit 
outdated), in case you know C++ and want to draw some inspiration from this.

Anyway, I believe something like that would already be possible today, 
which would give people something they could actually try out instead of 
just musing about:

    class ValarrayWrapper(object):
        def __init__(self, elements):
            self._elements = elements
        def map(self, function):
            # Return a new wrapper holding the transformed elements.
            tmp = [function(x) for x in self._elements]
            return ValarrayWrapper(tmp)
        def apply(self, function):
            # Transform the wrapped elements in place.
            self._elements[:] = [function(x) for x in self._elements]

I could even imagine this to implement "generic" attribute lookup by 
looking at the first element. If it contains the according attribute, 
return a proxy that allows calls to member functions or property access, 
depending on the type of the attribute.
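
A rough sketch of that lookup, under the assumption that the first element is representative of all the others. The class name is reused for illustration only, tolist is a hypothetical helper for getting the values back out, and an empty wrapper simply raises AttributeError.

```python
class ValarrayWrapper(object):
    def __init__(self, elements):
        self._elements = list(elements)

    def __getattr__(self, name):
        # Called only for names not found on the wrapper itself.
        if name.startswith("_") or not self._elements:
            raise AttributeError(name)
        first = getattr(self._elements[0], name)  # AttributeError if missing
        if callable(first):
            # Method: return a proxy that calls it on every element.
            def call_each(*args, **kwargs):
                return ValarrayWrapper(
                    getattr(x, name)(*args, **kwargs) for x in self._elements
                )
            return call_each
        # Plain attribute or property: collect the value from every element.
        return ValarrayWrapper(getattr(x, name) for x in self._elements)

    def tolist(self):
        # Hypothetical helper: return the wrapped values as a plain list.
        return list(self._elements)


upper_names = ValarrayWrapper(["ann", "bob"]).upper().tolist()
real_parts = ValarrayWrapper([1 + 2j, 3 + 4j]).real.tolist()
```

Deciding from the first element alone is a shortcut; a later element of a different type would fail only when it is reached.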


> I believe that "typed" lists that get "demoted" to normal lists with
> a warning on out of type operations preserve this information while
> providing complete backwards compatibility and freedom.

I don't think that a warning helps people write correct code, a 
meaningful error does. Otherwise, with the same argument you could 
convert a tuple on the fly to a list when someone tries to change an 
element of it.


Cheers!

Uli



[1] http://www.cplusplus.com/reference/std/valarray/abs/
[2] http://drdobbs.com/184403620



#17515

From: Nathan Rice <nathan.alexander.rice@gmail.com>
Date: 2011-12-19 10:56 -0500
Message-ID: <mailman.3822.1324310185.27778.python-list@python.org>
In reply to: #17510

> I think there are two aspects to your idea:
> 1. collections that share a single type
> 2. accessing multiple elements via a common interface

You are correct, and I now regret posing them in a coupled manner.

> Both are things that should be considered and I think both are useful in
> some contexts. The former would provide additional guarantees, for example,
> you could safely look up an attribute of the type only once while iterating
> over the sequence and use it for all elements. Also, I believe you could
> save some storage.
>
> The second aspect would mean that you have a single function call that
> targets multiple objects, which is syntactic sugar, but that's a good thing.
> To some extent, this looks like a C++ valarray, see e.g. [1] and [2] (note
> that I don't trust [1] and that [2] is perhaps a bit outdated), in case you
> know C++ and want to draw some inspiration from this.
>
> Anyway, I believe something like that would already be possible today, which
> would give people something they could actually try out instead of just
> musing about:
>
>   class ValarrayWrapper(object):
>       def __init__(self, elements):
>           self._elements = elements
>       def map(self, function):
>           tmp = [function(x) for x in self._elements]
>           return ValarrayWrapper(tmp)
>       def apply(self, function):
>           self._elements[:] = [function(x) for x in self._elements]
>
> I could even imagine this to implement "generic" attribute lookup by looking
> at the first element. If it contains the according attribute, return a proxy
> that allows calls to member functions or property access, depending on the
> type of the attribute.

Thank you for the references, I am always interested to see how other
languages solve problems.

I have received the "code please" comment repeatedly, so I will have to
take some time after work today to deliver.

>> I believe that "typed" lists that get "demoted" to normal lists with
>> a warning on out of type operations preserve this information while
>> providing complete backwards compatibility and freedom.
>
>
> I don't think that a warning helps people write correct code, a meaningful
> error does. Otherwise, with the same argument you could convert a tuple on
> the fly to a list when someone tries to change an element of it.

I do agree errors are more normative than warnings.  The problem with
an error in these circumstances is it will certainly break code
somewhere.  Perhaps a warning that becomes an error at some point in
the future would be the prudent way to go.
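
The standard warnings module already supports this kind of staged strictness, since a filter can turn a chosen warning category into an exception. A minimal sketch, in which TypeDemotionWarning and add_item are hypothetical names invented for the example:

```python
import warnings


class TypeDemotionWarning(UserWarning):
    """Hypothetical category for out-of-type list operations."""


def add_item(items, item, element_type):
    # Warn instead of failing when the item does not match the expected type.
    if not isinstance(item, element_type):
        warnings.warn("out-of-type item: %r" % (item,), TypeDemotionWarning)
    items.append(item)


# Default behaviour: the call succeeds and only a warning is reported.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    lenient = ["a"]
    add_item(lenient, 1, str)

# Opt-in strictness: the same warning is raised as an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error", TypeDemotionWarning)
    strict = ["a"]
    try:
        add_item(strict, 1, str)
        raised = False
    except TypeDemotionWarning:
        raised = True
```

Code that wants the stricter behaviour early can install the "error" filter itself, while everything else keeps running with the warning.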

Thanks!

Nathan


