


Differences creating tuples and collections.namedtuples

Started by: John Reid <johnbaronreid@gmail.com>
First post: 2013-02-18 03:47 -0800
Last post: 2013-02-19 09:36 +0000
Articles 13 on this page of 33 — 11 participants



Contents

  Differences creating tuples and collections.namedtuples John Reid <johnbaronreid@gmail.com> - 2013-02-18 03:47 -0800
    Re: Differences creating tuples and collections.namedtuples Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-02-18 12:03 +0000
    Re: Differences creating tuples and collections.namedtuples Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-02-18 12:05 +0000
    Re: Differences creating tuples and collections.namedtuples Dave Angel <davea@davea.name> - 2013-02-18 07:11 -0500
    Re: Differences creating tuples and collections.namedtuples John Reid <johnbaronreid@gmail.com> - 2013-02-18 13:49 +0000
    Re: Differences creating tuples and collections.namedtuples John Reid <johnbaronreid@gmail.com> - 2013-02-18 13:51 +0000
    Re: Differences creating tuples and collections.namedtuples John Reid <j.reid@mail.cryst.bbk.ac.uk> - 2013-02-18 14:09 +0000
      Re: Differences creating tuples and collections.namedtuples raymond.hettinger@gmail.com - 2013-02-18 23:48 -0800
        Re: Differences creating tuples and collections.namedtuples Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-02-19 08:06 +0000
        Re: Differences creating tuples and collections.namedtuples Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-02-19 08:57 +0000
        Re: Differences creating tuples and collections.namedtuples Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-02-19 08:06 +0000
        Re: Differences creating tuples and collections.namedtuples Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-02-19 08:05 +0000
      Re: Differences creating tuples and collections.namedtuples raymond.hettinger@gmail.com - 2013-02-18 23:48 -0800
    Re: Differences creating tuples and collections.namedtuples Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-02-18 14:15 +0000
    Re: Differences creating tuples and collections.namedtuples John Reid <johnbaronreid@gmail.com> - 2013-02-18 14:18 +0000
    Re: Differences creating tuples and collections.namedtuples Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-02-18 14:12 +0000
    Re: Differences creating tuples and collections.namedtuples John Reid <j.reid@mail.cryst.bbk.ac.uk> - 2013-02-18 14:23 +0000
    Re: Differences creating tuples and collections.namedtuples Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-02-18 14:53 +0000
    Re: Differences creating tuples and collections.namedtuples John Reid <j.reid@mail.cryst.bbk.ac.uk> - 2013-02-18 15:07 +0000
    Re: Differences creating tuples and collections.namedtuples Terry Reedy <tjreedy@udel.edu> - 2013-02-18 16:28 -0500
      Re: Differences creating tuples and collections.namedtuples Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-02-19 11:18 +1100
        Re: Differences creating tuples and collections.namedtuples Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-02-19 01:43 +0000
          Re: Differences creating tuples and collections.namedtuples Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-02-19 14:06 +1100
            Re: Differences creating tuples and collections.namedtuples Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2013-02-19 21:27 +1300
        Re: Differences creating tuples and collections.namedtuples Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2013-02-19 20:54 +1300
        Re: Differences creating tuples and collections.namedtuples John Reid <j.reid@mail.cryst.bbk.ac.uk> - 2013-02-19 09:30 +0000
        Re: Differences creating tuples and collections.namedtuples Terry Reedy <tjreedy@udel.edu> - 2013-02-19 22:38 -0500
          Re: Differences creating tuples and collections.namedtuples Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-02-22 10:38 +0000
        Re: Differences creating tuples and collections.namedtuples Chris Angelico <rosuav@gmail.com> - 2013-02-20 17:50 +1100
          Re: Differences creating tuples and collections.namedtuples Roy Smith <roy@panix.com> - 2013-02-20 09:09 -0500
      Re: Differences creating tuples and collections.namedtuples Roy Smith <roy@panix.com> - 2013-02-18 20:11 -0500
    Re: Differences creating tuples and collections.namedtuples alex23 <wuwei23@gmail.com> - 2013-02-18 17:47 -0800
      Re: Differences creating tuples and collections.namedtuples John Reid <j.reid@mail.cryst.bbk.ac.uk> - 2013-02-19 09:36 +0000



#39146

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-02-19 11:18 +1100
Message-ID: <5122c4d7$0$29982$c3e8da3$5496439d@news.astraweb.com>
In reply to: #39130

Terry Reedy wrote:

> On 2/18/2013 6:47 AM, John Reid wrote:
> 
>> I was hoping namedtuples could be used as replacements for tuples
>> in all instances.
> 
> This is a mistake in the following two senses. First, tuple is a class
> with instances while namedtuple is a class factory that produces
> classes. (One could think of namedtuple as a metaclass, but it was not
> implemented that way.) 


I think you have misunderstood. I don't believe that John wants to use the
namedtuple factory instead of tuple. He wants to use a namedtuple type
instead of tuple.

That is, given:

from collections import namedtuple
Point3D = namedtuple('Point3D', 'x y z')

he wants to use a Point3D instead of a tuple. Since:

issubclass(Point3D, tuple) 

holds true, the Liskov Substitution Principle (LSP) tells us that anything
that is true for a tuple should also be true for a Point3D. That is, given
that instance x might be either a builtin tuple or a Point3D, all of the
following hold:

- isinstance(x, tuple) returns True
- len(x) returns the length of x
- hash(x) returns the hash of x
- x[i] returns item i of x, or raises IndexError
- del x[i] raises TypeError
- x + a_tuple returns a new tuple
- x.count(y) returns the number of items equal to y

etc. Basically, any code expecting a tuple should continue to work if you
pass it a Point3D instead (or any other namedtuple).
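
The checklist above can be exercised directly. What follows is only a sketch, assuming Python 3; the helper name tuple_contract_holds is made up for illustration and does not come from the thread.

```python
from collections import namedtuple

Point3D = namedtuple('Point3D', 'x y z')

def tuple_contract_holds(x):
    """Return True if x shows the read-only tuple behaviours listed above.

    Hypothetical helper; assumes x holds the items 1, 2, 3.
    """
    try:
        del x[0]                      # tuples do not support item deletion
        deletion_refused = False
    except TypeError:
        deletion_refused = True
    return all([
        isinstance(x, tuple),
        len(x) == 3,
        isinstance(hash(x), int),
        x[0] == 1,
        deletion_refused,
        x + (4,) == (1, 2, 3, 4),     # concatenation gives a new tuple
        x.count(1) == 1,
    ])

results = [tuple_contract_holds((1, 2, 3)),
           tuple_contract_holds(Point3D(1, 2, 3))]
print(results)
```

Both the builtin tuple and the Point3D instance pass the same checks, which is the sense in which instances are substitutable.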

There is one conspicuous exception to this: the constructor:

type(x)(args)

behaves differently depending on whether x is a builtin tuple, or a Point3D.

The LSP is about *interfaces* and the contracts we make about those
interfaces, rather than directly about inheritance. Inheritance is just a
mechanism for allowing types to automatically get the same interface as
another type. Another way to put this: LSP is about duck-typing. In this
case, if we have two instances:

x = (1, 2, 3)
y = Point3D(4, 5, 6)


then x and y:

- quack like tuples
- swim like tuples
- fly like tuples
- walk like tuples
- eat the same things as tuples
- taste very nice cooked with orange sauce like tuples

etc., but y does not lay eggs like x. The x constructor takes a single
iterable argument, while the y constructor requires one argument per field.

You can read more about LSP here:

http://en.wikipedia.org/wiki/Liskov_substitution_principle

although I don't think this is the most readable Wikipedia article, and the
discussion of mutability is a red herring. Or you can try this:

http://c2.com/cgi/wiki?LiskovSubstitutionPrinciple

although even by c2 wiki standards, it's a bit of a mess. These might help
more:

http://blog.thecodewhisperer.com/2013/01/08/liskov-substitution-principle-demystified/

http://lassala.net/2010/11/04/a-good-example-of-liskov-substitution-principle/


> Second, a tuple instance can have any length and 
> different instances can have different lengths. On the other hand, all
> instances of a particular namedtuple class have a fixed length.

This is a subtle point. If your contract is, "I must be able to construct an
instance with a variable number of items", then namedtuples are not
substitutable for builtin tuples. But I think this is an *acceptable*
violation of LSP, since we're deliberately restricting a namedtuple to a
fixed length. But within the constraints of that fixed length, we should be
able to substitute a namedtuple for any tuple of that same length.


> This 
> affects their initialization. So does the fact that Oscar mentioned,
> that fields can be initialized by name.

Constructing namedtuples by name is not a violation, since it *adds*
behaviour, it doesn't take it away. If you expect a tuple, you cannot
construct it with:

t = tuple(spam=a, ham=b, eggs=c)

since that doesn't work. You have to construct it from an iterable, or more
likely a literal:

t = (a, b, c)

Literals are special, since they are a property of the *interpreter*, not
the tuple type. To put it another way, the interpreter understands (a,b,c)
as syntax for constructing a tuple, the tuple type does not. So we cannot
expect to use (a,b,c) syntax to construct a MyTuple instance, or a Point3D
instance instead.

If we hope to substitute a subclass, we have to use the tuple constructor
directly:

type_to_use = tuple
t = type_to_use([a, b, c])

Duck-typing, and the LSP, tells us that we should be able to substitute a
Point3D for this:

type_to_use = namedtuple('Point3D', 'x y z')
t = type_to_use([a, b, c])

but we can't. And that is an important violation of LSP.
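
To make the failure concrete, here is a small sketch, assuming Python 3. The names values, plain, via_make and via_star are illustrative only. It also shows the two spellings that do work for a namedtuple class: argument unpacking and the documented _make classmethod.

```python
from collections import namedtuple

Point3D = namedtuple('Point3D', 'x y z')
values = [1, 2, 3]

plain = tuple(values)              # builtin tuple: one iterable argument

try:
    Point3D(values)                # the whole list is bound to x; y and z are missing
    failed = False
except TypeError:
    failed = True

via_make = Point3D._make(values)   # classmethod that takes a single iterable
via_star = Point3D(*values)        # unpack the iterable into x, y, z

print(failed, via_make, via_star)
```

Neither working spelling helps generic code that only knows type(x) and calls it with one iterable, which is the situation being discussed.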

There could be three fixes to this, none of them practical:

1) tuple could accept multiple arguments, tuple(a, b, c) => (a, b, c), but
that conflicts with the existing use tuple(iterable). If Python had had *
argument unpacking way back in the early days, it might have been better to
give tuples the signature tuple(*args), but it didn't and so it doesn't and
we can't change that now.

2) namedtuples could accept a single iterable argument like tuple does, but
that conflicts with the desired signature pt = Point3D(1, 2, 3).

3) namedtuples should not claim to be tuples, which is probably the
least-worst fix. Backwards-compatibility rules out making this change, but
even if it didn't, namedtuples quack like tuples, swim like tuples, and
walk like tuples, so even if they aren't a subclass of tuple it would still
be reasonable to want them to lay eggs like tuples.

So I don't believe there is any good solution to this, except the ad-hoc one
of overriding the __new__ constructor when needed.
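
One possible shape of that ad-hoc override, as a sketch only and assuming Python 3. IterPoint3D is a made-up name; the subclass trades away the Point3D(1, 2, 3) call style in exchange for the tuple-like one.

```python
from collections import namedtuple

class IterPoint3D(namedtuple('Point3D', 'x y z')):
    """Hypothetical subclass whose constructor takes one iterable, like tuple."""
    __slots__ = ()

    def __new__(cls, iterable):
        # Unpack the iterable into the generated (x, y, z) signature.
        return super().__new__(cls, *iterable)

p = IterPoint3D([1, 2, 3])
q = type(p)(p)        # the tuple-style round trip now works
print(p.x, q == p)
```

Anything that still relies on the generated positional signature (direct calls with three arguments, and possibly copying or pickling) would need the same care, so this is a local workaround rather than a general fix.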


>> There seem to be some differences between how tuples and namedtuples
>> are created. For example with a tuple I can do:
>>
>> a=tuple([1,2,3])
> 
> But no sensible person would ever do that, since it creates an
> unnecessary list and is equivalent to
> 
> a = 1,2,3

Well, no, not as given. But it should be read as just an illustration. In
practice, code like this is not uncommon:

a = tuple(some_iterable)

 
[...]
> It is much less common to change tuple(iterable) to B(iterable).

Less common or not, duck-typing and the LSP tell us we should be able to do
so. We cannot.


>> Is this a problem with namedtuples, ipython or just a feature?
> 
> With canSequence. If isinstance was available and the above were written
> before list and tuple could be subclassed, canSequence was sensible when
> written. But as Oscar said, it is now a mistake for canSequence to
> assume that all subclasses of list and tuple have the same
> initialization api.

No, it is not a mistake. It is a problem with namedtuples that they violate
the expectation that they should have the same constructor signature as
other tuples. After all, namedtuples *are* tuples, they should be
constructed the same way. But they aren't, so that violates a reasonable
expectation.

Is the convenience of being able to write Point3D(1, 2, 3) more important
than LSP-purity? Perhaps. I suspect that will be the answer Raymond
Hettinger might give. I'm 85% inclined to agree with this answer.


> In fact, one reason to subclass a class is to change the initialization
> api.

That might be a reason that people give, but it's a bad reason from the
perspective of interface contracts, duck-typing and the LSP.

Of course, these are not the *only* perspectives. There is no rule that
states that one must always obey the interface contracts of one's parent
class. But if you don't, you will be considered an "ill-behaved" subclass
for violating the promises made by your type.



-- 
Steven



#39161

From: Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date: 2013-02-19 01:43 +0000
Message-ID: <mailman.1991.1361238215.2939.python-list@python.org>
In reply to: #39146

On 19 February 2013 00:18, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> Terry Reedy wrote:
>> On 2/18/2013 6:47 AM, John Reid wrote:
[snip]
>>> Is this a problem with namedtuples, ipython or just a feature?
>>
>> With canSequence. If isinstance was available and the above were written
>> before list and tuple could be subclassed, canSequence was sensible when
>> written. But as Oscar said, it is now a mistake for canSequence to
>> assume that all subclasses of list and tuple have the same
>> initialization api.
>
> No, it is not a mistake. It is a problem with namedtuples that they violate
> the expectation that they should have the same constructor signature as
> other tuples. After all, namedtuples *are* tuples, they should be
> constructed the same way. But they aren't, so that violates a reasonable
> expectation.

It is a mistake. A namedtuple class instance provides all of the
methods/operators provided by a tuple. This should be sufficient to
fill the tuplishness contract. Requiring that obj satisfy a contract
is one thing. When you get to the point of requiring that type(obj)
must do so as well you have gone beyond duck-typing and the normal
bounds of polymorphism.

It's still unclear what the purpose of canSequence is, but I suspect
that there is a better way that it (and its related functions)
could be implemented that would not have this kind of problem.


Oscar



#39177

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-02-19 14:06 +1100
Message-ID: <5122ec31$0$29966$c3e8da3$5496439d@news.astraweb.com>
In reply to: #39161

Oscar Benjamin wrote:

> On 19 February 2013 00:18, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:
>> Terry Reedy wrote:
>>> On 2/18/2013 6:47 AM, John Reid wrote:
> [snip]
>>>> Is this a problem with namedtuples, ipython or just a feature?
>>>
>>> With canSequence. If isinstance was available and the above were written
>>> before list and tuple could be subclassed, canSequence was sensible when
>>> written. But as Oscar said, it is now a mistake for canSequence to
>>> assume that all subclasses of list and tuple have the same
>>> initialization api.
>>
>> No, it is not a mistake. It is a problem with namedtuples that they
>> violate the expectation that they should have the same constructor
>> signature as other tuples. After all, namedtuples *are* tuples, they
>> should be constructed the same way. But they aren't, so that violates a
>> reasonable expectation.
> 
> It is a mistake. A namedtuple class instance provides all of the
> methods/operators provided by a tuple. This should be sufficient to
> fill the tuplishness contract.

"Should be", but *doesn't*. 

If your code expects a tuple, then it should work with a tuple. Namedtuples
are tuples, but they don't work where builtin tuples work, because their
__new__ method has a different signature.

I can understand arguing that this is "acceptable breakage" for various
reasons -- practicality beats purity. I can't understand arguing that the
principle is wrong.


> Requiring that obj satisfy a contract 
> is one thing. When you get to the point of requiring that type(obj)
> must do so as well you have gone beyond duck-typing and the normal
> bounds of polymorphism.

Constructor contracts are no less important than other contracts. I'm going
to give what I hope is an example that is *so obvious* that nobody will
disagree.

Consider the dict constructor dict.fromkeys:

py> mydict = {'a':1}
py> mydict.fromkeys(['ham', 'spam', 'eggs'])
{'eggs': None, 'ham': None, 'spam': None}


Now I subclass dict:

py> class MyDict(dict):
...     @classmethod
...     def fromkeys(cls, func):
...         # Expects a callback function that gets called with no arguments
...         # and returns two items, a list of keys and a default value.
...         return super(MyDict, cls).fromkeys(*func())
...

Why would I change the syntax like that? Because reasons. Good or bad,
what's done is done and there is my subclass. Here is an instance:

py> mydict = MyDict({'a': 1})
py> isinstance(mydict, dict)
True


Great! So I pass mydict to a function that expects a dict. This ought to
work, because mydict *is* a dict. It duck-types as a dict, isinstance
agrees it is a dict. What could possibly go wrong?

What goes wrong is that some day I pass it to a function that calls
mydict.fromkeys in the usual fashion, and it blows up.

py> mydict.fromkeys(['spam', 'ham', 'eggs'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in fromkeys
TypeError: 'list' object is not callable

How is this possible? Is mydict not a dict? It should be usable anywhere I
can use a dict. How is this possibly acceptable behaviour for something
which claims to be a dict?

This is a violation of the Liskov Substitution Principle, and a violation of
normal expectations that if mydict quacks like a dict, it should lay eggs
like a dict.

That namedtuple's constructor is __new__ rather than fromkeys is an
irrelevant distraction. The principle still applies. It is perfectly
reasonable to expect that if instance t is a tuple, then *any* method on t
should have the same signature, regardless of whether that method is
called "index", "__getitem__", or "__new__".

If this fundamental principle is violated, there should be a very good
reason, and not just because "constructor contracts aren't important".


> It's still unclear what the purpose of canSequence is, but I suspect
> that there is a better way that it (and its related functions)
> could be implemented that would not have this kind of problem.

Incorrect. The problem is with *namedtuples*, not canSequence, because
namedtuples promise to implement a strict superset of the behaviour of
builtin tuples, while in fact they actually *take behaviour away*. Tuples
promise to allow calls to the constructor like this:

any_tuple.__new__(type(any_tuple), iterable)

but that fails if any_tuple is a namedtuple.

I am not arguing for or against the idea that this is an *acceptable*
breakage, given other requirements. But from the point of view of interface
contracts, it is a breakage, and as the Original Poster discovered, it can
and will break code.



-- 
Steven



#39191

From: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Date: 2013-02-19 21:27 +1300
Message-ID: <aogrbfFak08U1@mid.individual.net>
In reply to: #39177

Steven D'Aprano wrote:
> py> class MyDict(dict):
> ...     @classmethod
> ...     def fromkeys(cls, func):
> ...         # Expects a callback function that gets called with no arguments
> ...         # and returns two items, a list of keys and a default value.
> ...         return super(MyDict, cls).fromkeys(*func())

Here you've overridden a method with one having a
different signature. That's not something you'd
normally do, because, being a method, it's likely
to get invoked polymorphically.

Constructors, on the other hand, are usually *not*
invoked polymorphically. Most of the time we know
exactly which constructor we're calling, because we
write the class name explicitly at the point of call.

Consequently, we have a different attitude when it
comes to constructors. We choose not to require LSP
for constructors, because it turns out to be very
useful not to be bound by that constraint.
Practicality beats purity here.

The reason IPython gets into trouble is that it tries
to make a polymorphic call to something that nobody
expects to need to be polymorphic.
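
As an illustration of what such a polymorphic constructor call looks like, here is a sketch assuming Python 3. It is not IPython's actual canSequence code; rebuild_naive and rebuild_safe are hypothetical helpers, and the fallback to the builtin base type is an assumption about what a caller would want.

```python
from collections import namedtuple

Point3D = namedtuple('Point3D', 'x y z')

def rebuild_naive(seq, func):
    """Apply func to each item and rebuild via type(seq)(iterable)."""
    return type(seq)([func(item) for item in seq])

def rebuild_safe(seq, func):
    """Same idea, but only trust constructors whose signature is known."""
    items = [func(item) for item in seq]
    if type(seq) in (list, tuple):
        return type(seq)(items)          # exact builtins
    if isinstance(seq, tuple) and hasattr(type(seq), '_make'):
        return type(seq)._make(items)    # namedtuple-generated classes
    # Unknown subclass: fall back to the builtin base type (an assumption).
    return tuple(items) if isinstance(seq, tuple) else list(items)

try:
    rebuild_naive(Point3D(1, 2, 3), str)
    naive_failed = False
except TypeError:
    naive_failed = True

rebuilt = rebuild_safe(Point3D(1, 2, 3), str)
print(naive_failed, rebuilt)
```

The naive version works for list and tuple themselves and breaks as soon as a subclass changes its constructor; the second version avoids calling a constructor it knows nothing about.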

-- 
Greg



#39189

From: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Date: 2013-02-19 20:54 +1300
Message-ID: <aogpeaFa6h0U1@mid.individual.net>
In reply to: #39146

Steven D'Aprano wrote:

> Terry Reedy wrote:

>>In fact, one reason to subclass a class is to change the initialization
>>api.

> That might be a reason that people give, but it's a bad reason from the
> perspective of interface contracts, duck-typing and the LSP.

Only if you're going to pass the class off to something as
a factory function.

Note that having a different constructor signature is *not*
an LSP violation for *instances* of a class. The constructor
is not part of the interface for instances, only for the
class itself.

In practice, it's very common for a class to have a different
constructor signature from its base class, and this rarely
causes any problem.

IPython is simply making a dodgy assumption. It gets away with
it only because it's very rare to encounter subclasses of
list or tuple at all.

-- 
Greg



#39195

From: John Reid <j.reid@mail.cryst.bbk.ac.uk>
Date: 2013-02-19 09:30 +0000
Message-ID: <mailman.2010.1361266249.2939.python-list@python.org>
In reply to: #39146

On 19/02/13 00:18, Steven D'Aprano wrote:
> Terry Reedy wrote:
> 
>> On 2/18/2013 6:47 AM, John Reid wrote:
>>
>>> I was hoping namedtuples could be used as replacements for tuples
>>> in all instances.
>>
>> This is a mistake in the following two senses. First, tuple is a class
>> with instances while namedtuple is a class factory that produces
>> classes. (One could think of namedtuple as a metaclass, but it was not
>> implemented that way.) 
> 
> 
> I think you have misunderstood. I don't believe that John wants to use the
> namedtuple factory instead of tuple. He wants to use a namedtuple type
> instead of tuple.
> 
> That is, given:
> 
> Point3D = namedtuple('Point3D', 'x y z')
> 
> he wants to use a Point3D instead of a tuple. Since:
> 
> issubclass(Point3D, tuple) 
> 
> holds true, the Liskov Substitution Principle (LSP) tells us that anything
> that is true for a tuple should also be true for a Point3D. That is, given
> that instance x might be either a builtin tuple or a Point3D, all of the
> following hold:
> 
> - isinstance(x, tuple) returns True
> - len(x) returns the length of x
> - hash(x) returns the hash of x
> - x[i] returns item i of x, or raises IndexError
> - del x[i] raises TypeError
> - x + a_tuple returns a new tuple
> - x.count(y) returns the number of items equal to y
> 
> etc. Basically, any code expecting a tuple should continue to work if you
> pass it a Point3D instead (or any other namedtuple).
> 
> There is one conspicuous exception to this: the constructor:
> 
> type(x)(args)
> 
> behaves differently depending on whether x is a builtin tuple, or a Point3D.
> 

Exactly and thank you Steven for explaining it much more clearly.



#39305

From: Terry Reedy <tjreedy@udel.edu>
Date: 2013-02-19 22:38 -0500
Message-ID: <mailman.2080.1361331548.2939.python-list@python.org>
In reply to: #39146

On 2/18/2013 7:18 PM, Steven D'Aprano wrote:
> Terry Reedy wrote:
>
>> On 2/18/2013 6:47 AM, John Reid wrote:
>>
>>> I was hoping namedtuples could be used as replacements for tuples
>>>  in all instances.
>>
>> This is a mistake in the following two senses. First, tuple is a class
>> with instances while namedtuple is a class factory that produces
>> classes. (One could think of namedtuple as a metaclass, but it was not
>> implemented that way.)

> I think you have misunderstood.

Wrong, which should be evident to anyone who reads the entire paragraph 
as the complete thought exposition it was meant to be. Besides which, 
this negative ad hominem comment is irrelevant to the rest of your post 
about the Liskov Substitution Principle.

The rest of the paragraph, in two more pieces:

>> Second, a tuple instance can have any length and
>> different instances can have different lengths. On the other hand, all
>> instances of a particular namedtuple class have a fixed length.

In other words, neither the namedtuple object nor any namedtuple class 
object can fully substitute for the tuple class object. Nor can 
instances of any namedtuple class fully substitute for instances of the 
tuple class. Therefore, I claim, the hope that "namedtuples could be 
used as replacements for tuples in all instances" is a futile hope, 
however one interprets that hope.

 >> This affects their initialization.

Part of the effect is independent of initialization. Even if namedtuples 
were initialized by iterator, there would still be glitches. In 
particular, even if John's named tuple class B *could* be initialized as 
B((1,2,3)), it still could not be substituted for t in the code below.

 >>> t = (1,2,3)
 >>> type(t) is type(t[1:])
True
 >>> type(t)(t[1:])
(2, 3)

As far as read access goes, B effectively is a tuple. As soon as one 
uses type() directly or indirectly (by creating new objects), there may 
be problems. That is because the addition of field names *also* adds a 
length constraint, which is a subtraction of flexibility.
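
The same experiment with a namedtuple class can be sketched as follows, assuming Python 3. B here stands in for John's class and its field names are invented for the example.

```python
from collections import namedtuple

B = namedtuple('B', 'x y z')   # stand-in for John's class; fields are assumptions

t = (1, 2, 3)
b = B(1, 2, 3)

tuple_slice_keeps_type = type(t) is type(t[1:])   # slicing a tuple gives a tuple
named_slice_keeps_type = type(b) is type(b[1:])   # slicing a B gives a plain tuple
slice_type = type(b[1:])

try:
    B._make(b[1:])             # even the iterable-based constructor needs 3 items
    make_failed = False
except TypeError:
    make_failed = True

print(tuple_slice_keeps_type, named_slice_keeps_type, slice_type, make_failed)
```

So even with an iterable-accepting constructor, a two-item slice cannot be turned back into a B, because the field names fix the length.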

---
Liskov Substitution Principle (LSP): I met this over 15 years ago 
reading debates among OOP enthusiasts about whether Rectangle should be 
a subclass of Square or Square a subclass of Rectangle, and similarly, 
whether Ostrich can be a legitimate subclass of Bird.

The problem I see with the LSP for modeling either abstract or concrete 
entities is that we in fact do define subclasses by subtraction or 
limitation, as well as by augmentation, while the LSP only allows the 
latter.

One answer to the conundrums above is to add Parallelepiped as a 
superclass for both Square and Rectangle and Flying_bird as an 
additional subclass of Bird. But then the question becomes: Does obeying 
the LSP count as 'necessity' when one is trying to follow Ockham's 
principle of not multiplying classes without necessity?

-- 
Terry Jan Reedy



#39537

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-02-22 10:38 +0000
Message-ID: <51274a8b$0$29988$c3e8da3$5496439d@news.astraweb.com>
In reply to: #39305

On Tue, 19 Feb 2013 22:38:32 -0500, Terry Reedy wrote:

> On 2/18/2013 7:18 PM, Steven D'Aprano wrote:
>> Terry Reedy wrote:
>>
>>> On 2/18/2013 6:47 AM, John Reid wrote:
>>>
>>>> I was hoping namedtuples could be used as replacements for tuples
>>>>  in all instances.
>>>
>>> This is a mistake in the following two senses. First, tuple is a class
>>> with instances while namedtuple is a class factory that produces
>>> classes. (One could think of namedtuple as a metaclass, but it was not
>>> implemented that way.)
>
>> I think you have misunderstood.
>
> Wrong, which should be evident to anyone who reads the entire paragraph
> as the complete thought exposition it was meant to be. Besides which,
> this negative ad hominem comment is irrelevant to the rest of your post
> about the Liskov Substitution Principle.

Terry, I'm sorry that I have stood on your toes here, no offense was 
intended. It seemed to me, based on the entire paragraph that you wrote, 
that you may have misunderstood the OP's question. The difference in 
signatures between the namedtuple class factory and tuple is irrelevant, 
as I can now see you understand, but by raising it in the first place 
you gave me the impression that you may have misunderstood what the OP 
was attempting to do.


> The rest of the paragraph, in two more pieces:
>
>>> Second, a tuple instance can have any length and different instances
>>> can have different lengths. On the other hand, all instances of a
>>> particular namedtuple class have a fixed length.
>
> In other words, neither the namedtuple object nor any namedtuple class
> object can fully substitute for the tuple class object. Nor can
> instances of any namedtuple class fully substitute for instances of the
> tuple class. Therefore, I claim, the hope that "namedtuples could be
> used as replacements for tuples in all instances" is a futile hope,
> however one interprets that hope.

I did discuss the fixed length issue directly, and agreed with you that 
if your contract is to construct variable-length tuples, then a 
fixed-length namedtuple is not substitutable.

But in practice, one common use-case for tuples (whether named or not) 
is for fixed-length records, and in that use-case, a namedtuple of 
length N should be substitutable for a tuple of length N.


>  >> This affects their initialization.
>
> Part of the effect is independent of initialization. Even if namedtuples
> were initialized by iterator, there would still be glitches. In
> particular, even if John's named tuple class B *could* be initialized as
> B((1,2,3)), it still could not be substituted for t in the code below.
>
>  >>> t = (1,2,3)
>  >>> type(t) is type(t[1:])
> True

Agreed. There are other differences as well, e.g. repr(t) will differ 
between builtin tuples and namedtuples. The only type which is identical 
in every conceivable aspect to tuple is tuple itself. Any subclass or 
subtype[1] must by definition differ in at least one aspect from tuple:

type(some_tuple) is type(())

and in practice will differ in other aspects as well.
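
A quick sketch of where the two agree and where they differ, assuming Python 3; the variable names are illustrative only.

```python
from collections import namedtuple

Point3D = namedtuple('Point3D', 'x y z')

t = (1, 2, 3)
p = Point3D(1, 2, 3)

same_value = (t == p)                  # equality is inherited from tuple
same_hash = (hash(t) == hash(p))       # so is hashing
same_type = (type(p) is type(()))      # but the type identity differs
reprs = (repr(t), repr(p))             # and so does the repr

print(same_value, same_hash, same_type, reprs)
```
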


  Footnote: [1] Subclass meaning it inherits from tuple; subtype in the
  sense that it duck-types as a tuple, but may or may not share any
  implementation.


LSP cannot be interpreted in isolation. Any non-trivial modification of 
a class will change *something* about the class; after all, that's why we 
subclassed it in the first place. Either the interface will be 
different, or the semantics will be different, or both. LSP must always 
be interpreted in the intersection between the promises made by the 
class and the promises your application cares about.

Some promises are more important than others, hence some violations are 
more serious than others. For instance, I think that tuple indexing is a 
critical promise: a "tuple" that cannot be indexed is surely not a tuple. 
The exact form of the repr() of a tuple is generally not important at 
all: a tuple that prints as MyBunchOStuff(...) is still a tuple. In my 
experience, the constructor signature is of moderate importance. But of 
course that depends on what promises you rely on: if you are relying on 
the tuple constructor, then it is critical *to you*.


> The problem I see with the LSP for modeling either abstract or concrete
> entities is that we in fact do define subclasses by subtraction or
> limitation, as well as by augmentation, while the LSP only allows the
> latter.

People do all sorts of things. They write code that is O(N**2) or worse, 
they call eval() on untrusted data, they use isinstance() and break 
duck-typing, etc. That they break LSP does not necessarily mean that 
they should. LSP is one of the five fundamental best-practices for 
object-oriented code, "SOLID":

http://en.wikipedia.org/wiki/SOLID_%28object-oriented_design%29

Breaking any of the SOLID principles is a code-smell. That does not mean 
that there is never a good reason to do so, but SOLID is a set of 
principles which have stood the test of time and practice. Any code that 
breaks one of those principles should be considered smelly, or 
worse, until justified.

(And for the avoidance of doubt, I am more than satisfied with the 
justification given for the difference in signature between tuples and 
namedtuples.)



-- 
Steven



#39318

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-02-20 17:50 +1100
Message-ID: <mailman.2087.1361343028.2939.python-list@python.org>
In reply to: #39146

On Wed, Feb 20, 2013 at 2:38 PM, Terry Reedy <tjreedy@udel.edu> wrote:
> Liskov Substitution Principle (LSP): I met this over 15 years ago reading
> debates among OOP enthusiasts about whether Rectangle should be a subclass
> of Square or Square a subclass of Rectangle, and similarly, whether Ostrich
> can be a legitimate subclass of Bird.
>
> The problem I see with the LSP for modeling either abstract or concrete
> entities is that we in fact do define subclasses by subtraction or
> limitation, as well as by augmentation, while the LSP only allows the
> latter.

A plausible compromise is to demand LSP in terms of programming, but
not necessarily functionality. So an Ostrich would have a fly() method
that returns some kind of failure, in the same way that any instance
of any flying-bird could have injury or exhaustion that prevents it
from flying. It still makes sense to attempt to fly - an ostrich IS a
bird - but it just won't succeed.
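
One possible reading of that compromise, as a minimal sketch: the convention that fly() returns True or False is an assumption made for this example, as are the class and function names.

```python
class Bird(object):
    def fly(self):
        """Attempt to fly. Return True on success, False on failure."""
        return True


class InjuredSparrow(Bird):
    def fly(self):
        # A bird that could normally fly but cannot right now.
        return False


class Ostrich(Bird):
    def fly(self):
        # The attempt is allowed; it simply never succeeds.
        return False


def launch_all(birds):
    """Ask every bird to fly and report which attempts succeeded."""
    return [bird.fly() for bird in birds]


results = launch_all([Bird(), InjuredSparrow(), Ostrich()])
assert results == [True, False, False]
```

The caller handles an Ostrich exactly as it would handle any bird whose flight attempt failed, so no isinstance() check is needed.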

ChrisA



#39359

From: Roy Smith <roy@panix.com>
Date: 2013-02-20 09:09 -0500
Message-ID: <roy-2C1FC9.09090820022013@news.panix.com>
In reply to: #39318
In article <mailman.2087.1361343028.2939.python-list@python.org>,
 Chris Angelico <rosuav@gmail.com> wrote:

> On Wed, Feb 20, 2013 at 2:38 PM, Terry Reedy <tjreedy@udel.edu> wrote:
> > Liskov Substitution Principle (LSP): I met this over 15 years ago reading
> > debates among OOP enthusiasts about whether Rectangle should be a subclass
> > of Square or Square a subclass of Rectangle, and similarly, whether Ostrich
> > can be a legitimate subclass of Bird.
> >
> > The problem I see with the LSP for modeling either abstract or concrete
> > entities is that we in fact do define subclasses by subtraction or
> > limitation, as well as by augmentation, while the LSP only allows the
> > latter.
> 
> A plausible compromise is to demand LSP in terms of programming, but
> not necessarily functionality. So an Ostrich would have a fly() method
> that returns some kind of failure, in the same way that any instance
> of any flying-bird could have injury or exhaustion that prevents it
> from flying. It still makes sense to attempt to fly - an ostrich IS a
> bird - but it just won't succeed.
> 
> ChrisA

I would think Ostrich.fly() should raise NotImplementedError.  Whether 
this violates LSP or not, depends on how you define Bird.

class Bird:
    def fly(self):
        """Commence aviation.

           Note: not all birds can fly.  Calling fly() on
           a flightless bird will raise NotImplementedError.

        """

class Ostrich(Bird):
    def fly(self):
        raise NotImplementedError("ostriches can't fly")

class Sheep(Bird):
    def fly(self):
        self.plummet()



#39151

From: Roy Smith <roy@panix.com>
Date: 2013-02-18 20:11 -0500
Message-ID: <roy-C195D2.20113818022013@news.panix.com>
In reply to: #39130
Terry Reedy <tjreedy@udel.edu> wrote:

> Initializaing a namedtuple from an iterable is unusual, and 
> hence gets the longer syntax. I

A quick look through our codebase agrees with that.  I found 27 
namedtuple classes.  21 were initialized with MyTuple(x, y, z) syntax.  
Three used MyTuple(*data).

Most interesting were the three that used MyTuple(**data).  In all three 
cases, data was a dictionary returned by re.match.groupdict().  The 
definition of the namedtuple was even built by introspecting the regex 
to find all the named groups!
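
For readers who have not seen that pattern, here is a minimal sketch of how it might look. The regex, the LogLine name, and the sample input are all made up for this example; only groupindex, groupdict(), and namedtuple are real library features.

```python
import re
from collections import namedtuple

# Hypothetical pattern with three named groups.
pattern = re.compile(r'(?P<host>\S+) (?P<status>\d{3}) (?P<size>\d+)')

# groupindex maps each group name to its group number; sorting by
# that number keeps the fields in the order they appear in the regex.
fields = sorted(pattern.groupindex, key=pattern.groupindex.get)
LogLine = namedtuple('LogLine', fields)

match = pattern.match('example.org 200 5120')
record = LogLine(**match.groupdict())

assert fields == ['host', 'status', 'size']
assert record == ('example.org', '200', '5120')
assert record.status == '200'
```

Because the field names come from the regex itself, adding a named group to the pattern adds a field to the namedtuple without any other change.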



#39163

From: alex23 <wuwei23@gmail.com>
Date: 2013-02-18 17:47 -0800
Message-ID: <0c9a97b8-60f7-4050-aa65-1907bec75d8d@ou9g2000pbb.googlegroups.com>
In reply to: #39082
On Feb 18, 9:47 pm, John Reid <johnbaronr...@gmail.com> wrote:
> See http://article.gmane.org/gmane.comp.python.ipython.user/10270 for more info.

One quick workaround would be to use a tuple where required and then
coerce it back to Result when needed as such:

def sleep(secs):
    import os, time, parallel_helper
    start = time.time()
    time.sleep(secs)
    return tuple(parallel_helper.Result(os.getpid(), time.time() - start))

rc = parallel.Client()
v = rc.load_balanced_view()
async_result = v.map_async(sleep, range(3, 0, -1), ordered=False)
for ar in async_result:
    print parallel_helper.Result(*ar)

You can of course skip the creation of Result in sleep and only turn
it into one in the display loop, but it all depends on additional
requirements (and adds some clarity to what is happening, I think).
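
The round trip itself can be tried without IPython. In this minimal sketch, Result is a stand-in namedtuple with assumed fields pid and duration, and the values are made up.

```python
from collections import namedtuple

# Stand-in for parallel_helper.Result; the field names are an assumption.
Result = namedtuple('Result', 'pid duration')

original = Result(1234, 0.5)

# Coerce to a plain tuple before handing it to code that expects one.
plain = tuple(original)
assert type(plain) is tuple

# Coerce back when the field names are wanted again.
restored = Result(*plain)
also_restored = Result._make(plain)

assert restored == original
assert also_restored == original
assert restored.duration == 0.5
```

Result(*plain) and Result._make(plain) give the same result here; _make() avoids unpacking the values into separate arguments.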



#39196

From: John Reid <j.reid@mail.cryst.bbk.ac.uk>
Date: 2013-02-19 09:36 +0000
Message-ID: <mailman.2011.1361266628.2939.python-list@python.org>
In reply to: #39163

On 19/02/13 01:47, alex23 wrote:
> On Feb 18, 9:47 pm, John Reid <johnbaronr...@gmail.com> wrote:
>> See http://article.gmane.org/gmane.comp.python.ipython.user/10270 for more info.
> 
> One quick workaround would be to use a tuple where required and then
> coerce it back to Result when needed as such:
> 
> def sleep(secs):
>     import os, time, parallel_helper
>     start = time.time()
>     time.sleep(secs)
>     return tuple(parallel_helper.Result(os.getpid(), time.time() -
> start))
> 
> rc = parallel.Client()
> v = rc.load_balanced_view()
> async_result = v.map_async(sleep, range(3, 0, -1), ordered=False)
> for ar in async_result:
>     print parallel_helper.Result(*ar)
> 
> You can of course skip the creation of Result in sleep and only turn
> it into one in the display loop, but it all depends on additional
> requirements (and adds some clarity to what is happening, I think).
> 

Thanks. All I really need is a quick workaround, but it is always nice to
discuss these things. Also, this class decorator seems to do the job for
ipython, although it does change the construction syntax a little and is
probably overkill. No doubt the readers of this list can improve it
somewhat as well.


import logging
_logger = logging.getLogger(__name__)
from collections import namedtuple

def make_ipython_friendly(namedtuple_class):
    """A class decorator to make namedtuples more ipython friendly.
    """
    _logger.debug('Making %s ipython friendly.', namedtuple_class.__name__)

    # Preserve original new to use if needed with keyword arguments
    original_new = namedtuple_class.__new__

    def __new__(cls, *args, **kwds):
        _logger.debug('In decorator __new__, cls=%s', cls)
        if args:
            if kwds:
                raise TypeError('Cannot construct %s from both positional '
                                'and keyword arguments.'
                                % namedtuple_class.__name__)
            _logger.debug('Assuming construction from an iterable.')
            return namedtuple_class._make(*args)
        else:
            _logger.debug('Assuming construction from keyword arguments.')
            return original_new(namedtuple_class, **kwds)

    # Set the class's __new__ to the new one
    namedtuple_class.__new__ = staticmethod(__new__)
    del namedtuple_class.__getnewargs__ # get rid of getnewargs

    return namedtuple_class

Result = make_ipython_friendly(namedtuple('Result', 'pid duration'))
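
To show what "change the construction syntax a little" means in practice, here is a condensed restatement of the decorator with the logging removed, followed by a few checks. It is only a sketch that exercises plain namedtuples; it makes no claim about how IPython itself behaves.

```python
from collections import namedtuple


def make_ipython_friendly(namedtuple_class):
    """Condensed version of the decorator above, without logging."""
    original_new = namedtuple_class.__new__

    def __new__(cls, *args, **kwds):
        if args:
            if kwds:
                raise TypeError('Cannot mix positional and keyword arguments.')
            # A single positional argument is treated as an iterable.
            return namedtuple_class._make(*args)
        return original_new(namedtuple_class, **kwds)

    namedtuple_class.__new__ = staticmethod(__new__)
    del namedtuple_class.__getnewargs__
    return namedtuple_class


Result = make_ipython_friendly(namedtuple('Result', 'pid duration'))

# Construction from one iterable now works, as it does for tuple.
from_iterable = Result([1234, 0.5])
# Keyword construction still goes through the original __new__.
from_keywords = Result(pid=1234, duration=0.5)
assert from_iterable == from_keywords == (1234, 0.5)

# The usual one-argument-per-field form is no longer accepted.
try:
    Result(1234, 0.5)
except TypeError:
    positional_fields_rejected = True
else:
    positional_fields_rejected = False
assert positional_fields_rejected
```

The trade-off is visible in the last check: the class gains tuple-style construction from an iterable and loses the familiar Result(pid, duration) form.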
