


Chanelling Guido - dict subclasses

Started by: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
First post: 2014-01-15 01:27 +0000
Last post: 2014-01-16 17:17 +1300
Articles: 12, from 11 participants



Contents

  Chanelling Guido - dict subclasses Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-01-15 01:27 +0000
    Re: Chanelling Guido - dict subclasses Ned Batchelder <ned@nedbatchelder.com> - 2014-01-14 21:04 -0500
    Re: Chanelling Guido - dict subclasses Terry Reedy <tjreedy@udel.edu> - 2014-01-14 22:48 -0500
    Re: Chanelling Guido - dict subclasses F <f@hop2it.be> - 2014-01-15 07:00 +0000
    Re: Chanelling Guido - dict subclasses Peter Otten <__peter__@web.de> - 2014-01-15 09:40 +0100
      Re: Chanelling Guido - dict subclasses John Ladasky <john_ladasky@sbcglobal.net> - 2014-01-15 08:51 -0800
        Re: Chanelling Guido - dict subclasses Peter Otten <__peter__@web.de> - 2014-01-15 19:35 +0100
        Re: Chanelling Guido - dict subclasses Devin Jeanpierre <jeanpierreda@gmail.com> - 2014-01-15 22:30 -0800
    Re: Chanelling Guido - dict subclasses Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-01-15 09:10 +0000
    Re: Chanelling Guido - dict subclasses Tim Chase <python.list@tim.thechases.com> - 2014-01-15 05:03 -0600
    Re: Chanelling Guido - dict subclasses Daniel da Silva <var.mail.daniel@gmail.com> - 2014-01-15 19:50 -0500
      Re: Chanelling Guido - dict subclasses Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-01-16 17:17 +1300

#63955 — Chanelling Guido - dict subclasses

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-01-15 01:27 +0000
Subject: Chanelling Guido - dict subclasses
Message-ID: <52d5e408$0$29970$c3e8da3$5496439d@news.astraweb.com>

Over on the Python-Dev mailing list, there is an ENORMOUS multi-thread 
discussion involving at least two PEPs, about bytes/str compatibility. 
But I don't want to talk about that. (Oh gods, I *really* don't want to 
talk about that...)

In the midst of that discussion, Guido van Rossum made a comment about 
subclassing dicts:

    [quote]
    From: Guido van Rossum <guido@python.org>
    Date: Tue, 14 Jan 2014 12:06:32 -0800
    Subject: Re: [Python-Dev] PEP 460 reboot

    Personally I wouldn't add any words suggesting or referring 
    to the option of creation another class for this purpose. You 
    wouldn't recommend subclassing dict for constraining the 
    types of keys or values, would you?
    [end quote]

https://mail.python.org/pipermail/python-dev/2014-January/131537.html

This surprises me, and rather than bother Python-Dev (where it will 
likely be lost in the noise, and certainly will be off-topic), I'm hoping 
there may be someone here who is willing to attempt to channel GvR. I 
would have thought that subclassing dict for the purpose of constraining 
the type of keys or values would be precisely an excellent use of 
subclassing.


class TextOnlyDict(dict):
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError
        super().__setitem__(key, value)
    # need to override more methods too


But reading Guido, I think he's saying that wouldn't be a good idea. I 
don't get it -- it's not a violation of the Liskov Substitution 
Principle, because it's more restrictive, not less. What am I missing?


-- 
Steven



#63956

From: Ned Batchelder <ned@nedbatchelder.com>
Date: 2014-01-14 21:04 -0500
Message-ID: <mailman.5489.1389751476.18130.python-list@python.org>
In reply to: #63955

On 1/14/14 8:27 PM, Steven D'Aprano wrote:
> Over on the Python-Dev mailing list, there is an ENORMOUS multi-thread
> discussion involving at least two PEPs, about bytes/str compatibility.
> But I don't want to talk about that. (Oh gods, I *really* don't want to
> talk about that...)
>
> In the midst of that discussion, Guido van Rossum made a comment about
> subclassing dicts:
>
>      [quote]
>      From: Guido van Rossum <guido@python.org>
>      Date: Tue, 14 Jan 2014 12:06:32 -0800
>      Subject: Re: [Python-Dev] PEP 460 reboot
>
>      Personally I wouldn't add any words suggesting or referring
>      to the option of creation another class for this purpose. You
>      wouldn't recommend subclassing dict for constraining the
>      types of keys or values, would you?
>      [end quote]
>
> https://mail.python.org/pipermail/python-dev/2014-January/131537.html
>
> This surprises me, and rather than bother Python-Dev (where it will
> likely be lost in the noise, and certainly will be off-topic), I'm hoping
> there may be someone here who is willing to attempt to channel GvR. I
> would have thought that subclassing dict for the purpose of constraining
> the type of keys or values would be precisely an excellent use of
> subclassing.
>
>
> class TextOnlyDict(dict):
>      def __setitem__(self, key, value):
>          if not isinstance(key, str):
>              raise TypeError
>          super().__setitem__(key, value)
>      # need to override more methods too
>
>
> But reading Guido, I think he's saying that wouldn't be a good idea. I
> don't get it -- it's not a violation of the Liskov Substitution
> Principle, because it's more restrictive, not less. What am I missing?
>
>

One problem with it is that there are lots of ways of setting values in 
the dict, and they don't use your __setitem__:

 >>> tod = TextOnlyDict()
 >>> tod.update({1: "haha"})
 >>>

This is what you're getting at with your "need to override more methods 
too", but it turns out to be a pain to override enough methods.
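To give a sense of how much overriding is involved, here is a minimal sketch of a more complete TextOnlyDict. The helper name _check and the choice to route construction through update() are assumptions made for illustration, not something proposed in the thread. It covers __init__, __setitem__, update() and setdefault() only; other write paths (for example the |= operator in newer Python versions) are not handled.

```python
class TextOnlyDict(dict):
    """dict subclass that rejects non-str keys on the write paths below."""

    @staticmethod
    def _check(key):
        # Hypothetical helper: the single place where the key rule lives.
        if not isinstance(key, str):
            raise TypeError("keys must be str, not %s" % type(key).__name__)

    def __init__(self, *args, **kwargs):
        super().__init__()
        # Route construction through update() so initial keys are checked.
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        self._check(key)
        super().__setitem__(key, value)

    def update(self, *args, **kwargs):
        # dict(...) accepts the same arguments as dict.update(), so use it
        # to normalise mappings, iterables of pairs and keyword arguments.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def setdefault(self, key, default=None):
        self._check(key)
        return super().setdefault(key, default)


tod = TextOnlyDict({"a": 1}, b=2)
try:
    tod.update({1: "haha"})
except TypeError as exc:
    print("rejected:", exc)
```

Even this version is easy to get subtly wrong, which is the "pain" described above.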

I don't know if that is what Guido was getting at; I suspect he was 
talking at a more refined "principles of object design" level rather 
than a "dicts don't happen to work that way" level.

Also, I've never done it, but I understand that deriving from 
collections.abc.MutableMapping avoids this problem.
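As a rough sketch of that suggestion (the class name TextKeyMapping and the _data attribute are made up for this example): the abstract base class in collections.abc builds update(), setdefault() and the other mixin methods on top of the five methods defined here, so a check in __setitem__ is applied on those paths too.

```python
from collections.abc import MutableMapping


class TextKeyMapping(MutableMapping):
    """Mapping with str-only keys, backed by a private plain dict."""

    def __init__(self, *args, **kwargs):
        self._data = {}
        # MutableMapping.update() stores each item via __setitem__.
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("keys must be str")
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


m = TextKeyMapping(a=1)
try:
    m.update({1: "haha"})
except TypeError as exc:
    print("rejected:", exc)
```

The trade-off is that instances are not dict instances, so isinstance(m, dict) is False and code that insists on a real dict will not accept them.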

-- 
Ned Batchelder, http://nedbatchelder.com



#63961

From: Terry Reedy <tjreedy@udel.edu>
Date: 2014-01-14 22:48 -0500
Message-ID: <mailman.5492.1389757706.18130.python-list@python.org>
In reply to: #63955

On 1/14/2014 8:27 PM, Steven D'Aprano wrote:

> In the midst of that discussion, Guido van Rossum made a comment about
> subclassing dicts:
>
>      [quote]
>      From: Guido van Rossum <guido@python.org>
>      Date: Tue, 14 Jan 2014 12:06:32 -0800
>      Subject: Re: [Python-Dev] PEP 460 reboot
>
>      Personally I wouldn't add any words suggesting or referring
>      to the option of creation another class for this purpose. You
>      wouldn't recommend subclassing dict for constraining the
>      types of keys or values, would you?
>      [end quote]
>
> https://mail.python.org/pipermail/python-dev/2014-January/131537.html
>
> This surprises me,

I was slightly surprised too. I understand not wanting to add a subclass 
to stdlib, but I believe this was about adding words to the doc. Perhaps 
he did not want to over-emphasize one particular possible subclass by 
putting the words in the doc.

-- 
Terry Jan Reedy



#63964

From: F <f@hop2it.be>
Date: 2014-01-15 07:00 +0000
Message-ID: <lb5bn0$2iq$1@dont-email.me>
In reply to: #63955

I can't speak for Guido, but I think it is messy and unnatural and will
lead to user frustration. As a user, I would expect a dict to take any
hashable as key and any object as value. In your case I would probably
just provide a __getitem__ method in a normal class.

That said, I have overridden dict before, but my child class only added
to dict; I didn't change its underlying behaviour, so you can use my
class(es) as a vanilla dict everywhere, which enforcing types would have
destroyed.






#63965

From: Peter Otten <__peter__@web.de>
Date: 2014-01-15 09:40 +0100
Message-ID: <mailman.5494.1389775244.18130.python-list@python.org>
In reply to: #63955

Steven D'Aprano wrote:

> In the midst of that discussion, Guido van Rossum made a comment about
> subclassing dicts:
> 
>     [quote]

>     Personally I wouldn't add any words suggesting or referring
>     to the option of creation another class for this purpose. You
>     wouldn't recommend subclassing dict for constraining the
>     types of keys or values, would you?
>     [end quote]

> This surprises me, and rather than bother Python-Dev (where it will
> likely be lost in the noise, and certainly will be off-topic), I'm hoping
> there may be someone here who is willing to attempt to channel GvR. I
> would have thought that subclassing dict for the purpose of constraining
> the type of keys or values would be precisely an excellent use of
> subclassing.
> 
> 
> class TextOnlyDict(dict):
>     def __setitem__(self, key, value):
>         if not isinstance(key, str):
>             raise TypeError

Personally I feel dirty whenever I write Python code that defeats duck-
typing -- so I would not /recommend/ any isinstance() check.
I realize that this is not an argument...

PS: I tried to read GvR's remark in context, but failed. It's about time to 
revolt and temporarily install the FLUFL as our leader, long enough to 
revoke Guido's top-posting license, but not long enough to reintroduce the 
<> operator...



#64000

From: John Ladasky <john_ladasky@sbcglobal.net>
Date: 2014-01-15 08:51 -0800
Message-ID: <05ff1332-1776-4ac0-88b4-84f8fd323ce3@googlegroups.com>
In reply to: #63965

On Wednesday, January 15, 2014 12:40:33 AM UTC-8, Peter Otten wrote:
> Personally I feel dirty whenever I write Python code that defeats duck-
> typing -- so I would not /recommend/ any isinstance() check.

While I am inclined to agree, I have yet to see a solution to the problem of flattening nested lists/tuples which avoids isinstance().  If anyone has written one, I would like to see it, and consider its merits.



#64012

From: Peter Otten <__peter__@web.de>
Date: 2014-01-15 19:35 +0100
Message-ID: <mailman.5540.1389810978.18130.python-list@python.org>
In reply to: #64000

John Ladasky wrote:

> On Wednesday, January 15, 2014 12:40:33 AM UTC-8, Peter Otten wrote:
>> Personally I feel dirty whenever I write Python code that defeats duck-
>> typing -- so I would not /recommend/ any isinstance() check.
> 
> While I am inclined to agree, I have yet to see a solution to the problem
> of flattening nested lists/tuples which avoids isinstance().  If anyone
> has written one, I would like to see it, and consider its merits.

Well, you should always be able to find some property that discriminates 
what you want to treat as sequences from what you want to treat as atoms.

(flatten() is adapted from a nine-year-old post by Nick Craig-Wood
<https://mail.python.org/pipermail/python-list/2004-December/288112.html>)

>>> def flatten(items, check):
...     if check(items):
...         for item in items:
...             yield from flatten(item, check)
...     else:
...         yield items
... 
>>> items = [1, 2, (3, 4), [5, [6, (7,)]]]
>>> print(list(flatten(items, check=lambda o: hasattr(o, "sort"))))
[1, 2, (3, 4), 5, 6, (7,)]
>>> print(list(flatten(items, check=lambda o: hasattr(o, "count"))))
[1, 2, 3, 4, 5, 6, 7]

The approach can of course break

>>> items = ["foo", 1, 2, (3, 4), [5, [6, (7,)]]]
>>> print(list(flatten(items, check=lambda o: hasattr(o, "count"))))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 2, in flatten
RuntimeError: maximum recursion depth exceeded

and I'm the first to admit that the fix below looks really odd:

>>> print(list(flatten(items, check=lambda o: hasattr(o, "count")
...                    and not hasattr(o, "split"))))
['foo', 1, 2, 3, 4, 5, 6, 7]

In fact all of the following examples look more natural...

>>> print(list(flatten(items, check=lambda o: isinstance(o, list))))
['foo', 1, 2, (3, 4), 5, 6, (7,)]
>>> print(list(flatten(items, check=lambda o: isinstance(o, (list, tuple)))))
['foo', 1, 2, 3, 4, 5, 6, 7]
>>> print(list(flatten(items, check=lambda o: isinstance(o, (list, tuple))
...                    or (isinstance(o, str) and len(o) > 1))))
['f', 'o', 'o', 1, 2, 3, 4, 5, 6, 7]

... than the duck-typed variants because it doesn't matter for the problem 
of flattening whether an object can be sorted or not. But in a real-world 
application the "atoms" are more likely to have something in common that is 
required for the problem at hand, and the check for it with 

def check(obj):
    return not is_atom(obj)  # is_atom() stands in for an application-specific test

may look more plausible.
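To make that last point concrete, here is a small sketch with an invented domain: the Product type, its price attribute and the basket data are all assumptions for illustration. The atoms are namedtuples, so a check such as hasattr(o, "count") would wrongly split them, while asking for the property the application actually needs reads naturally.

```python
from collections import namedtuple

# Hypothetical atom type: a tuple subclass, so it is itself iterable.
Product = namedtuple("Product", "name price")


def flatten(items, check):
    """Yield the atoms of a nested structure; check() says what to descend into."""
    if check(items):
        for item in items:
            yield from flatten(item, check)
    else:
        yield items


def is_atom(obj):
    # An atom is anything the application can price.
    return hasattr(obj, "price")


def check(obj):
    return not is_atom(obj)


basket = [Product("tea", 3), [Product("milk", 1), (Product("jam", 4),)]]
total = sum(product.price for product in flatten(basket, check))
print(total)
```

This only works when every leaf really is priced; a stray int in the basket would be treated as a container and fail on iteration.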



#64060

From: Devin Jeanpierre <jeanpierreda@gmail.com>
Date: 2014-01-15 22:30 -0800
Message-ID: <mailman.5574.1389853901.18130.python-list@python.org>
In reply to: #64000

On Wed, Jan 15, 2014 at 8:51 AM, John Ladasky
<john_ladasky@sbcglobal.net> wrote:
> On Wednesday, January 15, 2014 12:40:33 AM UTC-8, Peter Otten wrote:
>> Personally I feel dirty whenever I write Python code that defeats duck-
>> typing -- so I would not /recommend/ any isinstance() check.
>
> While I am inclined to agree, I have yet to see a solution to the problem of flattening nested lists/tuples which avoids isinstance().  If anyone has written one, I would like to see it, and consider its merits.

As long as you're the one that created the nested list structure, you
can choose to create a different structure instead, one which doesn't
require typechecking values inside your structure.

For example, os.walk has a similar kind of problem; it uses separate
lists for the subdirectories and the rest of the files, rather than
requiring you to check each child to see if it is a directory. It can
do it this way because it doesn't need to preserve the interleaved
order of directories and files, but there are other solutions for you if
you do want to preserve that order. (Although they won't be as clean
as they would be in a language with ADTs.)
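A minimal sketch of that idea, using a made-up Node class rather than anything from the thread: leaf values and sub-nodes live in separate lists, much as os.walk reports files and subdirectories separately, so flattening never has to inspect the type of an element.

```python
class Node:
    """Hypothetical tree node: leaves and sub-nodes are kept apart."""

    def __init__(self, values=(), children=()):
        self.values = list(values)      # atoms stored at this level
        self.children = list(children)  # nested Node objects


def flatten(node):
    """Yield every value in the tree, this node's own values first."""
    yield from node.values
    for child in node.children:
        yield from flatten(child)


tree = Node([1, 2], [Node([3, 4]), Node([5], [Node([6], [Node([7])])])])
print(list(flatten(tree)))  # [1, 2, 3, 4, 5, 6, 7]
```

As noted above, this layout gives up the interleaved order of values and sub-structures within one level.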

-- Devin



#63966

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2014-01-15 09:10 +0000
Message-ID: <mailman.5496.1389777056.18130.python-list@python.org>
In reply to: #63955

On 15/01/2014 01:27, Steven D'Aprano wrote:
> Over on the Python-Dev mailing list, there is an ENORMOUS multi-thread
> discussion involving at least two PEPs, about bytes/str compatibility.
> But I don't want to talk about that. (Oh gods, I *really* don't want to
> talk about that...)

+ trillions

>
> In the midst of that discussion, Guido van Rossum made a comment about
> subclassing dicts:
>
>      [quote]
>      From: Guido van Rossum <guido@python.org>
>      Date: Tue, 14 Jan 2014 12:06:32 -0800
>      Subject: Re: [Python-Dev] PEP 460 reboot
>
>      Personally I wouldn't add any words suggesting or referring
>      to the option of creation another class for this purpose. You
>      wouldn't recommend subclassing dict for constraining the
>      types of keys or values, would you?
>      [end quote]
>
> https://mail.python.org/pipermail/python-dev/2014-January/131537.html
>
> This surprises me, and rather than bother Python-Dev (where it will
> likely be lost in the noise, and certainly will be off-topic), I'm hoping
> there may be someone here who is willing to attempt to channel GvR. I
> would have thought that subclassing dict for the purpose of constraining
> the type of keys or values would be precisely an excellent use of
> subclassing.

Exactly what I was thinking.

>
>
> class TextOnlyDict(dict):
>      def __setitem__(self, key, value):
>          if not isinstance(key, str):
>              raise TypeError
>          super().__setitem__(key, value)
>      # need to override more methods too
>
>
> But reading Guido, I think he's saying that wouldn't be a good idea. I
> don't get it -- it's not a violation of the Liskov Substitution
> Principle, because it's more restrictive, not less. What am I missing?
>
>

Couple of replies I noted from Ned Batchelder and Terry Reedy.  Smacked 
bottom for Peter Otten, how dare he? :)

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence



#63969

From: Tim Chase <python.list@tim.thechases.com>
Date: 2014-01-15 05:03 -0600
Message-ID: <mailman.5499.1389783753.18130.python-list@python.org>
In reply to: #63955

On 2014-01-15 01:27, Steven D'Aprano wrote:
> class TextOnlyDict(dict):
>     def __setitem__(self, key, value):
>         if not isinstance(key, str):
>             raise TypeError
>         super().__setitem__(key, value)
>     # need to override more methods too
> 
> 
> But reading Guido, I think he's saying that wouldn't be a good
> idea. I don't get it -- it's not a violation of the Liskov
> Substitution Principle, because it's more restrictive, not less.
> What am I missing?

Just as an observation, this seems almost exactly what anydbm does,
behaving like a dict (whether it inherits from dict, or just
duck-types like a dict), but with the limitation that keys/values need
to be strings.

-tkc



#64041

From: Daniel da Silva <var.mail.daniel@gmail.com>
Date: 2014-01-15 19:50 -0500
Message-ID: <mailman.5562.1389840613.18130.python-list@python.org>
In reply to: #63955

On Tue, Jan 14, 2014 at 8:27 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
>
> But reading Guido, I think he's saying that wouldn't be a good idea. I
> don't get it -- it's not a violation of the Liskov Substitution
> Principle, because it's more restrictive, not less. What am I missing?
>

Just to be pedantic, this *is* a violation of the Liskov Substitution
Principle. According to Wikipedia, the principle states:

> if S is a subtype of T, then objects of type T may be replaced
> with objects of type S (i.e., objects of type S may be *substituted* for
> objects of type T) without altering any of the desirable properties of that
> program (correctness, task performed, etc.) [0]

[0] http://en.wikipedia.org/wiki/Liskov_substitution_principle


Since S (TextOnlyDict) is more restrictive, it cannot be substituted for T
(dict), because the program may be using non-string keys.
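A small sketch of that argument: the function count_lengths is invented for this example and stands for any existing code written against plain dict. It works with a dict but breaks when handed the TextOnlyDict from the original post, because it stores integer keys.

```python
class TextOnlyDict(dict):
    # Same class as in the original post.
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("keys must be str")
        super().__setitem__(key, value)


def count_lengths(words, counts):
    """Hypothetical client code: tally word lengths into any dict-like `counts`."""
    for word in words:
        n = len(word)
        counts[n] = counts.get(n, 0) + 1  # int keys: fine for a plain dict
    return counts


print(count_lengths(["a", "bb", "cc"], {}))  # {1: 1, 2: 2}

try:
    count_lengths(["a", "bb", "cc"], TextOnlyDict())
except TypeError as exc:
    print("substitution failed:", exc)
```

Whether this counts as a real problem depends on whether any such client code exists in the program at hand.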


Daniel



#64049

From: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Date: 2014-01-16 17:17 +1300
Message-ID: <bjp4qqFi2tjU1@mid.individual.net>
In reply to: #64041

Daniel da Silva wrote:

> Just to be pedantic, this /is/ a violation of the Liskov Substitution 
> Principle. According to Wikipedia, the principle states:
> 
>      if S is a subtype <http://en.wikipedia.org/wiki/Subtype> of T, then
>     objects of type <http://en.wikipedia.org/wiki/Datatype> T may be
>     replaced with objects of type S (i.e., objects of type S may
>     be /substituted/ for objects of type T) without altering any of the
>     desirable properties of that program

Something everyone seems to miss when they quote the LSP
is that what the "desirable properties of the program" are
*depends on the program*.

Whenever you create a subclass, there is always *some*
difference between the behaviour of the subclass and
the base class, otherwise there would be no point in
having the subclass. Whether that difference has any
bad consequences for the program depends on what the
program does with the objects.

So you can't just look at S and T in isolation and
decide whether they satisfy the LSP or not. You need
to consider them in context.

In Python, there's a special problem with subclassing
dicts in particular: some of the core interpreter code
assumes a plain dict and bypasses the lookup of
__getitem__ and __setitem__, going straight to the
C-level implementations. If you tried to use a dict
subclass in that context that overrode those methods,
your overridden versions wouldn't get called.
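The interpreter-internal cases are hard to show briefly, but the same kind of bypass is visible in dict's own C-implemented methods, which is a related effect and easier to demonstrate. In this sketch UpperDict is a made-up subclass that upper-cases values in both __setitem__ and __getitem__; update(), get() and values() go straight to the underlying storage and never call either override.

```python
class UpperDict(dict):
    """Hypothetical subclass: upper-cases values on the way in and on the way out."""

    def __setitem__(self, key, value):
        super().__setitem__(key, value.upper())

    def __getitem__(self, key):
        return super().__getitem__(key).upper()


d = UpperDict()
d["a"] = "x"        # goes through __setitem__, stored as "X"
d.update(b="y")     # C-level update(): __setitem__ is skipped, stored as "y"

print(d["b"])            # "Y": subscripting calls the overridden __getitem__
print(d.get("b"))        # "y": get() does not call __getitem__
print(list(d.values()))  # ['X', 'y']
```

If nothing in the program uses those paths, or if the subclass overrides them as well, the overrides behave as expected.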

But if you never use your dict subclass in that way,
there is no problem. Or if you don't override those
particular methods, there's no problem either.

If you're giving advice to someone who isn't aware
of all the fine details, "don't subclass dict" is
probably the safest thing to say. But there are
legitimate use cases for it if you know what you're
doing.

The other issue is that people are often tempted to
subclass dict in order to implement what isn't really
a dict at all, but just a custom mapping type. The
downside to that is that you end up inheriting a
bunch of dict-specific methods that don't really
make sense for your type. In that case it's usually
better to start with a fresh class that *uses* a
dict as part of its implementation, and only
exposes the methods that are really needed.
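A rough sketch of that approach, with an invented Registry class: it keeps a dict as a private attribute and exposes only the operations this particular type needs, so methods such as update(), pop() or fromkeys() never become part of its interface.

```python
class Registry:
    """Hypothetical custom mapping: str names only, write-once, no dict API."""

    def __init__(self):
        self._items = {}  # the dict is an implementation detail

    def register(self, name, obj):
        if not isinstance(name, str):
            raise TypeError("name must be str")
        if name in self._items:
            raise ValueError("%r is already registered" % name)
        self._items[name] = obj

    def lookup(self, name):
        return self._items[name]

    def __contains__(self, name):
        return name in self._items

    def __len__(self):
        return len(self._items)


registry = Registry()
registry.register("json", object())
print("json" in registry, len(registry))  # True 1
print(hasattr(registry, "update"))        # False
```

Because every write goes through register(), there is only one place where the key rule has to be enforced.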

-- 
Greg


