

Groups > comp.lang.python > #12458 > unrolled thread

Handling 2.7 and 3.0 Versions of Dict

Started by: Travis Parks <jehugaleahsa@gmail.com>
First post: 2011-08-30 18:43 -0700
Last post: 2011-09-02 17:29 -0300
Articles: 10, 6 participants



Contents

  Handling 2.7 and 3.0 Versions of Dict Travis Parks <jehugaleahsa@gmail.com> - 2011-08-30 18:43 -0700
    Re: Handling 2.7 and 3.0 Versions of Dict Terry Reedy <tjreedy@udel.edu> - 2011-08-30 23:33 -0400
    Re: Handling 2.7 and 3.0 Versions of Dict "Martin v. Loewis" <martin@v.loewis.de> - 2011-08-31 11:55 +0200
      Re: Handling 2.7 and 3.0 Versions of Dict Ian Kelly <ian.g.kelly@gmail.com> - 2011-08-31 09:44 -0600
        Re: Handling 2.7 and 3.0 Versions of Dict Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2011-09-01 11:37 +1200
          Re: Handling 2.7 and 3.0 Versions of Dict Travis Parks <jehugaleahsa@gmail.com> - 2011-08-31 18:28 -0700
            Re: Handling 2.7 and 3.0 Versions of Dict "Gabriel Genellina" <gagsl-py2@yahoo.com.ar> - 2011-09-02 13:36 -0300
              Re: Handling 2.7 and 3.0 Versions of Dict Travis Parks <jehugaleahsa@gmail.com> - 2011-09-02 09:53 -0700
                Re: Handling 2.7 and 3.0 Versions of Dict Terry Reedy <tjreedy@udel.edu> - 2011-09-02 16:22 -0400
                Re: Handling 2.7 and 3.0 Versions of Dict "Gabriel Genellina" <gagsl-py2@yahoo.com.ar> - 2011-09-02 17:29 -0300

#12458 — Handling 2.7 and 3.0 Versions of Dict

From: Travis Parks <jehugaleahsa@gmail.com>
Date: 2011-08-30 18:43 -0700
Subject: Handling 2.7 and 3.0 Versions of Dict
Message-ID: <50924476-9ada-487e-bd4b-5b77bce11d5b@gz5g2000vbb.googlegroups.com>

I am writing a simple algorithms library that I want to work for both
Python 2.7 and 3.x. I am writing some functions like distinct, which
work with dictionaries under the hood. The problem I ran into is that
I am calling itervalues or values depending on which version of the
language I am working in. Here is the code I wrote to overcome it:

import sys
def getDictValuesFoo():
    if sys.version_info < (3,):
        return dict.itervalues
    else:
        return dict.values

getValues = getDictValuesFoo()

def distinct(iterable, keySelector = (lambda x: x)):
    lookup = {}
    for item in iterable:
        key = keySelector(item)
        if key not in lookup:
            lookup[key] = item
    return getValues(lookup)

I was surprised to learn that getValues CANNOT be called as if it were
a member of dict. I figured it was more efficient to determine what
getValues was once rather than every time it was needed.

First, how can I make the method getValues "private" _and_ so it only
gets evaluated once? Secondly, will the body of the distinct method be
evaluated immediately? How can I delay building the dict until the
first value is requested?

I noticed that hashing is a lot different in Python than it is in .NET
languages. .NET supports custom "equality comparers" that can override
a type's Equals and GetHashCode functions. This is nice when you can't
change the class you are hashing. That is why I am using a key
selector in my code, here. Is there a better way of overriding the
default hashing of a type without actually modifying its definition? I
figured requesting a key was the easiest way.



#12459

From: Terry Reedy <tjreedy@udel.edu>
Date: 2011-08-30 23:33 -0400
Message-ID: <mailman.591.1314761672.27778.python-list@python.org>
In reply to: #12458

On 8/30/2011 9:43 PM, Travis Parks wrote:
> I am writing a simple algorithms library that I want to work for both
> Python 2.7 and 3.x. I am writing some functions like distinct, which
> work with dictionaries under the hood. The problem I ran into is that
> I am calling itervalues or values depending on which version of the
> language I am working in. Here is the code I wrote to overcome it:
>
> import sys
> def getDictValuesFoo():
>     if sys.version_info < (3,):
>         return dict.itervalues
>     else:
>         return dict.values

One alternative is to use itervalues and have 2to3 translate for you.
-- 
Terry Jan Reedy



#12474

From: "Martin v. Loewis" <martin@v.loewis.de>
Date: 2011-08-31 11:55 +0200
Message-ID: <4E5E051B.8060002@v.loewis.de>
In reply to: #12458

On 31.08.2011 03:43, Travis Parks wrote:
> I am writing a simple algorithms library that I want to work for both
> Python 2.7 and 3.x. I am writing some functions like distinct, which
> work with dictionaries under the hood. The problem I ran into is that
> I am calling itervalues or values depending on which version of the
> language I am working in. Here is the code I wrote to overcome it:
> 
> import sys
> def getDictValuesFoo():
>     if sys.version_info < (3,):
>         return dict.itervalues
>     else:
>         return dict.values
> 
> getValues = getDictValuesFoo()
> 
> def distinct(iterable, keySelector = (lambda x: x)):
>     lookup = {}
>     for item in iterable:
>         key = keySelector(item)
>         if key not in lookup:
>             lookup[key] = item
>     return getValues(lookup)
> 
> I was surprised to learn that getValues CANNOT be called as if it were
> a member of dict. I figured it was more efficient to determine what
> getValues was once rather than every time it was needed.
> 
> First, how can I make the method getValues "private" _and_ so it only
> gets evaluated once?

Not sure what "private" means here. Selecting the logic only once
goes like this:

if sys.version_info < (3,):
    def getDictValues(dict):
        return dict.itervalues()
else:
    def getDictValues(dict):
        return dict.values()

> Secondly, will the body of the distinct method be
> evaluated immediately?

Yes.

> How can I delay building the dict until the first value is requested?

Make it a generator:

def distinct(iterable, keySelector = (lambda x: x)):
    lookup = {}
    for item in iterable:
        key = keySelector(item)
        if key not in lookup:
            lookup[key] = item
    for v in getValues(lookup):
        yield v

This delays *building* the dictionary until the *first* value is
requested. I.e. it completes building the dictionary before the first
value is returned.
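
To see that laziness concretely, here is a minimal sketch. The names
source and consumed are made up for illustration, and it calls
Python 3's values() directly instead of getValues for brevity:

```python
def distinct(iterable, key_selector=lambda x: x):
    lookup = {}
    for item in iterable:
        key = key_selector(item)
        if key not in lookup:
            lookup[key] = item
    for value in lookup.values():
        yield value

consumed = []  # records every item pulled from the source

def source():
    for n in [1, 2, 2, 3]:
        consumed.append(n)
        yield n

gen = distinct(source())            # no code in distinct has run yet
consumed_before = list(consumed)    # still empty at this point
first = next(gen)                   # builds the whole dict, then yields
consumed_after = list(consumed)     # the entire source has been read
```

Calling distinct() only creates the generator object; the dictionary
is built on the first next() call, and it is built completely before
that first value comes back.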

If you also want to interleave iteration over iterable with fetching
distinct values, write it like this:

def distinct(iterable, keySelector = (lambda x: x)):
    seen = {}
    for item in iterable:
        key = keySelector(item)
        if key not in seen:
            yield item
            seen[key] = item

> I noticed that hashing is a lot different in Python than it is in .NET
> languages. .NET supports custom "equality comparers" that can override
> a type's Equals and GetHashCode functions. This is nice when you can't
> change the class you are hashing. That is why I am using a key
> selector in my code, here. Is there a better way of overriding the
> default hashing of a type without actually modifying its definition? I
> figured a requesting a key was the easiest way.

You could provide a Key class that takes a value, a hash function,
and an equality function:

class Key:
  def __init__(self, value, hash, eq):
    self.value, self.hash, self.eq = value, hash, eq
  def __hash__(self):
    return self.hash(self.value)
  def __eq__(self, other_key):
    return self.eq(self.value, other_key.value)

This class would then be used instead of your keySelector.

With that, you could change the dictionary to a set. Actually, you
could already do so in the second generator version:

def distinct(iterable, keySelector = (lambda x: x)):
    seen = set()
    for item in iterable:
        key = keySelector(item)
        if key not in seen:
            yield item
            seen.add(key) # item is not needed anymore
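
As a usage sketch of this set-based version (the word list and the
choice of str.lower as key selector are only illustrative assumptions):

```python
def distinct(iterable, keySelector=(lambda x: x)):
    seen = set()
    for item in iterable:
        key = keySelector(item)
        if key not in seen:
            yield item
            seen.add(key)  # only the key is remembered, not the item

words = ["Apple", "apple", "Banana", "BANANA", "cherry"]

# Case-insensitive de-duplication: the first spelling seen wins.
unique_words = list(distinct(words, keySelector=str.lower))
```

The key selector plays the role of the custom equality comparer here:
two items count as equal when their keys are equal, and the class of
the items is never touched.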

HTH,
Martin



#12491

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2011-08-31 09:44 -0600
Message-ID: <mailman.611.1314805519.27778.python-list@python.org>
In reply to: #12474

On Wed, Aug 31, 2011 at 3:55 AM, Martin v. Loewis <martin@v.loewis.de> wrote:
> if sys.version_info < (3,):
>  def getDictValues(dict):
>      return dict.itervalues()
> else:
>  def getDictValues(dict):
>      return dict.values()

The extra level of function call indirection is unnecessary here.
Better to write it as:

if sys.version_info < (3,):
    getDictValues = dict.itervalues
else:
    getDictValues = dict.values

(which is basically what the OP was doing in the first place).

>> I noticed that hashing is a lot different in Python than it is in .NET
>> languages. .NET supports custom "equality comparers" that can override
>> a type's Equals and GetHashCode functions. This is nice when you can't
>> change the class you are hashing. That is why I am using a key
>> selector in my code, here. Is there a better way of overriding the
>> default hashing of a type without actually modifying its definition? I
>> figured a requesting a key was the easiest way.
>
> You could provide a Key class that takes a hash function and a value
> function:
>
> class Key:
>  def __init__(self, value, hash, eq):
>    self.value, self.hash, self.eq = value, hash, eq
>  def __hash__(self):
>    return self.hash(self.value)
>  def __eq__(self, other_key):
>    return self.eq(self.value, other_key.value)
>
> This class would then be used instead of your keySelector.

For added value, you can make it a class factory so you don't have to
specify hash and eq over and over:

def Key(keyfunc):
    class Key:
        def __init__(self, value):
            self.value = value
        def __hash__(self):
            return hash(keyfunc(self.value))
        def __eq__(self, other):
            return keyfunc(self.value) == keyfunc(other.value)
    return Key

KeyTypeAlpha = Key(lambda x: x % 7)

items = set(KeyTypeAlpha(value) for value in sourceIterable)

Cheers,
Ian



#12533

From: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Date: 2011-09-01 11:37 +1200
Message-ID: <9c7utuF8q8U1@mid.individual.net>
In reply to: #12491

Ian Kelly wrote:

> if sys.version_info < (3,):
>     getDictValues = dict.itervalues
> else:
>     getDictValues = dict.values
> 
> (which is basically what the OP was doing in the first place).

And which he seemed to think didn't work for some
reason, but it seems fine as far as I can tell:

Python 2.7 (r27:82500, Oct 15 2010, 21:14:33)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
 >>> gv = dict.itervalues
 >>> d = {1:'a', 2:'b'}
 >>> gv(d)
<dictionary-valueiterator object at 0x2aa210>

% python3.1
Python 3.1.2 (r312:79147, Mar  2 2011, 17:43:12)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
 >>> gv = dict.values
 >>> d = {1:'a', 2:'b'}
 >>> gv(d)
dict_values(['a', 'b'])

-- 
Greg



#12539

From: Travis Parks <jehugaleahsa@gmail.com>
Date: 2011-08-31 18:28 -0700
Message-ID: <93c5378e-03f8-4764-96e8-2c3e8568baec@z8g2000yqe.googlegroups.com>
In reply to: #12533

On Aug 31, 7:37 pm, Gregory Ewing <greg.ew...@canterbury.ac.nz> wrote:
> Ian Kelly wrote:
> > if sys.version_info < (3,):
> >     getDictValues = dict.itervalues
> > else:
> >     getDictValues = dict.values
>
> > (which is basically what the OP was doing in the first place).
>
> And which he seemed to think didn't work for some
> reason, but it seems fine as far as I can tell:
>
> Python 2.7 (r27:82500, Oct 15 2010, 21:14:33)
> [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> gv = dict.itervalues
>  >>> d = {1:'a', 2:'b'}
>  >>> gv(d)
> <dictionary-valueiterator object at 0x2aa210>
>
> % python3.1
> Python 3.1.2 (r312:79147, Mar  2 2011, 17:43:12)
> [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> gv = dict.values
>  >>> d = {1:'a', 2:'b'}
>  >>> gv(d)
> dict_values(['a', 'b'])
>
> --
> Greg

My problem was that I didn't understand the scoping rules. It is still
strange to me that the getValues variable is still in scope outside
the if/else branches.



#12650

From: "Gabriel Genellina" <gagsl-py2@yahoo.com.ar>
Date: 2011-09-02 13:36 -0300
Message-ID: <mailman.710.1314981394.27778.python-list@python.org>
In reply to: #12539

On Wed, 31 Aug 2011 22:28:09 -0300, Travis Parks <jehugaleahsa@gmail.com>
wrote:

> On Aug 31, 7:37 pm, Gregory Ewing <greg.ew...@canterbury.ac.nz> wrote:
>> Ian Kelly wrote:
>> > if sys.version_info < (3,):
>> >     getDictValues = dict.itervalues
>> > else:
>> >     getDictValues = dict.values
>>
>> > (which is basically what the OP was doing in the first place).
>>
> My problem was that I didn't understand the scoping rules. It is still
> strange to me that the getValues variable is still in scope outside
> the if/else branches.

Those if/else are at global scope. An 'if' statement does not introduce a  
new scope; so getDictValues, despite being "indented", is defined at  
global scope, and may be used anywhere in the module.
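
A small sketch of that rule (values_of is a made-up helper name, used
only for illustration):

```python
import sys

# The assignment is indented under "if", but "if" opens no new scope,
# so getDictValues becomes an ordinary module-level name.
if sys.version_info < (3,):
    getDictValues = dict.itervalues
else:
    getDictValues = dict.values

def values_of(d):
    # Functions in the module can see the name chosen above.
    return sorted(getDictValues(d))

result = values_of({1: 'a', 2: 'b'})
```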

-- 
Gabriel Genellina



#12658

From: Travis Parks <jehugaleahsa@gmail.com>
Date: 2011-09-02 09:53 -0700
Message-ID: <42b6b7cb-d659-4921-bec9-6a43671848e2@s20g2000yql.googlegroups.com>
In reply to: #12650

On Sep 2, 12:36 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
> On Wed, 31 Aug 2011 22:28:09 -0300, Travis Parks <jehugalea...@gmail.com>
> wrote:
>
> > On Aug 31, 7:37 pm, Gregory Ewing <greg.ew...@canterbury.ac.nz> wrote:
> >> Ian Kelly wrote:
> >> > if sys.version_info < (3,):
> >> >     getDictValues = dict.itervalues
> >> > else:
> >> >     getDictValues = dict.values
>
> >> > (which is basically what the OP was doing in the first place).
>
> > My problem was that I didn't understand the scoping rules. It is still
> > strange to me that the getValues variable is still in scope outside
> > the if/else branches.
>
> Those if/else are at global scope. An 'if' statement does not introduce a  
> new scope; so getDictValues, despite being "indented", is defined at  
> global scope, and may be used anywhere in the module.
>
> --
> Gabriel Genellina
>
>

Does that mean the rules would be different inside a function?



#12672

From: Terry Reedy <tjreedy@udel.edu>
Date: 2011-09-02 16:22 -0400
Message-ID: <mailman.719.1314995014.27778.python-list@python.org>
In reply to: #12658

On 9/2/2011 12:53 PM, Travis Parks wrote:
> On Sep 2, 12:36 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar> wrote:

>> Those if/else are at global scope. An 'if' statement does not introduce a
>> new scope; so getDictValues, despite being "indented", is defined at
>> global scope, and may be used anywhere in the module.

> Does that mean the rules would be different inside a function?

Yes. Inside a function, you would have to add
     global getDictValues
before the if statement in order for the assignments to have global effect.
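
A rough sketch of the difference. The function names choose_local and
choose_global and the variable get_values_fn are invented for this
example:

```python
import sys

def choose_local():
    # Without a global declaration these assignments create a local
    # variable that disappears when the function returns.
    if sys.version_info < (3,):
        get_values_fn = dict.itervalues
    else:
        get_values_fn = dict.values

def choose_global():
    global get_values_fn  # assignments below now bind the module name
    if sys.version_info < (3,):
        get_values_fn = dict.itervalues
    else:
        get_values_fn = dict.values

choose_local()
defined_after_local = 'get_values_fn' in globals()

choose_global()
defined_after_global = 'get_values_fn' in globals()
```

After choose_local() the module still has no get_values_fn; only the
version with the global statement defines it at module level.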

-- 
Terry Jan Reedy



#12673

From: "Gabriel Genellina" <gagsl-py2@yahoo.com.ar>
Date: 2011-09-02 17:29 -0300
Message-ID: <mailman.720.1314995369.27778.python-list@python.org>
In reply to: #12658

On Fri, 02 Sep 2011 13:53:37 -0300, Travis Parks <jehugaleahsa@gmail.com>
wrote:

> On Sep 2, 12:36 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
> wrote:
>> On Wed, 31 Aug 2011 22:28:09 -0300, Travis Parks
>> <jehugalea...@gmail.com> wrote:
>>
>> > On Aug 31, 7:37 pm, Gregory Ewing <greg.ew...@canterbury.ac.nz> wrote:
>> >> Ian Kelly wrote:
>> >> > if sys.version_info < (3,):
>> >> >     getDictValues = dict.itervalues
>> >> > else:
>> >> >     getDictValues = dict.values
>>
>> >> > (which is basically what the OP was doing in the first place).
>>
>> > My problem was that I didn't understand the scoping rules. It is still
>> > strange to me that the getValues variable is still in scope outside
>> > the if/else branches.
>>
>> Those if/else are at global scope. An 'if' statement does not introduce  
>> a new scope; so getDictValues, despite being "indented", is defined at  
>> global scope, and may be used anywhere in the module.
>
> Does that mean the rules would be different inside a function?

Yes: a function body *does* create a new scope, as does a class
statement. See
http://docs.python.org/reference/executionmodel.html

-- 
Gabriel Genellina
