Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #90578 > unrolled thread
| Started by | BartC <bc@freeuk.com> |
|---|---|
| First post | 2015-05-13 20:36 +0100 |
| Last post | 2015-05-17 14:41 +0100 |
| Articles | 17 on this page of 57 — 13 participants |
Back to article view | Back to comp.lang.python
Building CPython BartC <bc@freeuk.com> - 2015-05-13 20:36 +0100
Re: Building CPython Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-05-13 20:58 +0100
Re: Building CPython Terry Reedy <tjreedy@udel.edu> - 2015-05-13 18:34 -0400
Re: Building CPython BartC <bc@freeuk.com> - 2015-05-14 16:51 +0100
Re: Building CPython Chris Angelico <rosuav@gmail.com> - 2015-05-15 02:09 +1000
Re: Building CPython BartC <bc@freeuk.com> - 2015-05-14 18:02 +0100
Re: Building CPython Dave Angel <davea@davea.name> - 2015-05-14 13:10 -0400
Re: Building CPython Chris Angelico <rosuav@gmail.com> - 2015-05-15 03:11 +1000
Re: Building CPython BartC <bc@freeuk.com> - 2015-05-14 18:32 +0100
Re: Building CPython Chris Angelico <rosuav@gmail.com> - 2015-05-15 03:45 +1000
Re: Building CPython Terry Reedy <tjreedy@udel.edu> - 2015-05-14 14:50 -0400
Re: Building CPython Christian Gollwitzer <auriocus@gmx.de> - 2015-05-15 12:51 +0200
Re: Building CPython Terry Reedy <tjreedy@udel.edu> - 2015-05-15 17:19 -0400
Re: Building CPython Marko Rauhamaa <marko@pacujo.net> - 2015-05-14 19:29 +0300
Re: Building CPython BartC <bc@freeuk.com> - 2015-05-14 22:55 +0100
Re: Building CPython MRAB <python@mrabarnett.plus.com> - 2015-05-14 23:19 +0100
Re: Building CPython BartC <bc@freeuk.com> - 2015-05-15 01:50 +0100
Re: Building CPython Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-05-15 18:05 +1200
Re: Building CPython Marko Rauhamaa <marko@pacujo.net> - 2015-05-15 11:59 +0300
Re: Building CPython wxjmfauth@gmail.com - 2015-05-15 02:07 -0700
Re: Building CPython Marko Rauhamaa <marko@pacujo.net> - 2015-05-15 12:20 +0300
Re: Building CPython wxjmfauth@gmail.com - 2015-05-15 02:51 -0700
Re: Building CPython Marko Rauhamaa <marko@pacujo.net> - 2015-05-15 13:52 +0300
Re: Building CPython Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-05-15 22:10 +1000
Re: Building CPython Chris Angelico <rosuav@gmail.com> - 2015-05-15 22:34 +1000
Re: Building CPython wxjmfauth@gmail.com - 2015-05-15 07:11 -0700
Re: Building CPython Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-05-15 13:41 +0100
Re: Building CPython Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-05-15 13:38 +0100
Re: Building CPython Chris Angelico <rosuav@gmail.com> - 2015-05-15 19:43 +1000
Re: Building CPython Marko Rauhamaa <marko@pacujo.net> - 2015-05-15 13:50 +0300
Re: Building CPython Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-05-15 22:43 +1000
Re: Building CPython Ian Kelly <ian.g.kelly@gmail.com> - 2015-05-15 09:00 -0600
Re: Building CPython Ian Kelly <ian.g.kelly@gmail.com> - 2015-05-15 09:04 -0600
Re: Building CPython Chris Angelico <rosuav@gmail.com> - 2015-05-16 01:06 +1000
Re: Building CPython Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-05-15 20:14 +1000
Re: Building CPython Chris Angelico <rosuav@gmail.com> - 2015-05-15 20:25 +1000
Re: Building CPython Terry Reedy <tjreedy@udel.edu> - 2015-05-15 17:05 -0400
Re: Building CPython BartC <bc@freeuk.com> - 2015-05-15 22:54 +0100
Re: Building CPython Marko Rauhamaa <marko@pacujo.net> - 2015-05-16 01:44 +0300
Re: Building CPython Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-05-16 00:27 +0100
Re: Building CPython Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-05-16 11:55 +1000
Re: Building CPython Chris Angelico <rosuav@gmail.com> - 2015-05-16 12:15 +1000
Re: Building CPython Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-05-16 03:17 +0100
Re: Building CPython BartC <bc@freeuk.com> - 2015-05-16 01:43 +0100
Re: Building CPython MRAB <python@mrabarnett.plus.com> - 2015-05-16 02:16 +0100
Re: Building CPython Marko Rauhamaa <marko@pacujo.net> - 2015-05-16 11:08 +0300
Re: Building CPython Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-05-16 19:40 +1000
Re: Building CPython Marko Rauhamaa <marko@pacujo.net> - 2015-05-16 16:59 +0300
Re: Building CPython Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-05-17 04:18 +1000
Re: Building CPython Marko Rauhamaa <marko@pacujo.net> - 2015-05-16 21:55 +0300
Re: Building CPython Marko Rauhamaa <marko@pacujo.net> - 2015-05-17 01:51 +0300
Re: Building CPython Marko Rauhamaa <marko@pacujo.net> - 2015-05-17 23:49 +0300
Re: Building CPython Terry Reedy <tjreedy@udel.edu> - 2015-05-15 19:54 -0400
Re: Building CPython BartC <bc@freeuk.com> - 2015-05-15 10:32 +0100
Re: Building CPython Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-05-16 12:55 +1200
Re: Building CPython Jonas Wielicki <jonas@wielicki.name> - 2015-05-17 14:25 +0200
Re: Building CPython BartC <bc@freeuk.com> - 2015-05-17 14:41 +0100
Page 3 of 3 — ← Prev page 1 2 [3]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-05-16 11:55 +1000 |
| Message-ID | <5556a3a5$0$12989$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #90697 |
On Sat, 16 May 2015 09:27 am, Mark Lawrence wrote:
> On 15/05/2015 23:44, Marko Rauhamaa wrote:
>> BartC <bc@freeuk.com>:
>>
>>> What /is/ a method lookup? Is it when you have this:
>>>
>>> A.B()
>>>
>>> and need to find whether the expression A (or its class or type) has a
>>> name B associated with it? (And it then needs to check whether B is
>>> something that can be called.)
>>>
>>> If so, does that have to be done using Python's Dict mechanism? (Ie.
>>> searching for a key 'B' by name and seeing if the object associated
>>> with it is a method. That does not sound efficient.)
It's not as inefficient as you may think. Dicts are hash tables, and hash
tables are a standard computer science data structure for performing very
fast searches at constant (or near constant) time.
The dict is basically an array of pointers to (key, value). To look a name
up in the dict, you hash the string which gives you an index into the
array, then look at that position. If it is blank, you know there is no
match. If it points to a string, you compare that string to your string. If
they are equal, then you have a match. If they aren't equal, you have a
collision, and you have to look elsewhere (details differ) but typically
you don't end up looking in more than one or two positions. So all pretty
fast, and close enough to constant time.
To speed things up even further, I think that the hash value is cached with
the string, so it only needs to be calculated the first time.
>> That is a general feature among high-level programming languages. In
>> Python, it is even more complicated:
>>
>> * first the object's dict is looked up for the method name
>>
>> * if the method is not found (it usually isn't), the dict of the
>> object's class is consulted
>>
>> * if the method is found (it usually is), a function object is
>> instantiated that delegates to the class's method and embeds a "self"
>> reference to the object to the call
It's the other way around. The function object already exists: you created
it by writing `def method(self, *args): ... ` inside the class body. def
always makes a function. It's the *method* object which is created on the
fly, delegating to the function.
>> IOW, two dict lookups plus an object construction for each method call.
>>
>>
>> Marko
>>
>
> As a picture paints a thousand words is anybody aware of a site or sites
> that show this diagramatically, as I think I and possibly others would
> find it far easier to grasp.
No I'm not aware of any such site, but I can try to make it more obvious
with an example.
Suppose we have a hierarchy of classes, starting from the root of the
hierarchy (object) to a specific instance:
class Animal(object):
pass
class Mammal(Animal):
pass
class Dog(Mammal):
def bark(self): ...
laddy = Dog()
We then look up a method:
laddy.bark()
In a non-dynamic language like Java, the compiler knows exactly where bark
is defined ("in the Dog class") and can call it directly. In dynamic
languages like Python, the compiler can't be sure that bark hasn't been
shadowed or overridden at runtime, so it has to search for the first match
found. Simplified:
* Does laddy.__dict__ contain the key "bark"? If so, we have a match.
* For each class in the MRO (Method Resolution Order), namely
[Dog, Mammal, Animal, object], does the class __dict__ contain the
key "bark"? If so, we have a match.
* Do any of those classes in the MRO have a __getattr__ method? If
so, then try calling __getattr__("bark") and see if it returns
a match.
* Give up and raise AttributeError.
[Aside: some of the things I haven't covered: __slots__, __getattribute__,
how metaclasses can affect this.]
In the case of laddy.bark, the matching attribute is found as
Dog.__dict__['bark']:
temp = Dog.__bark__['bark'] # laddy.bark, is a function
At this point, the descriptor protocol applies. You can ignore this part if
you like, and just pretend that laddy.bark returns a method instead of a
function, but if you want to know what actually happens in all it's gory
details, it is something like this (again, simplified):
* Does the attribute have a __get__ method? If not, then we just
use the object as-is, with no changes.
* But if it does have a __get__ method, then it is a descriptor and
we call the __get__ method to get the object we actually use.
Since functions are descriptors, we get this:
temp = temp.__get__(laddy, Dog) # returns a method object
and finally we can call the method:
temp() # laddy.bark()
None of these individual operations are particularly expensive, nor are
there a lot of them. For a typical instance, the MRO usually contains only
two or three classes, and __dict__ lookups are fast. Nevertheless, even
though each method lookup is individually *almost* as fast as the sort of
pre-compiled all-but-instantaneous access which Java can do, it all adds
up. So in Java, a long chain of dots:
foo.bar.baz.foobar.spam.eggs.cheese
can be resolved at compile-time, and takes no more time than
foo.cheese
but in Python's case it has to be resolved at run-time, so if you care about
speed, you should try to avoid long chains of dots in performance critical
loops. E.g. instead of:
for x in sequence:
foo.bar.baz.foobar.spam.eggs.cheese(x)
you can write:
cheese = foo.bar.baz.foobar.spam.eggs.cheese
for x in sequence:
cheese(x)
--
Steven
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-05-16 12:15 +1000 |
| Message-ID | <mailman.57.1431742547.17265.python-list@python.org> |
| In reply to | #90709 |
On Sat, May 16, 2015 at 11:55 AM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> but in Python's case it has to be resolved at run-time, so if you care about
> speed, you should try to avoid long chains of dots in performance critical
> loops. E.g. instead of:
>
> for x in sequence:
> foo.bar.baz.foobar.spam.eggs.cheese(x)
>
> you can write:
>
> cheese = foo.bar.baz.foobar.spam.eggs.cheese
> for x in sequence:
> cheese(x)
Like all advice, of course, this should not be taken on its own. Some
code tries to avoid long chains of dots by adding wrapper methods:
for x in sequence:
foo.cheese(x)
where foo.cheese() delegates to self.bar.cheese() and so on down the
line. That, of course, will be far FAR slower, in pretty much any
language (unless the compiler can in-line the code completely, in
which case it's effectively being optimized down to the original
anyway); the dots aren't as bad as you might think :)
Just always remember the one cardinal rule of optimization: MEASURE
FIRST. You have no idea how slow something is until you measure it.
(I'm not addressing my comments to Steven here, who I'm fairly sure is
aware of all of this(!), but it's his post that gave the example that
I'm quoting.)
ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2015-05-16 03:17 +0100 |
| Message-ID | <mailman.59.1431742804.17265.python-list@python.org> |
| In reply to | #90709 |
On 16/05/2015 02:55, Steven D'Aprano wrote:
> On Sat, 16 May 2015 09:27 am, Mark Lawrence wrote:
>
>> On 15/05/2015 23:44, Marko Rauhamaa wrote:
>>> BartC <bc@freeuk.com>:
>>>
>>>> What /is/ a method lookup? Is it when you have this:
>>>>
>>>> A.B()
>>>>
>>>> and need to find whether the expression A (or its class or type) has a
>>>> name B associated with it? (And it then needs to check whether B is
>>>> something that can be called.)
>>>>
>>>> If so, does that have to be done using Python's Dict mechanism? (Ie.
>>>> searching for a key 'B' by name and seeing if the object associated
>>>> with it is a method. That does not sound efficient.)
>
> It's not as inefficient as you may think. Dicts are hash tables, and hash
> tables are a standard computer science data structure for performing very
> fast searches at constant (or near constant) time.
>
> The dict is basically an array of pointers to (key, value). To look a name
> up in the dict, you hash the string which gives you an index into the
> array, then look at that position. If it is blank, you know there is no
> match. If it points to a string, you compare that string to your string. If
> they are equal, then you have a match. If they aren't equal, you have a
> collision, and you have to look elsewhere (details differ) but typically
> you don't end up looking in more than one or two positions. So all pretty
> fast, and close enough to constant time.
>
> To speed things up even further, I think that the hash value is cached with
> the string, so it only needs to be calculated the first time.
>
>
>
>>> That is a general feature among high-level programming languages. In
>>> Python, it is even more complicated:
>>>
>>> * first the object's dict is looked up for the method name
>>>
>>> * if the method is not found (it usually isn't), the dict of the
>>> object's class is consulted
>>>
>>> * if the method is found (it usually is), a function object is
>>> instantiated that delegates to the class's method and embeds a "self"
>>> reference to the object to the call
>
> It's the other way around. The function object already exists: you created
> it by writing `def method(self, *args): ... ` inside the class body. def
> always makes a function. It's the *method* object which is created on the
> fly, delegating to the function.
>
>
>
>>> IOW, two dict lookups plus an object construction for each method call.
>>>
>>>
>>> Marko
>>>
>>
>> As a picture paints a thousand words is anybody aware of a site or sites
>> that show this diagramatically, as I think I and possibly others would
>> find it far easier to grasp.
>
> No I'm not aware of any such site, but I can try to make it more obvious
> with an example.
>
> Suppose we have a hierarchy of classes, starting from the root of the
> hierarchy (object) to a specific instance:
>
> class Animal(object):
> pass
>
> class Mammal(Animal):
> pass
>
> class Dog(Mammal):
> def bark(self): ...
>
> laddy = Dog()
>
>
> We then look up a method:
>
> laddy.bark()
>
> In a non-dynamic language like Java, the compiler knows exactly where bark
> is defined ("in the Dog class") and can call it directly. In dynamic
> languages like Python, the compiler can't be sure that bark hasn't been
> shadowed or overridden at runtime, so it has to search for the first match
> found. Simplified:
>
> * Does laddy.__dict__ contain the key "bark"? If so, we have a match.
>
> * For each class in the MRO (Method Resolution Order), namely
> [Dog, Mammal, Animal, object], does the class __dict__ contain the
> key "bark"? If so, we have a match.
>
> * Do any of those classes in the MRO have a __getattr__ method? If
> so, then try calling __getattr__("bark") and see if it returns
> a match.
>
> * Give up and raise AttributeError.
>
> [Aside: some of the things I haven't covered: __slots__, __getattribute__,
> how metaclasses can affect this.]
>
> In the case of laddy.bark, the matching attribute is found as
> Dog.__dict__['bark']:
>
> temp = Dog.__bark__['bark'] # laddy.bark, is a function
>
> At this point, the descriptor protocol applies. You can ignore this part if
> you like, and just pretend that laddy.bark returns a method instead of a
> function, but if you want to know what actually happens in all it's gory
> details, it is something like this (again, simplified):
>
> * Does the attribute have a __get__ method? If not, then we just
> use the object as-is, with no changes.
>
> * But if it does have a __get__ method, then it is a descriptor and
> we call the __get__ method to get the object we actually use.
>
>
> Since functions are descriptors, we get this:
>
> temp = temp.__get__(laddy, Dog) # returns a method object
>
> and finally we can call the method:
>
> temp() # laddy.bark()
>
>
> None of these individual operations are particularly expensive, nor are
> there a lot of them. For a typical instance, the MRO usually contains only
> two or three classes, and __dict__ lookups are fast. Nevertheless, even
> though each method lookup is individually *almost* as fast as the sort of
> pre-compiled all-but-instantaneous access which Java can do, it all adds
> up. So in Java, a long chain of dots:
>
> foo.bar.baz.foobar.spam.eggs.cheese
>
> can be resolved at compile-time, and takes no more time than
>
> foo.cheese
>
> but in Python's case it has to be resolved at run-time, so if you care about
> speed, you should try to avoid long chains of dots in performance critical
> loops. E.g. instead of:
>
> for x in sequence:
> foo.bar.baz.foobar.spam.eggs.cheese(x)
>
> you can write:
>
> cheese = foo.bar.baz.foobar.spam.eggs.cheese
> for x in sequence:
> cheese(x)
>
Thanks for taking the time over this, I'll study it later today after
hopefully a good night's sleep :)
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence
[toc] | [prev] | [next] | [standalone]
| From | BartC <bc@freeuk.com> |
|---|---|
| Date | 2015-05-16 01:43 +0100 |
| Message-ID | <bpw5x.561870$Zf4.150532@fx22.am4> |
| In reply to | #90696 |
On 15/05/2015 23:44, Marko Rauhamaa wrote: > BartC <bc@freeuk.com>: > >> What /is/ a method lookup? Is it when you have this: >> >> A.B() >> >> and need to find whether the expression A (or its class or type) has a >> name B associated with it? (And it then needs to check whether B is >> something that can be called.) >> >> If so, does that have to be done using Python's Dict mechanism? (Ie. >> searching for a key 'B' by name and seeing if the object associated >> with it is a method. That does not sound efficient.) > > That is a general feature among high-level programming languages. In > Python, it is even more complicated: > > * first the object's dict is looked up for the method name > > * if the method is not found (it usually isn't), the dict of the > object's class is consulted > > * if the method is found (it usually is), a function object is > instantiated that delegates to the class's method and embeds a "self" > reference to the object to the call > > IOW, two dict lookups plus an object construction for each method call. OK, I didn't know that objects have their own set of attributes that are distinct from the class they belong to. I really ought to learn more Python!. (Yet, I have this crazy urge now to create my own bytecode interpreter for, if not exactly Python itself, then an equivalent language. Just to see if I can do any better than CPython, given the same language restraints. Although I'm hampered a little by not knowing Python well enough. Nor OOP, but those are minor details... Anyway it sounds more fun than trying to decipher the layers of macros and conditional code that appear to be the CPython sources.) > IOW, two dict lookups plus an object construction for each method call. I suppose in many cases an object will have no attributes of its own, and so it can rapidly bypass the first lookup. I don't understand the need for an object creation (to represent A.B so that it can call it?) but perhaps such an object can already exist, prepared ready for use. -- Bartc
[toc] | [prev] | [next] | [standalone]
| From | MRAB <python@mrabarnett.plus.com> |
|---|---|
| Date | 2015-05-16 02:16 +0100 |
| Message-ID | <mailman.53.1431739023.17265.python-list@python.org> |
| In reply to | #90701 |
On 2015-05-16 01:43, BartC wrote:
> On 15/05/2015 23:44, Marko Rauhamaa wrote:
>> BartC <bc@freeuk.com>:
>>
>>> What /is/ a method lookup? Is it when you have this:
>>>
>>> A.B()
>>>
>>> and need to find whether the expression A (or its class or type) has a
>>> name B associated with it? (And it then needs to check whether B is
>>> something that can be called.)
>>>
>>> If so, does that have to be done using Python's Dict mechanism? (Ie.
>>> searching for a key 'B' by name and seeing if the object associated
>>> with it is a method. That does not sound efficient.)
>>
>> That is a general feature among high-level programming languages. In
>> Python, it is even more complicated:
>>
>> * first the object's dict is looked up for the method name
>>
>> * if the method is not found (it usually isn't), the dict of the
>> object's class is consulted
>>
>> * if the method is found (it usually is), a function object is
>> instantiated that delegates to the class's method and embeds a "self"
>> reference to the object to the call
>>
>> IOW, two dict lookups plus an object construction for each method call.
>
> OK, I didn't know that objects have their own set of attributes that are
> distinct from the class they belong to. I really ought to learn more
> Python!.
>
> (Yet, I have this crazy urge now to create my own bytecode interpreter
> for, if not exactly Python itself, then an equivalent language. Just to
> see if I can do any better than CPython, given the same language
> restraints.
>
> Although I'm hampered a little by not knowing Python well enough. Nor
> OOP, but those are minor details... Anyway it sounds more fun than
> trying to decipher the layers of macros and conditional code that appear
> to be the CPython sources.)
>
> > IOW, two dict lookups plus an object construction for each method call.
>
> I suppose in many cases an object will have no attributes of its own,
> and so it can rapidly bypass the first lookup. I don't understand the
> need for an object creation (to represent A.B so that it can call it?)
> but perhaps such an object can already exist, prepared ready for use.
>
It's possible to do:
f = A.B
...
f()
so it's necessary to have an object for A.B.
The question is how much you would gain from optimising A.B() as a
special case (increase in speed vs increase in complexity).
[toc] | [prev] | [next] | [standalone]
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2015-05-16 11:08 +0300 |
| Message-ID | <87egmhrll2.fsf@elektro.pacujo.net> |
| In reply to | #90701 |
BartC <bc@freeuk.com>:
> I suppose in many cases an object will have no attributes of its own,
> and so it can rapidly bypass the first lookup.
Almost all objects have quite many instance attributes. That's what
tells objects apart.
> I don't understand the need for an object creation (to represent A.B
> so that it can call it?) but perhaps such an object can already exist,
> prepared ready for use.
Note that almost identical semantics could be achieved without a class.
Thus, these two constructs are almost identical:
class C:
def __init__(self, x):
self.x = x
def square(self):
return self.x * self.x
def cube(self):
return self.x * self.square()
##
class O: pass
def C(x):
o = O()
def square():
return x * x
def cube():
return x * square()
o.square = square
o.cube = cube
return o
IOW, the class is a virtually superfluous concept in Python. Python has
gotten it probably without much thought (other languages at the time had
it). I comes with advantages and disadvantages:
+ improves readability
+ makes objects slightly smaller
+ makes object instantiation slightly faster
- goes against the grain of ducktyping
- makes method calls slower
- makes method call semantics a bit tricky
Marko
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-05-16 19:40 +1000 |
| Message-ID | <555710a8$0$13001$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #90726 |
On Sat, 16 May 2015 06:08 pm, Marko Rauhamaa wrote:
> Note that almost identical semantics could be achieved without a class.
> Thus, these two constructs are almost identical:
[...]
> IOW, the class is a virtually superfluous concept in Python. Python has
> gotten it probably without much thought (other languages at the time had
> it). I comes with advantages and disadvantages:
Your example is effectively just a way of using closures instead of a class
instance.
Almost anything you can do with classes, you can do with closures. The big
advantage of classes over closures is that you have an interface to access
arbitrary class attributes, while you would need a separate closure for
each and every attribute you want access to.
For example, here is sketch:
class K:
def method(self, arg):
return self.spam + arg
k = K()
the_method = k.method # bound method
as a closure becomes:
def make_closure(instance): # instance is equivalent to self above
def method(arg):
return instance.spam + arg
return method
the_method = make_closure(obj) # Some object with a spam field.
The big advantage of a closure is that you have much more strict
encapsulation. The big disadvantage of a closure is that you have much more
strict encapsulation.
> + improves readability
I wouldn't say that.
> + makes objects slightly smaller
>
> + makes object instantiation slightly faster
Are you sure? Have you actually benchmarked this?
> - goes against the grain of ducktyping
>
> - makes method calls slower
>
> - makes method call semantics a bit tricky
A couple more negatives:
- no such thing as inheritance;
- "is-a" relationship tests don't work;
- an unfamiliar idiom for most people;
Also, at least with Python's implementation, a couple of mixed blessings:
± closures are closed against modification;
± internals of the closure are strictly private;
--
Steven
[toc] | [prev] | [next] | [standalone]
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2015-05-16 16:59 +0300 |
| Message-ID | <878ucosjwh.fsf@elektro.pacujo.net> |
| In reply to | #90728 |
Steven D'Aprano <steve+comp.lang.python@pearwood.info>: > A couple more negatives: > > - no such thing as inheritance; Untrue. My simple Scheme object system (125 lines incl. documentation) supports multiple inheritance without classes. Maybe I should port that to Python... > - "is-a" relationship tests don't work; From the ducktyping point of view, that is an advantage. The whole Linnaean categorization of objects is unnecessary ontological chaff. How many times have people here had to advise newcomers not to inspect type and instance relations of objects and just call the method? > - an unfamiliar idiom for most people; That's impossible to ascertain objectively. Java and JavaScript programmers (of all people!) routinely deal with closures. Marko
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-05-17 04:18 +1000 |
| Message-ID | <55578a02$0$13003$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #90732 |
On Sat, 16 May 2015 11:59 pm, Marko Rauhamaa wrote: > Steven D'Aprano <steve+comp.lang.python@pearwood.info>: > >> A couple more negatives: >> >> - no such thing as inheritance; > > Untrue. My simple Scheme object system (125 lines incl. documentation) Ah yes, I've seen Javascript code like that too. Each line is thirty thousand characters long... *wink* > supports multiple inheritance without classes. Maybe I should port that > to Python... I'd like to see it, but somehow I don't think that your "Scheme object system" is another name for "closures". We were talking about closures, weren't we? It sounds like you have implemented a form of prototype-based object programming. The question of whether prototype-OOP has inheritance is an interesting one. Clearly prototypes implement something *like* inheritance, but it is based in delegation or copying. Delegation-based cloning is quite close to class-based inheritance, but copying-based cloning is not. My sense is that I prefer to say that prototypes don't have inheritance in the same sense as classes, but they have something that plays the same role as inheritance. But if you want to call it inheritance, I can't really argue. >> - "is-a" relationship tests don't work; > > From the ducktyping point of view, that is an advantage. The whole > Linnaean categorization of objects is unnecessary ontological chaff. How > many times have people here had to advise newcomers not to inspect type > and instance relations of objects and just call the method? Um, is that a trick question? I don't remember the last time. >> - an unfamiliar idiom for most people; > > That's impossible to ascertain objectively. Java and JavaScript > programmers (of all people!) routinely deal with closures. I don't think so. I can't say for Javascript, but for Java, there's a lot of confusion around closures, they've been described as "evil" with the recommendation not to use them, and people cannot even agree when they were introduced! http://java.dzone.com/articles/whats-wrong-java-8-currying-vs www.javaworld.com/javaworld/jw-06-2008/jw-06-closures.html I mean, people had to *debate* the introduction of closures? There were three competing proposals for them, plus an argument for "don't add them". Some people say closures were added in Java 7, others say closures have been there since the beginning, and James Gosling himself says that Java used inner classes instead of closures but the result was painful... It seems to me that in the Java community, there's a lot of confusion over closures, which are seen as an advanced (and rather scary) feature. Hardly routine. -- Steven
[toc] | [prev] | [next] | [standalone]
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2015-05-16 21:55 +0300 |
| Message-ID | <87zj54qrlw.fsf@elektro.pacujo.net> |
| In reply to | #90749 |
Steven D'Aprano <steve+comp.lang.python@pearwood.info>:
> On Sat, 16 May 2015 11:59 pm, Marko Rauhamaa wrote:
>> supports multiple inheritance without classes. Maybe I should port that
>> to Python...
>
> I'd like to see it, but somehow I don't think that your "Scheme object
> system" is another name for "closures". We were talking about closures,
> weren't we?
Ok, here's a quick port that have barely tried out:
========================================================================
### Simple OO Framework
class _O: pass
def make_object(*procedures, base=None, bases=None):
o = _O()
methods = {}
setattr(o, '%methods', methods)
if base is not None:
inherit_single(o, base)
elif bases is not None:
inherit_multi(o, bases)
for procedure in procedures:
methods[procedure.__name__] = procedure
setattr(o, procedure.__name__, procedure)
return o
def inherit_single(o, base):
methods = getattr(o, '%methods')
for name, method in getattr(base, '%methods').items():
methods[name] = method
setattr(o, name, method)
def inherit_multi(o, bases):
for base in bases:
inherit_single(o, base)
### Used as follows
def TCPClient():
def connect(socket_address):
...
return make_object(connect)
def SMTPClient():
tcp_client = TCPClient()
def connect(host):
tcp_client.connect((host, 25))
def send_message(message):
...
return make_object(send_message, base=tcp_client)
client = SMTPClient()
client.connect('mail.example.com')
========================================================================
> I mean, people had to *debate* the introduction of closures? There were
> three competing proposals for them, plus an argument for "don't add them".
> Some people say closures were added in Java 7, others say closures have
> been there since the beginning, and James Gosling himself says that Java
> used inner classes instead of closures but the result was painful...
I'm with those who say anonymous and named inner classes have been there
forever and serve the purpose of closures. Yes, Java's boilerplate
requirements are painful, but if you don't like that, use Python.
> It seems to me that in the Java community, there's a lot of confusion over
> closures, which are seen as an advanced (and rather scary) feature. Hardly
> routine.
Importantly, anonymous inner classes have been in active use by Java
newbs from day one.
Marko
[toc] | [prev] | [next] | [standalone]
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2015-05-17 01:51 +0300 |
| Message-ID | <87lhgoqgpv.fsf@elektro.pacujo.net> |
| In reply to | #90750 |
Marko Rauhamaa <marko@pacujo.net>:
> Ok, here's a quick port that have barely tried out:
And here's a more complete port (with some possible dunder abuse):
========================================================================
### Simple OO Framework
class _O: pass
def make_object(*procedures, base=None, bases=None):
o = _O()
methods = {}
o.__methods__ = methods
o.__derived__ = None
if base is not None:
_inherit_single(o, base)
elif bases is not None:
_inherit_multi(o, bases)
for procedure in procedures:
methods[procedure.__name__] = procedure
def method(*args, __procedure__=procedure, __dispatch__=True, **kwargs):
if not __dispatch__ or o.__derived__ is None:
return __procedure__(*args, **kwargs)
derived = o
while derived.__derived__ is not None:
derived = derived.__derived__
return getattr(derived, __procedure__.__name__)(*args, **kwargs)
setattr(o, procedure.__name__, method)
return o
def _inherit_single(o, base):
methods = o.__methods__
for name, method in base.__methods__.items():
methods[name] = method
setattr(o, name, method)
def _inherit_multi(o, bases):
for base in bases:
_inherit_single(o, base)
def delegate(method, *args, **kwargs):
return method(*args, __dispatch__=False, **kwargs)
### Used as follows
def TCPClient():
def connect(address):
pass
def shut_down():
pass
return make_object(connect, shut_down)
def SMTPClient():
tcp_client = TCPClient()
def connect(address):
delegate(tcp_client.connect, address)
do_stuff()
def send_message(message):
pass
return make_object(connect, send_message, base=tcp_client)
client = SMTPClient()
client.connect(None)
========================================================================
Marko
[toc] | [prev] | [next] | [standalone]
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2015-05-17 23:49 +0300 |
| Message-ID | <87mw13orp0.fsf@elektro.pacujo.net> |
| In reply to | #90759 |
Iterating. A line still missing:
> ========================================================================
> ### Simple OO Framework
>
> class _O: pass
>
> def make_object(*procedures, base=None, bases=None):
> o = _O()
> methods = {}
> o.__methods__ = methods
> o.__derived__ = None
> if base is not None:
> _inherit_single(o, base)
> elif bases is not None:
> _inherit_multi(o, bases)
> for procedure in procedures:
> methods[procedure.__name__] = procedure
> def method(*args, __procedure__=procedure, __dispatch__=True, **kwargs):
> if not __dispatch__ or o.__derived__ is None:
> return __procedure__(*args, **kwargs)
> derived = o
> while derived.__derived__ is not None:
> derived = derived.__derived__
> return getattr(derived, __procedure__.__name__)(*args, **kwargs)
> setattr(o, procedure.__name__, method)
> return o
>
> def _inherit_single(o, base):
> methods = o.__methods__
> for name, method in base.__methods__.items():
> methods[name] = method
> setattr(o, name, method)
base.__derived__ = o
>
> def _inherit_multi(o, bases):
> for base in bases:
> _inherit_single(o, base)
>
> def delegate(method, *args, **kwargs):
> return method(*args, __dispatch__=False, **kwargs)
>
> ### Used as follows
>
> def TCPClient():
> def connect(address):
> pass
> def shut_down():
> pass
> return make_object(connect, shut_down)
>
> def SMTPClient():
> tcp_client = TCPClient()
> def connect(address):
> delegate(tcp_client.connect, address)
> do_stuff()
> def send_message(message):
> pass
> return make_object(connect, send_message, base=tcp_client)
>
> client = SMTPClient()
> client.connect(None)
> ========================================================================
[toc] | [prev] | [next] | [standalone]
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2015-05-15 19:54 -0400 |
| Message-ID | <mailman.51.1431734090.17265.python-list@python.org> |
| In reply to | #90694 |
On 5/15/2015 5:54 PM, BartC wrote: > What /is/ a method lookup? Is it when you have this: > > A.B() This is parsed as (A.B)() > and need to find whether the expression A (or its class or type) has a > name B associated with it? Yes. Dotted names imply an attribute lookup. > (And it then needs to check whether B is something that can be called.) The object resulting from the attribute lookup, A.B (not B exactly), is called in a separate operation (with a separate bytecode). It depends on the object having a .__call__ method. The .__call__ method is *executed* (rather than *called*, which would lead to infinite regress). -- Terry Jan Reedy
[toc] | [prev] | [next] | [standalone]
| From | BartC <bc@freeuk.com> |
|---|---|
| Date | 2015-05-15 10:32 +0100 |
| Message-ID | <Z2j5x.640864$ms.43141@fx30.am4> |
| In reply to | #90649 |
On 15/05/2015 07:05, Gregory Ewing wrote: > BartC wrote: >> It appears to be those "<=" and "+" operations in the code above where >> much of the time is spent. When I trace out the execution paths a bit >> more, I'll have a better picture of how many lines of C code are >> involved in each iteration. > > The path from decoding a bytecode to the C code that > implements it can be rather convoluted, but there are > reasons for each of the complications -- mainly to > do with supporting the ability to override operators > with C and/or Python code. > > If you removed those abilities, the implemention > would be simpler, and possibly faster. But then the > language wouldn't be Python any more. That's the challenge; programs must still work as they did before. (But I suppose it can be exasperating for CPython developers if 99% of programs could be made faster but can't be because of the 1% that rely on certain features). Still, I'm just seeing what there is in CPython that limits the performance, and whether anything can be done /easily/ without going to solutions such as PyPy which are unsatisfactory IMO (by being even more fantastically complicated, but they don't always give a speed-up either). For example, there is a /specific/ byte-code called BINARY_ADD, which then proceeds to call a /generic/ binary-op handler! This throws away the advantage of knowing at byte-code generation time exactly which operation is needed. (Unless I'm just looking at a bunch of macros or maybe there is some inlining going on with the compiler reducing all those table-indexing operations for 'Add', with direct accesses, but it didn't look like it. I'm just glad it doesn't use C++ which would have made it truly impossible to figure out what's going on.) (BTW since I'm having to use Linux to compile this anyway, is there a tool available that will tell me whether something in the C sources is a function or macro? (And preferably where the definition might be located.)) -- Bartc
[toc] | [prev] | [next] | [standalone]
| From | Gregory Ewing <greg.ewing@canterbury.ac.nz> |
|---|---|
| Date | 2015-05-16 12:55 +1200 |
| Message-ID | <crnisnF2efoU1@mid.individual.net> |
| In reply to | #90659 |
BartC wrote: > For example, there is a /specific/ byte-code called BINARY_ADD, which > then proceeds to call a /generic/ binary-op handler! This throws away > the advantage of knowing at byte-code generation time exactly which > operation is needed. While inlining the binary-op handling might give you a slightly shorter code path, it wouldn't necessarily speed anything up. It's possible, for example, that the shared binary-op handler fits in the instruction cache, but the various inlined copies of it don't, leading to a slowdown. The only way to be sure about things like that is to try them and measure. The days when you could predict the speed of a program just by counting the number of instructions executed are long gone. -- Greg
[toc] | [prev] | [next] | [standalone]
| From | Jonas Wielicki <jonas@wielicki.name> |
|---|---|
| Date | 2015-05-17 14:25 +0200 |
| Message-ID | <mailman.90.1431866130.17265.python-list@python.org> |
| In reply to | #90702 |
[Multipart message — attachments visible in raw view] — view raw
On 16.05.2015 02:55, Gregory Ewing wrote: > BartC wrote: >> For example, there is a /specific/ byte-code called BINARY_ADD, which >> then proceeds to call a /generic/ binary-op handler! This throws away >> the advantage of knowing at byte-code generation time exactly which >> operation is needed. > > While inlining the binary-op handling might give you a > slightly shorter code path, it wouldn't necessarily speed > anything up. It's possible, for example, that the shared > binary-op handler fits in the instruction cache, but the > various inlined copies of it don't, leading to a slowdown. > > The only way to be sure about things like that is to try > them and measure. The days when you could predict the speed > of a program just by counting the number of instructions > executed are long gone. That, and also, the days where you could guess the number of instructions executed from looking at the code are also gone. Compilers, and especially C or C++ compilers, are huge beasts with an insane number of different optimizations which yield pretty impressive results. Not to mention that they may know the architecture you’re targeting and can optimize each build for a different architecture; which is not really possible if you do optimizations which e.g. rely on cache characteristics or instruction timings or interactions by hand. I changed my habits to just trust my compiler a few years ago and have more readable code in exchange for that. The compiler does a fairly great job, although gcc still outruns clang for *my* usecases. YMMV. regards, jwi
[toc] | [prev] | [next] | [standalone]
| From | BartC <bc@freeuk.com> |
|---|---|
| Date | 2015-05-17 14:41 +0100 |
| Message-ID | <YT06x.482260$_r1.466432@fx39.am4> |
| In reply to | #90768 |
On 17/05/2015 13:25, Jonas Wielicki wrote:
> On 16.05.2015 02:55, Gregory Ewing wrote:
>> BartC wrote:
>>> For example, there is a /specific/ byte-code called BINARY_ADD, which
>>> then proceeds to call a /generic/ binary-op handler! This throws away
>>> the advantage of knowing at byte-code generation time exactly which
>>> operation is needed.
>>
>> While inlining the binary-op handling might give you a
>> slightly shorter code path, it wouldn't necessarily speed
>> anything up. It's possible, for example, that the shared
>> binary-op handler fits in the instruction cache, but the
>> various inlined copies of it don't, leading to a slowdown.
>>
>> The only way to be sure about things like that is to try
>> them and measure. The days when you could predict the speed
>> of a program just by counting the number of instructions
>> executed are long gone.
>
> That, and also, the days where you could guess the number of
> instructions executed from looking at the code are also gone. Compilers,
> and especially C or C++ compilers, are huge beasts with an insane number
> of different optimizations which yield pretty impressive results. Not to
> mention that they may know the architecture you’re targeting and can
> optimize each build for a different architecture; which is not really
> possible if you do optimizations which e.g. rely on cache
> characteristics or instruction timings or interactions by hand.
>
> I changed my habits to just trust my compiler a few years ago and have
> more readable code in exchange for that. The compiler does a fairly
> great job, although gcc still outruns clang for *my* usecases.
> YMMV.
It does. For my interpreter projects, gcc -O3 does a pretty good job.
For running a suite of standard benchmarks ('spectral', 'fannkuch',
'binary-tree', all that lot) in the bytecode language under test, then
gcc is 30% faster than my own language/compiler. (And 25% faster than
clang.)
(In that project, gcc can do a lot of inlining, which doesn't seem to be
practical in CPython as functions are all over the place.)
However, when I plug in an ASM dispatcher to my version (which tries to
deal with simple bytecodes or some common object types before passing
control to the HLL to deal with), then I can get /twice as fast/ as gcc
-O3. (For real programs the difference is narrower, but usually still
faster than gcc.)
(This approach I don't think will work with CPython, because there don't
appear to be any simple cases for ASM to deal with! The ASM dispatcher
keeps essential globals such as the stack pointer and program counter in
registers, and uses chained 'threaded' code rather than function calls.
A proportion of byte-codes need to be handled in this environment,
otherwise it could actually slow things down, as the switch to/from HLL
code is expensive.)
--
Bartc
[toc] | [prev] | [standalone]
Page 3 of 3 — ← Prev page 1 2 [3]
Back to top | Article view | comp.lang.python
csiph-web