Path: csiph.com!usenet.pasdenom.info!news.redatomik.org!newsfeed.xs4all.nl!newsfeed3.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail Return-Path: X-Original-To: python-list@python.org Delivered-To: python-list@mail.python.org X-Spam-Status: OK 0.000 X-Spam-Evidence: '*H*': 1.00; '*S*': 0.00; 'python,': 0.02; 'from:addr:yahoo.co.uk': 0.04; 'languages.': 0.04; 'classes,': 0.05; 'root': 0.05; 'string.': 0.05; 'attribute': 0.07; 'compiler': 0.07; 'method.': 0.07; 'suppose': 0.07; 'string': 0.09; '(ie.': 0.09; 'function,': 0.09; 'high-level': 0.09; 'lawrence': 0.09; 'lookup': 0.09; 'method,': 0.09; 'method:': 0.09; 'pointers': 0.09; 'pretend': 0.09; 'received:80.91': 0.09; 'received:80.91.229': 0.09; 'received:gmane.org': 0.09; 'received:list': 0.09; 'def': 0.12; 'language.': 0.14; "'b'": 0.16; '(it': 0.16; '(key,': 0.16; '*almost*': 0.16; '*args):': 0.16; 'blank,': 0.16; 'cached': 0.16; 'consulted': 0.16; 'delegating': 0.16; 'descriptor': 0.16; 'descriptors,': 0.16; 'dict': 0.16; 'dog': 0.16; 'dots': 0.16; 'expensive,': 0.16; 'hierarchy': 0.16; 'instance:': 0.16; 'instantiated': 0.16; 'lookups': 0.16; 'metaclasses': 0.16; 'mro': 0.16; "object's": 0.16; 'overridden': 0.16; 'received:80.91.229.3': 0.16; 'received:plane.gmane.org': 0.16; 'sequence:': 0.16; 'tables,': 0.16; 'temp': 0.16; 'think.': 0.16; 'type)': 0.16; 'index': 0.16; 'java,': 0.16; 'sat,': 0.16; 'do,': 0.16; 'ignore': 0.16; 'language': 0.16; 'wrote:': 0.18; 'looked': 0.18; 'basically': 0.19; "hasn't": 0.19; "python's": 0.19; 'resolved': 0.19; 'skip:f 30': 0.19; 'later': 0.20; 'not,': 0.20; '>>>': 0.22; 'programming': 0.22; 'header:User-Agent:1': 0.23; 'adds': 0.24; "aren't": 0.24; 'example.': 0.24; 'instance,': 0.24; 'string,': 0.24; 'typical': 0.24; 'java': 0.24; "haven't": 0.24; '(or': 0.24; 'sort': 0.25; 'compare': 0.26; 'performing': 0.26; 'possibly': 0.26; 'tables': 0.26; 'this:': 0.26; 'pass': 0.26; 'defined': 0.27; 'header:X-Complaints-To:1': 0.27; 'header:In-Reply-To:1': 0.27; 'function': 0.29; 'feature': 0.29; 'resolution': 0.29; 'am,': 0.29; 'array': 0.29; 'points': 0.29; 'raise': 0.29; 'words': 0.29; 'matching': 0.30; "i'm": 0.30; 'gives': 0.31; 'easier': 0.31; 'usually': 0.31; '>>>>': 0.31; 'calculated': 0.31; 'constant': 0.31; "d'aprano": 0.31; 'fast.': 0.31; 'searches': 0.31; 'steven': 0.31; 'class': 0.32; 'critical': 0.32; 'languages': 0.32; 'up.': 0.33; 'call.': 0.33; 'position.': 0.33; 'skip:_ 10': 0.34; 'skip:d 20': 0.34; "can't": 0.35; 'classes': 0.35; 'created': 0.35; 'knows': 0.35; 'something': 0.35; 'finally': 0.65; 'charset:windows-1252': 0.65; 'close': 0.67; 'details,': 0.68; 'body.': 0.68; 'of:': 0.68; 'sound': 0.68; 'study': 0.69; 'construction': 0.72; 'delegates': 0.74; 'further,': 0.74; 'obvious': 0.74; '2015': 0.84; "class's": 0.84; 'dict,': 0.84; 'fast,': 0.84; "night's": 0.84; 'received:as9105.com': 0.84; 'received:dsl.as9105.com': 0.84; 'received:dynamic.dsl.as9105.com': 0.84; 'inefficient': 0.91; 'write:': 0.91; 'directly.': 0.95; 'picture': 0.97 X-Injected-Via-Gmane: http://gmane.org/ To: python-list@python.org From: Mark Lawrence Subject: Re: Building CPython Date: Sat, 16 May 2015 03:17:27 +0100 References: <7JN4x.37133$Q41.15375@fx25.am4> <6w35x.645690$I97.19867@fx31.am4> <874mnfunpn.fsf@elektro.pacujo.net> <87bnhmgqrx.fsf@elektro.pacujo.net> <87twvdsbom.fsf@elektro.pacujo.net> <5556a3a5$0$12989$c3e8da3$5496439d@news.astraweb.com> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Gmane-NNTP-Posting-Host: 80-44-148-0.dynamic.dsl.as9105.com User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 In-Reply-To: <5556a3a5$0$12989$c3e8da3$5496439d@news.astraweb.com> X-BeenThere: python-list@python.org X-Mailman-Version: 2.1.20+ Precedence: list List-Id: General discussion list for the Python programming language List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Newsgroups: comp.lang.python Message-ID: Lines: 170 NNTP-Posting-Host: 2001:888:2000:d::a6 X-Trace: 1431742804 news.xs4all.nl 2917 [2001:888:2000:d::a6]:35397 X-Complaints-To: abuse@xs4all.nl Xref: csiph.com comp.lang.python:90713 On 16/05/2015 02:55, Steven D'Aprano wrote: > On Sat, 16 May 2015 09:27 am, Mark Lawrence wrote: > >> On 15/05/2015 23:44, Marko Rauhamaa wrote: >>> BartC : >>> >>>> What /is/ a method lookup? Is it when you have this: >>>> >>>> A.B() >>>> >>>> and need to find whether the expression A (or its class or type) has a >>>> name B associated with it? (And it then needs to check whether B is >>>> something that can be called.) >>>> >>>> If so, does that have to be done using Python's Dict mechanism? (Ie. >>>> searching for a key 'B' by name and seeing if the object associated >>>> with it is a method. That does not sound efficient.) > > It's not as inefficient as you may think. Dicts are hash tables, and hash > tables are a standard computer science data structure for performing very > fast searches at constant (or near constant) time. > > The dict is basically an array of pointers to (key, value). To look a name > up in the dict, you hash the string which gives you an index into the > array, then look at that position. If it is blank, you know there is no > match. If it points to a string, you compare that string to your string. If > they are equal, then you have a match. If they aren't equal, you have a > collision, and you have to look elsewhere (details differ) but typically > you don't end up looking in more than one or two positions. So all pretty > fast, and close enough to constant time. > > To speed things up even further, I think that the hash value is cached with > the string, so it only needs to be calculated the first time. > > > >>> That is a general feature among high-level programming languages. In >>> Python, it is even more complicated: >>> >>> * first the object's dict is looked up for the method name >>> >>> * if the method is not found (it usually isn't), the dict of the >>> object's class is consulted >>> >>> * if the method is found (it usually is), a function object is >>> instantiated that delegates to the class's method and embeds a "self" >>> reference to the object to the call > > It's the other way around. The function object already exists: you created > it by writing `def method(self, *args): ... ` inside the class body. def > always makes a function. It's the *method* object which is created on the > fly, delegating to the function. > > > >>> IOW, two dict lookups plus an object construction for each method call. >>> >>> >>> Marko >>> >> >> As a picture paints a thousand words is anybody aware of a site or sites >> that show this diagramatically, as I think I and possibly others would >> find it far easier to grasp. > > No I'm not aware of any such site, but I can try to make it more obvious > with an example. > > Suppose we have a hierarchy of classes, starting from the root of the > hierarchy (object) to a specific instance: > > class Animal(object): > pass > > class Mammal(Animal): > pass > > class Dog(Mammal): > def bark(self): ... > > laddy = Dog() > > > We then look up a method: > > laddy.bark() > > In a non-dynamic language like Java, the compiler knows exactly where bark > is defined ("in the Dog class") and can call it directly. In dynamic > languages like Python, the compiler can't be sure that bark hasn't been > shadowed or overridden at runtime, so it has to search for the first match > found. Simplified: > > * Does laddy.__dict__ contain the key "bark"? If so, we have a match. > > * For each class in the MRO (Method Resolution Order), namely > [Dog, Mammal, Animal, object], does the class __dict__ contain the > key "bark"? If so, we have a match. > > * Do any of those classes in the MRO have a __getattr__ method? If > so, then try calling __getattr__("bark") and see if it returns > a match. > > * Give up and raise AttributeError. > > [Aside: some of the things I haven't covered: __slots__, __getattribute__, > how metaclasses can affect this.] > > In the case of laddy.bark, the matching attribute is found as > Dog.__dict__['bark']: > > temp = Dog.__bark__['bark'] # laddy.bark, is a function > > At this point, the descriptor protocol applies. You can ignore this part if > you like, and just pretend that laddy.bark returns a method instead of a > function, but if you want to know what actually happens in all it's gory > details, it is something like this (again, simplified): > > * Does the attribute have a __get__ method? If not, then we just > use the object as-is, with no changes. > > * But if it does have a __get__ method, then it is a descriptor and > we call the __get__ method to get the object we actually use. > > > Since functions are descriptors, we get this: > > temp = temp.__get__(laddy, Dog) # returns a method object > > and finally we can call the method: > > temp() # laddy.bark() > > > None of these individual operations are particularly expensive, nor are > there a lot of them. For a typical instance, the MRO usually contains only > two or three classes, and __dict__ lookups are fast. Nevertheless, even > though each method lookup is individually *almost* as fast as the sort of > pre-compiled all-but-instantaneous access which Java can do, it all adds > up. So in Java, a long chain of dots: > > foo.bar.baz.foobar.spam.eggs.cheese > > can be resolved at compile-time, and takes no more time than > > foo.cheese > > but in Python's case it has to be resolved at run-time, so if you care about > speed, you should try to avoid long chains of dots in performance critical > loops. E.g. instead of: > > for x in sequence: > foo.bar.baz.foobar.spam.eggs.cheese(x) > > you can write: > > cheese = foo.bar.baz.foobar.spam.eggs.cheese > for x in sequence: > cheese(x) > Thanks for taking the time over this, I'll study it later today after hopefully a good night's sleep :) -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence