From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: Comparing strings from the back?
Date: Tue, 11 Sep 2012 00:26:29 +1000

On Tue, Sep 11, 2012 at 12:06 AM, Oscar Benjamin wrote:
> On 2012-09-10, Steven D'Aprano wrote:
>> What interning buys you is that "s == t" is an O(1) pointer compare if
>> they are equal. But if s and t differ in the last character, __eq__ will
>> still inspect every character. There is no way to tell Python "all
>> strings are interned, if s is not t then s != t as well".
>>
>
> I thought that if *both* strings were interned then a pointer comparison could
> decide if they were unequal without needing to check the characters.
>
> Have I misunderstood how intern() works?

In a language where _all_ strings are guaranteed to be interned (such as
Lua, I think), you do indeed gain this. Pointer inequality implies
string inequality.
But when interning is optional (as in Python), you cannot depend on
that, unless there's some way of recognizing interned strings. Of
course, that may indeed be the case; a simple bit flag "this string has
been interned" would suffice, and if both strings are interned AND their
pointers differ, THEN you can be sure the strings differ.

I have no idea whether or not CPython version X.Y.Z does this. The
value of such an optimization really depends on how likely strings are
to be interned; for instance, if the compiler automatically interns all
the names of builtins, this could be quite beneficial. Otherwise,
probably not; most Python scripts don't bother interning anything.

ChrisA
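
PS. For what it's worth, here is a rough sketch of the bit-flag check described above, written as a toy model in pure Python. It is not a claim about what CPython does; `Sym`, `_pool`, `intern_sym` and `sym_equal` are all made-up names for illustration, and the `interned` attribute stands in for the hypothetical flag.

```python
# Toy model only: none of these names exist in CPython.
class Sym:
    """A string wrapper carrying a hypothetical 'has been interned' flag."""
    __slots__ = ("text", "interned")

    def __init__(self, text, interned=False):
        self.text = text
        self.interned = interned


_pool = {}  # maps text -> the one shared, interned Sym for that text


def intern_sym(text):
    """Return the single shared Sym for `text`, creating it on first use."""
    sym = _pool.get(text)
    if sym is None:
        sym = _pool[text] = Sym(text, interned=True)
    return sym


def sym_equal(a, b):
    """Equality with both pointer shortcuts, falling back to characters."""
    if a is b:
        return True            # same object: equal, no characters examined
    if a.interned and b.interned:
        return False           # two distinct interned objects: unequal
    return a.text == b.text    # at least one is not interned: compare text


a = intern_sym("spam")
b = intern_sym("spam")
c = intern_sym("spam!")
d = Sym("spam")                # same text as a, but never interned

print(a is b)                  # both names refer to the pooled object
print(sym_equal(a, c))         # decided by the flag check alone
print(sym_equal(a, d))         # needs the character comparison
```

The second shortcut is only safe because every Sym with the flag set came out of the one pool; with an uninterned object on either side, the pointers say nothing and the characters have to be compared.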