From: Terry Reedy
Subject: Re: Comparing strings from the back?
To: python-list@python.org
Date: Tue, 11 Sep 2012 11:55:55 -0400
Newsgroups: comp.lang.python

On 9/11/2012 6:40 AM, Oscar Benjamin wrote:
> On 11 September 2012 10:51, Duncan Booth wrote:
>
> > Oscar Benjamin wrote:
> >
> > >> What interning buys you is that "s == t" is an O(1) pointer compare
> > >> if they are equal. But if s and t differ in the last character,
> > >> __eq__ will still inspect every character. There is no way to tell
> > >> Python "all strings are interned, if s is not t then s != t as well".
> > >>
> >
> > I thought that if *both* strings were interned then a pointer
> > comparison could decide if they were unequal without needing to check
> > the characters.
> >
> > Have I misunderstood how intern() works?
>
> I don't think you've misunderstood how it works, but so far as I can see the
> code doesn't attempt to short circuit the "not equal but interned" case.
> The comparison code doesn't look at interning at all, it only looks for
> identity as a shortcut.
>
> It also doesn't seem to check if the hash values have been set. I guess
> the cached hash value is only used in contexts where the hash is
> explicitly desired.

I believe the internal use of interning and hash comparison has varied
from release to release. However, the main use of string comparison is
for dict keys, especially the internal dicts for namespaces and
attributes. Since the dict lookup code needs hash values anyway, to find
slots with possible conflicts, I am sure it does not use the generic
comparison operators but starts with hash comparisons.

Terry Jan Reedy
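
The dict-lookup point can be checked from Python without reading the C source. What follows is only a rough sketch, not the actual lookup code: CountingKey and count_eq_calls are made-up names for illustration, sys.intern is the Python 3 spelling of the intern() builtin discussed in the thread, and the call counts reflect how current CPython behaves, which the language does not promise.

```python
import sys


class CountingKey:
    """Hypothetical key type that records how often __eq__ is called."""

    eq_calls = 0

    def __init__(self, name, hash_value):
        self.name = name
        self._hash = hash_value

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        CountingKey.eq_calls += 1
        return isinstance(other, CountingKey) and self.name == other.name


def count_eq_calls(table, probe):
    """Return (found, number of __eq__ calls made during the lookup)."""
    CountingKey.eq_calls = 0
    found = probe in table
    return found, CountingKey.eq_calls


stored = CountingKey("spam", 1)
table = {stored: "value"}

# Same object: identity is enough, __eq__ is not called.
same_object = count_eq_calls(table, stored)
# Different hash: rejected on the hash alone, __eq__ is not called.
different_hash = count_eq_calls(table, CountingKey("spam", 2))
# Same hash, distinct object: only now does the lookup fall back to __eq__.
same_hash_equal = count_eq_calls(table, CountingKey("spam", 1))
same_hash_unequal = count_eq_calls(table, CountingKey("eggs", 1))

# Interning two equal strings built at run time yields one shared object,
# so an identity check succeeds without looking at the characters.
s = sys.intern("".join(["long", "-key"]))
t = sys.intern("".join(["long", "-key"]))
interned_identical = s is t
```

If the counts come out as the comments suggest, the lookup only reaches the generic comparison when the hashes already match and the objects are not identical, which is the order of checks described above.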