Path: csiph.com!usenet.pasdenom.info!aioe.org!news.stack.nl!newsfeed.xs4all.nl!newsfeed6.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail Return-Path: X-Original-To: python-list@python.org Delivered-To: python-list@mail.python.org X-Spam-Status: OK 0.022 X-Spam-Evidence: '*H*': 0.96; '*S*': 0.00; '(at': 0.03; 'missed': 0.09; 'cc:addr:python-list': 0.10; 'thread': 0.11; 'cases': 0.15; '(just': 0.16; 'alphabet': 0.16; 'digits.': 0.16; 'equations': 0.16; 'hex': 0.16; 'infinity': 0.16; 'md5': 0.16; 'statistics.': 0.16; 'storm,': 0.16; 'why,': 0.16; 'string': 0.17; 'wrote:': 0.17; 'comparing': 0.17; 'pointed': 0.17; 'thanks,': 0.18; 'previously': 0.18; '>>>': 0.18; 'proposed': 0.20; 'posted': 0.22; 'cc:2**0': 0.23; 'somebody': 0.23; 'random': 0.24; 'cc:no real name:2**0': 0.24; 'least': 0.25; 'cc:addr:python.org': 0.25; 'header:In-Reply-To:1': 0.25; 'header:User-Agent:1': 0.26; 'looks': 0.26; 'am,': 0.27; 'i.e.': 0.27; 'chris': 0.28; 'comparison': 0.29; 'equivalent.': 0.29; 'gather': 0.29; 'measure': 0.29; 'character': 0.29; "i'm": 0.29; 'usually': 0.30; "aren't": 0.33; 'joined': 0.33; 'point.': 0.33; "can't": 0.34; 'subject:?': 0.35; 'really': 0.36; 'but': 0.36; 'totally': 0.36; "wasn't": 0.36; 'beyond': 0.37; 'subject:: ': 0.38; 'mean': 0.38; 'sure': 0.38; 'received:192': 0.39; 'received:192.168': 0.40; 'think': 0.40; 'your': 0.60; 'lost': 0.60; 'first': 0.61; 'limit': 0.65; 'therefore': 0.65; 'real-world': 0.65; 'study': 0.66; 'sum': 0.66; 'header:Reply-To:1': 0.68; 'internet': 0.71; 'received:74.208': 0.71; 'reply-to:no real name:2**0': 0.72; 'different.': 0.84; 'random,': 0.84; 'received:74.208.4.194': 0.84; 'shrinking': 0.84; 'visually': 0.84; 'angel': 0.93; 'average': 0.93 Date: Thu, 06 Sep 2012 11:54:58 -0400 From: Dave Angel User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Johannes Bauer Subject: Re: Comparing strings from the back? References: <504564ba$0$29978$c3e8da3$5496439d@news.astraweb.com> <504761ef$0$29981$c3e8da3$5496439d@news.astraweb.com> <50477cbb$0$29981$c3e8da3$5496439d@news.astraweb.com> <50485fca$0$29977$c3e8da3$5496439d@news.astraweb.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Provags-ID: V02:K0:nmPXUF/CFZn476cUpmLp/vUz9cPqwPce8AIM467GqYE SE/pmnXSOhsj9KjBVMey+4LuoOK6rkHTvKVDRlfbEKL8LRuA5Q Z71KYWSez6ZtyP5etI4ucSrbywfQwzKh8fmlmpkCIUx68Cfvu3 w3P6HPBTtf/fZuYgtbI/y7EAoTAQNPqbTV2G5pHD9b6T2J6eM6 ZgzSiRH/TWSB92lwzc6FbddtN1XNInpz35uqxEunSorIZk9sQ4 xCdBCCK33fkqaWCvpsYBgaZ3tEikcU8QZCBYUYSh2+w2Kz/97f Sb+0jb00yPhAhOKLt9dwFNkJc3QBp5dXpI7vjukWLcJ8FZaSg= = Cc: python-list@python.org X-BeenThere: python-list@python.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: d@davea.name List-Id: General discussion list for the Python programming language List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Newsgroups: comp.lang.python Message-ID: Lines: 48 NNTP-Posting-Host: 2001:888:2000:d::a6 X-Trace: 1346946924 news.xs4all.nl 6955 [2001:888:2000:d::a6]:44009 X-Complaints-To: abuse@xs4all.nl Xref: csiph.com comp.lang.python:28615 On 09/06/2012 10:42 AM, Johannes Bauer wrote: > On 06.09.2012 16:23, Dave Angel wrote: >> On 09/06/2012 09:43 AM, Johannes Bauer wrote: >>> >>> Yes, worst-case is O(N), best case O(1). Average is O(n log n). >> Can't see how you came up with an average of n log(n). Fourteen minutes >> before you made this post, you demonstrated it was less than 2 for any N. >> >> And I previously posted that for a character set of size 10, the average >> was 1.1111 > Again playing with the equations and thinking about it again. And I > completely missed your point. It wasn't about n log n vs. log n. Both > are wrong. This surprises me. I was somehow thinking about the limit of > sum (1/n), n -> infty. But since the summands are shrinking > exponentially, the limit is different. > > I think the limit of the average comparisons for a given wordlength n > against infinity with alphabet size l is > > l / (l - 1) > > i.e. for bitstrings it's 2 and for bytestrings it's 256/255. > > This would mean string comparison of randomized strings is O(1). Can > that really be true? It looks like it. > (Just lost the internet in a storm, so I'm not sure how long it'll be before this sends.) Thanks, that was exactly my point. Since el is at least 2, the average number of comparisons is no larger than 2, for any value of N. That's why, when I'm visually comparing MD5 values, I usually only look at the first few, and last few hex digits. However, Chris Angelico (at 10:39) pointed out again that totally random strings aren't real-world equivalent. He has now proposed that somebody measure real-world cases here, by patching the string comparison to gather statistics. I think that's beyond me at this point. I only joined this thread when the cases under study were well-defined random, and therefore predictable. -- DaveA