From: Dave Angel
Newsgroups: comp.lang.python
Subject: Re: Converting a string to a number by using INT (no hash method)
Date: Tue, 22 Jan 2013 14:08:14 -0500

On 01/22/2013 01:37 PM, Ferrous Cranus wrote:
> ==============================================
> pin = int( htmlpage.encode("hex"), 16 ) % 10000
> ==============================================
>
> Can you please explain the differences to what you have posted opposed to this perl coding?
>
> ==============================================
> foreach my $ltr(@ltrs){
> $hash = ( $hash + ord($ltr)) %10000;
> ==============================================
>
> I want to understand this and see it implemented in Python.
>

The Perl code will produce the same hash for "abc.html" as for "bca.html". That's probably one reason Leonard didn't try to transliterate the buggy code.

In any case, the likelihood of a hash collision for any non-trivial website is substantial. As I said elsewhere, if you hash 100 files you have about a 40% chance of a collision. If you hash 220 files, the likelihood is about 90%.

-- 
DaveA
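
To make both points above concrete, here is a minimal Python 3 sketch. It is not code from the thread: the function names are made up for illustration, `int.from_bytes` stands in for the quoted Python 2 `encode("hex")` expression, and the collision estimate assumes hash values spread uniformly and independently over the 10000 possible results, which real filenames may not do.

```python
def additive_hash(name, modulus=10000):
    """Sum of character codes modulo `modulus`, mirroring the quoted Perl loop.

    Addition ignores order, so any rearrangement of the same characters
    gives the same result.
    """
    total = 0
    for ch in name:
        total = (total + ord(ch)) % modulus
    return total


def positional_hash(name, modulus=10000):
    """Python 3 stand-in (an assumption) for int(name.encode("hex"), 16) % modulus.

    The bytes are read as one big base-256 number, so position matters.
    """
    return int.from_bytes(name.encode("utf-8"), "big") % modulus


def collision_probability(n_items, n_buckets=10000):
    """Chance that at least two of n_items land in the same bucket.

    Assumes every item picks a bucket uniformly and independently
    (the classic birthday-problem calculation).
    """
    if n_items > n_buckets:
        return 1.0
    p_all_distinct = 1.0
    for i in range(n_items):
        p_all_distinct *= (n_buckets - i) / n_buckets
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    # Same characters in a different order: the additive hash cannot tell them apart.
    print(additive_hash("abc.html"), additive_hash("bca.html"))
    print(positional_hash("abc.html"), positional_hash("bca.html"))

    # Roughly 0.4 for 100 files and roughly 0.9 for 220 files.
    for n in (100, 220):
        print(n, round(collision_probability(n), 2))
```

Note that the positional version only removes the reordering problem. With just 10000 possible values, the collision odds for a few hundred files stay the same whichever of the two functions is used.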