From: Chris Angelico
To: python-list@python.org
Newsgroups: comp.lang.python
Subject: Re: Uniquely identifying each & every html template
Date: Thu, 24 Jan 2013 11:39:16 +1100

On Thu, Jan 24, 2013 at 11:09 AM, Dave Angel wrote:
> I certainly can't disagree that it's easy to produce a very long hash that
> isn't at all secure.  But I would disagree that longer hashes
> *automatically* reduce chances of collision.

Sure. But by and large, longer hashes give you a better chance at
avoiding collisions.

Caveat: I am not a cryptography expert. My statements are based on my
own flawed understanding of what's going on. I use the stuff but I
don't invent it.

> Wikipedia - http://en.wikipedia.org/wiki/Cryptographic_hash_function
>
> seems to say that there are four requirements.
>     it is easy to compute the hash value for any given message
>     it is infeasible to generate a message that has a given hash
>     it is infeasible to modify a message without changing the hash
>     it is infeasible to find two different messages with the same hash
>
> Seems to me a small hash wouldn't be able to meet the last 3 conditions.

True, but the definition of "small" is tricky. Of course the one-byte
hash I proposed isn't going to be difficult to break, since you can
just brute-force a bunch of message changes until you find one that
has the right hash. But it's more about the cascade (or avalanche)
effect - that any given message has equal probability of having any of
the possible hashes. Make a random change, get another random hash. So
for a perfect one-byte hash, you have exactly one chance in 256 of
getting any particular hash.

By comparison, a simple/naive hash that just XORs together all the
byte values fails these checks. Even if you take the message 64 bytes
at a time (thus producing a 512-bit hash), you'll still be insecure,
because it's easy to predict what hash you'll get after making a
particular change.

This property of the hash doesn't change as worldwide computing power
improves. A hashing function might go from being "military-grade
security" to being "home-grade security" to being "two-foot fence
around your property", while still being impossible to predict without
brute-forcing.
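
The contrast between a naive XOR hash and a well-mixed one-byte hash
can be sketched in a few lines. This is only an illustration under
stated assumptions: the first byte of a SHA-256 digest stands in for
the "perfect" one-byte hash, and the helper names (xor_hash,
truncated_sha256, count_predictable) are made up for this example.

```python
import hashlib
from functools import reduce
from operator import xor


def xor_hash(data: bytes) -> int:
    """Naive one-byte hash: XOR together all the byte values."""
    return reduce(xor, data, 0)


def truncated_sha256(data: bytes) -> int:
    """Stand-in for a well-mixed one-byte hash (an assumption for this
    sketch): the first byte of the SHA-256 digest."""
    return hashlib.sha256(data).digest()[0]


def count_predictable(hash_func, message: bytes, position: int) -> int:
    """Replace message[position] with each of the 256 byte values and
    count how often the new hash equals old_hash ^ old_byte ^ new_byte,
    i.e. how often the XOR shortcut predicts the hash without
    recomputing it."""
    old_hash = hash_func(message)
    old_byte = message[position]
    hits = 0
    for new_byte in range(256):
        changed = message[:position] + bytes([new_byte]) + message[position + 1:]
        if hash_func(changed) == old_hash ^ old_byte ^ new_byte:
            hits += 1
    return hits


message = b"Uniquely identifying each & every html template"
xor_hits = count_predictable(xor_hash, message, 3)
sha_hits = count_predictable(truncated_sha256, message, 3)
print("XOR hash predicted correctly:", xor_hits, "of 256")
print("truncated SHA-256 predicted correctly:", sha_hits, "of 256")
```

The XOR hash is predicted correctly for all 256 replacement bytes,
which is exactly the weakness described above. For the truncated
SHA-256 the shortcut should match only occasionally (the case where
the byte is left unchanged always matches), so an attacker is back to
brute force.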

But when an algorithm is found that generates collisions faster than
the hash size indicates, it effectively reduces the hash size to the
collision rate - MD5 is 128-bit, but (if I understand the Wikipedia
note correctly) a known attack cuts that to 20.96 bits of "real hash
size". So MD5 is still better than a perfect 16-bit hash, but not as
good as a perfect 32-bit hash. (And on today's hardware, that's not
good enough.)

http://en.wikipedia.org/wiki/Collision_resistant

ChrisA
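
For a sense of scale, the comparison in the last paragraph can be
turned into numbers. This sketch reads an "effective size" of b bits
as roughly 2**b hash evaluations; that reading is an assumption made
here for illustration, and the 20.96 figure is simply the one quoted
in the post.

```python
# Work implied by each "effective size", read as 2**bits hash evaluations.
# That reading is an assumption of this sketch, not a claim from the post.
sizes = [
    ("perfect 16-bit hash", 16),
    ("MD5 under the quoted attack", 20.96),  # figure quoted in the post
    ("perfect 32-bit hash", 32),
]

work = {label: 2 ** bits for label, bits in sizes}
for label, evaluations in work.items():
    print(f"{label:<28} ~{evaluations:,.0f} evaluations")
```

On this reading the quoted figure comes to roughly two million
evaluations, between 65,536 and about 4.3 billion. A generic birthday
search on an ideal n-bit hash is usually estimated at about 2**(n/2)
evaluations, so measuring collision cost that way would shift the
comparison.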