From: Chris Angelico
Newsgroups: comp.lang.python
To: python-list@python.org
Subject: Re: performance of script to write very long lines of random chars
Date: Thu, 11 Apr 2013 15:53:30 +1000
In-Reply-To: <51664b43$0$29977$c3e8da3$5496439d@news.astraweb.com>

On Thu, Apr 11, 2013 at 3:33 PM, Steven D'Aprano wrote:
> I was originally going to write that using the base64 module would
> introduce bias into the random strings, but after a little investigation,
> I don't think it does.

Assuming that os.urandom() returns bytes with perfectly fair
distribution (exactly equal chance of any value 00-FF - it probably
does, or close to it), and assuming that you work with exact multiples
of 3 bytes and 4 output characters, base64 will give you perfectly
fair distribution of result characters. You take three bytes (24 bits)
and turn them into four characters (6 bits per character, = 24 bits).
You might see some bias if you use less than a full set of four output
characters, though; I haven't dug into the details to check that.

ChrisA
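
A rough way to check the 3-bytes-to-4-characters argument above is to count characters. This is only a minimal sketch and not code from the thread: the helper name char_counts and the sample sizes are assumptions picked for illustration. It encodes os.urandom() output in an exact multiple of 3 bytes and tallies the result, then looks at the padded single-byte case, where the second character carries only 2 random bits.

```python
import base64
import os
from collections import Counter


def char_counts(n_bytes):
    """Tally base64 output characters for n_bytes of os.urandom() data.

    Hypothetical helper for illustration; '=' padding is not counted.
    """
    encoded = base64.b64encode(os.urandom(n_bytes)).decode("ascii")
    return Counter(encoded.rstrip("="))


# Full groups: every 3 bytes (24 bits) become 4 characters (6 bits each),
# so no padding appears and each of the 64 symbols should be about as
# common as any other.
full = char_counts(3 * 100000)
print(len(full), min(full.values()), max(full.values()))

# Partial group: 1 byte becomes 2 characters plus "==". The second
# character holds 2 random bits followed by 4 zero bits, so only 4 of
# the 64 symbols can ever show up in that position.
second = Counter(
    base64.b64encode(os.urandom(1)).decode("ascii")[1] for _ in range(2000)
)
print(sorted(second))
```

The min and max counts in the first line of output should sit close together, which is consistent with a fair distribution over full groups. The second line shows the kind of bias a padded partial group introduces in standard base64.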