From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: Pyhon 2.x or 3.x, which is faster?
Date: Fri, 11 Mar 2016 07:07:09 +1100

On Fri, Mar 11, 2016 at 1:22 AM, BartC wrote:
>>
>> 1) Unicode support, intrinsic to the language, is crucial, even if
>> BartC refuses to recognize this. Anything released beyond the confines
>> of his personal workspace will need full Unicode support, otherwise it
>> is a problem to the rest of the world, and should be destroyed with
>> fire. Thanks.
>
>
> I don't agree. If I distribute some text in the form of a series of ASCII
> byte values (eg. classic TXT format, with either kind of line separator),
> then that same data can be directly interpreted as UTF-8.

What you call "classic TXT format" is still an encoding, which means
you're acknowledging the difference between characters and bytes -
that's the first step. But you have to be certain that you are
interpreting it as UTF-8, in which case ASCII ceases to be
significant, and what you've done is declare that your file consists
of a stream of UTF-8-encoded Unicode characters, divided into lines
with either U+000D U+000A or just U+000A. That's a nice clear encoding
definition.

And the difference between characters and bytes is only the first step
(albeit the biggest and most important step). You _need_ to make sure
that you're thinking about text as text, and that means being aware of
RTL vs LTR, combining characters, case conversions, collations, etc,
etc, etc, all in terms of Unicode rather than as eight-bit or
seven-bit characters.

(For example, a naïve MUD client might assume that one byte is one
character is 8 pixels of width. I know this, because some years ago I
wrote one exactly like that (well, the figure "8" came from measuring
the current font, but other than at font changes, it was fixed).
An intelligent Unicode-aware MUD client has to not only cope with
variable width, but also characters that don't have any width at all,
and those that use the same space as their base character, and those
that are placed to the left of the preceding character.)

You can't ignore this, although you might be able to leave full
support for later - but it's a bug until you do.

ChrisA
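
The distinction between bytes, characters and display width described in
the post can be seen in a few lines of Python 3. This is only a rough
sketch and not anything from the original message: `cell_width` is a
hypothetical helper, and it assumes a monospaced terminal where combining
marks take no cell and East Asian wide characters take two. Real text
layout (RTL runs, emoji sequences, font metrics) needs far more than this.

```python
import unicodedata


def cell_width(text):
    """Rough count of terminal cells needed to display text.

    Simplified on purpose (hypothetical helper): combining marks and
    format characters count as zero cells, East Asian wide/fullwidth
    characters as two, and everything else as one.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue  # zero-width: drawn in the cell of its base character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


# Pure ASCII bytes decode to the same characters as ASCII or as UTF-8.
ascii_bytes = b"plain text\r\n"
assert ascii_bytes.decode("ascii") == ascii_bytes.decode("utf-8")

# "naive" with the diaeresis written as U+0308 COMBINING DIAERESIS.
word = "nai\u0308ve"
byte_count = len(word.encode("utf-8"))  # bytes on the wire
char_count = len(word)                  # Unicode code points
cells = cell_width(word)                # columns on screen
print(byte_count, char_count, cells)    # prints: 7 6 5
```

The three numbers differ for one short word, which is the "one byte is one
character is 8 pixels" assumption failing twice over.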