From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: How to waste computer memory?
Date: Sun, 20 Mar 2016 23:27:58 +1100

On Sun, Mar 20, 2016 at 11:14 PM, Steven D'Aprano wrote:
>>> On the other hand, I believe that the output of the UTF transformations
>>> is explicitly described in terms of 8-bit bytes and 16- or 32-bit words.
>>> For instance, the UTF-8 encoding of "A" has to be a single byte with
>>> value 0x41 (decimal 65). It isn't that this is the most obvious
>>> implementation, it's that it can't be anything else and still be UTF-8.
>>
>> Exactly. Aside from the way UTF-16 and UTF-32 have LE and BE variants,
>
> Blame the chip manufacturers for that. Actually, I think we can blame Intel
> specifically for that, for reversing the normal layout of words in memory.

No, I disagree; it's inherent in the notion of representing a 16-bit
or 32-bit value across bytes. Maybe there could have been one
most-common standard, but there'd still have been another way of doing
it. Little-endianness and big-endianness are important enough to have
to deal with.

ChrisA
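
A quick illustration of the byte-order point above, as a rough sketch in Python 3. The variable names are made up for this example; the codec names and the struct format codes come from the standard library. It shows that UTF-8 leaves no choice for "A", while any 16-bit or 32-bit unit can be laid out across bytes in two ways.

```python
import struct

# UTF-8 encodes "A" as the single byte 0x41; there is no byte order to pick.
utf8_a = "A".encode("utf-8")

# UTF-16 and UTF-32 use multi-byte code units, so the byte order matters.
utf16_le = "A".encode("utf-16-le")
utf16_be = "A".encode("utf-16-be")
utf32_le = "A".encode("utf-32-le")
utf32_be = "A".encode("utf-32-be")

# The same two layouts appear for any 16-bit integer split across bytes,
# independent of Unicode: "<H" is little-endian, ">H" is big-endian.
value = 0x0041
packed_le = struct.pack("<H", value)
packed_be = struct.pack(">H", value)

for label, data in [
    ("utf-8", utf8_a),
    ("utf-16-le", utf16_le),
    ("utf-16-be", utf16_be),
    ("utf-32-le", utf32_le),
    ("utf-32-be", utf32_be),
    ("struct <H", packed_le),
    ("struct >H", packed_be),
]:
    print(label, data.hex())
```

The struct lines are there only to show that the two layouts come from splitting a multi-byte value into bytes, not from anything specific to the UTF encodings.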