


Re: How to waste computer memory?

From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Subject Re: How to waste computer memory?
Date Sun, 20 Mar 2016 22:22:45 +1100
Message-ID <mailman.404.1458472974.12893.python-list@python.org> (permalink)



On Sun, Mar 20, 2016 at 10:06 PM, Steven D'Aprano <steve@pearwood.info> wrote:
> The Unicode standard does not, as far as I am aware, care how you represent
> code points in memory, only that there are 0x110000 of them, numbered from
> U+0000 to U+10FFFF. That's what I mean by abstract. The obvious
> implementation is to use 32-bit integers, where 0x00000000 represents code
> point U+0000, 0x00000001 represents U+0001, and so forth. This is
> essentially equivalent to UTF-16, but it's not mandated or specified by the
> Unicode standard, you could, if you choose, use something else.

(That's UTF-32, not UTF-16.)

Code points aren't defined in terms of *memory* at all; they are, by
definition, just integers. If you choose to
represent those integers as little-endian 32-bit values, then yes, the
layout in memory will look like UTF-32LE, but that's because UTF-32LE
is defined in this extremely simple way. In fact, that's exactly how
the layers work - Unicode defines a mapping of characters to code
points, and then UTF-x defines a mapping of code points to bytes.
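
To make the two layers concrete, here is a minimal Python sketch. The sample string and the variable names are illustrative assumptions, not anything from the thread. It shows that packing each code point as a little-endian 32-bit integer gives the same bytes as Python's own "utf-32-le" codec, while UTF-8 maps the same integers to different bytes.

```python
import struct

# Sample text (an arbitrary choice): "a", U+00E9, U+20AC.
text = "a\u00e9\u20ac"

# Layer 1: Unicode maps characters to code points (plain integers).
code_points = [ord(ch) for ch in text]

# Layer 2, done by hand: write each integer as a little-endian
# 32-bit value. This is how UTF-32LE is defined.
manual_utf32le = b"".join(struct.pack("<I", cp) for cp in code_points)

# Layer 2, a different mapping: UTF-8 turns the same integers
# into a different byte sequence.
utf8_bytes = text.encode("utf-8")

print([f"U+{cp:04X}" for cp in code_points])
print(manual_utf32le == text.encode("utf-32-le"))
print(utf8_bytes.hex())
```

The code points stay the same whichever encoding you pick; only the second mapping changes.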

> On the other hand, I believe that the output of the UTF transformations is
> explicitly described in terms of 8-bit bytes and 16- or 32-bit words. For
> instance, the UTF-8 encoding of "A" has to be a single byte with value 0x41
> (decimal 65). It isn't that this is the most obvious implementation, it's
> that it can't be anything else and still be UTF-8.

Exactly. Aside from the way UTF-16 and UTF-32 have LE and BE variants,
there is only one bit pattern for any given sequence of code points in
a given UTF-x (so if you work with e.g. "UTF-16LE", there's only one).
This is no accident. Unlike some encodings, in which there's one most
obvious way to encode things but then a number of other legal ways,
UTF-x can be compared for equality [1] using simple byte-for-byte
comparisons. This means you don't have to worry about someone sneaking
a magic character past your filter; if you're checking a UTF-8 stream
for the character U+003C LESS-THAN SIGN, the only byte value to look
for is 0x3C - the overlong sequence 0xC0 0xBC, despite mathematically
representing the number 0x3C, is explicitly forbidden.
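
A quick way to see this in Python 3 is the sketch below. The helper name contains_less_than is hypothetical, and the decoder is assumed to be running in its default strict mode. The payload bits of 0xC0 0xBC do work out to 0x3C, but the decoder refuses the sequence, so a plain byte search for 0x3C is enough.

```python
def contains_less_than(data: bytes) -> bool:
    """Hypothetical filter: look for U+003C in a UTF-8 byte stream."""
    return b"\x3c" in data

# The one valid UTF-8 encoding of U+003C is the single byte 0x3C.
valid = "<".encode("utf-8")

# Overlong two-byte form: 110xxxxx 10xxxxxx with payload 0x3C.
overlong = b"\xc0\xbc"
payload = ((overlong[0] & 0x1F) << 6) | (overlong[1] & 0x3F)

# A strict decoder has to reject the overlong form.
try:
    overlong.decode("utf-8")
    overlong_accepted = True
except UnicodeDecodeError:
    overlong_accepted = False

print(valid, hex(payload), overlong_accepted)
```

Continuation bytes in UTF-8 always fall in the range 0x80 to 0xBF, so the byte 0x3C can never turn up in the middle of some other character's encoding either.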

ChrisA

[1] Though not ordering - lexical sorting doesn't follow code point
order, and code point order won't always match byte order (it does in
UTF-8, but not in UTF-16). But equality is easy.
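
As a small illustration of the footnote, here is a Python sketch in which the two code points are an arbitrary choice for the example. It compares U+E000 with U+10000: code point order and UTF-8 byte order agree, but UTF-16BE byte order does not, because U+10000 is encoded as a surrogate pair starting with 0xD8. Equality still agrees in every encoding.

```python
# U+E000 comes before U+10000 in code point order.
a, b = "\ue000", "\U00010000"

# Python 3 compares str objects code point by code point.
code_point_order = a < b

# Byte-wise comparison of the encoded forms.
utf8_order = a.encode("utf-8") < b.encode("utf-8")
utf16be_order = a.encode("utf-16-be") < b.encode("utf-16-be")

# Equality gives the same answer in every encoding.
equality_agrees = all(
    (a == b) == (a.encode(enc) == b.encode(enc))
    for enc in ("utf-8", "utf-16-be", "utf-32-le")
)

print(code_point_order, utf8_order, utf16be_order, equality_agrees)
```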
