


Re: How to waste computer memory?

From Ian Kelly <ian.g.kelly@gmail.com>
Newsgroups comp.lang.python
Subject Re: How to waste computer memory?
Date Fri, 18 Mar 2016 07:57:57 -0600
Message-ID <mailman.318.1458309520.12893.python-list@python.org>



On Fri, Mar 18, 2016 at 6:37 AM, Chris Angelico <rosuav@gmail.com> wrote:
> On Fri, Mar 18, 2016 at 10:46 PM, Steven D'Aprano <steve@pearwood.info> wrote:
>> Technically, UTF-8 doesn't *necessarily* imply indexing is O(n). For
>> instance, your UTF-8 string might consist of an array of bytes containing
>> the string, plus an array of indexes to the start of each code point. For
>> example, the string:
>>
>> “abcπßЊ•𒀁”
>>
>> (including the quote marks) is 10 code points in length and 22 bytes as
>> UTF-8. Grouping the (hex) bytes for each code point, we have:
>>
>> e2809c 61 62 63 cf80 c39f d08a e280a2 f0928081 e2809d
>>
>> so we could get an O(1) UTF-8 string by recording the bytes (in hex) plus the
>> indexes (in decimal) at which each code point starts:
>>
>> e2809c616263cf80c39fd08ae280a2f0928081e2809d
>>
>> 0 3 4 5 6 8 10 12 15 19
>>
>> but (assuming each index needs 2 bytes, which supports strings up to 65535
>> characters in length), that's actually LESS memory efficient than UTF-32:
>> 42 bytes versus 40.
>
> A lot of strings will have no more than 255 non-ASCII characters in
> them. (For example, all strings with no more than 255 total
> characters.) You could store, instead of the indexes themselves, a
> series of one-byte offsets:
>
> e2809c616263cf80c39fd08ae280a2f0928081e2809d
> 0 2 2 2 2 3 4 5 7 10
>
> Locating a byte based on its character position is still O(1); you
> look up that position in the offset table, add that to your original
> character position, and you have the byte location. For strings with
> too many non-ASCII codepoints, you'd need some other representation,
> but at that point, it might be worth just switching to UTF-32.
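
For anyone who wants to poke at the two tables quoted above, here is a
minimal sketch of how they could be built and used. The names build_tables
and char_at are made up for illustration, and the offsets are kept in a
plain list rather than a packed one-byte array, so treat it as a toy and
not as anybody's actual string implementation.

```python
def build_tables(text):
    """Return (utf8_bytes, byte_indexes, offsets) for text.

    byte_indexes[i] is the byte position where code point i starts.
    offsets[i] is byte_indexes[i] - i, i.e. the number of extra bytes
    contributed by the code points before position i.
    """
    data = bytearray()
    indexes = []
    for ch in text:
        indexes.append(len(data))
        data += ch.encode("utf-8")
    offsets = [b - i for i, b in enumerate(indexes)]
    return bytes(data), indexes, offsets


def char_at(data, offsets, pos):
    """Fetch code point number pos with two table lookups."""
    start = pos + offsets[pos]
    if pos + 1 < len(offsets):
        end = pos + 1 + offsets[pos + 1]
    else:
        end = len(data)
    return data[start:end].decode("utf-8")


# The example string from the quoted posts, written with escapes.
text = "\u201cabc\u03c0\u00df\u040a\u2022\U00012001\u201d"
data, indexes, offsets = build_tables(text)

print(data.hex())
print(indexes)
print(offsets)
print(char_at(data, offsets, 8) == "\U00012001")
```

The value stored per position is the count of extra bytes so far, so a
packed one-byte table would need the fallback mentioned above as soon as
that count passes 255.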

So this uses approximately twice as much memory as the FSR and still
requires switching on some form of character width in the
implementation? Yeah, I don't think the RUE is going to go for that.
8-)
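
A rough way to check the memory arithmetic, counting payload bytes only
(object headers and any cached data are ignored). The width rule below
follows the 1/2/4 bytes-per-character selection described in PEP 393; the
function names are hypothetical and nothing here measures a real
interpreter.

```python
def fsr_payload(text):
    """Payload bytes if every character uses the width of the widest one."""
    widest = max(map(ord, text), default=0)
    if widest < 0x100:
        width = 1
    elif widest < 0x10000:
        width = 2
    else:
        width = 4
    return width * len(text)


def utf8_with_offsets_payload(text):
    """UTF-8 bytes plus one offset byte per character."""
    return len(text.encode("utf-8")) + len(text)


samples = {
    "ascii": "hello",
    "cyrillic": "\u043f\u0440\u0438\u0432\u0435\u0442",
    "quoted example": "\u201cabc\u03c0\u00df\u040a\u2022\U00012001\u201d",
}
for name, s in samples.items():
    print(name, fsr_payload(s), utf8_with_offsets_payload(s))
```

ASCII-only text is where the factor of two shows up (5 bytes versus 10 for
"hello"); strings containing wider characters shift the balance, so the
ratio depends on what is in the string.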


