


Re: Py 3.3, unicode / upper()

From Ian Kelly <ian.g.kelly@gmail.com>
Date Wed, 19 Dec 2012 11:27:38 -0700
Subject Re: Py 3.3, unicode / upper()
Newsgroups comp.lang.python
Message-ID <mailman.1068.1355941696.29569.python-list@python.org>



On Wed, Dec 19, 2012 at 8:40 AM, Chris Angelico <rosuav@gmail.com> wrote:
> You may not be familiar with jmf. He's one of our resident trolls, and
> he has a bee in his bonnet about PEP 393 strings, on the basis that
> they take up more space in memory than a narrow build of Python 3.2
> would, for a string with lots of BMP characters and one non-BMP. In
> 3.2 narrow builds, strings were stored in UTF-16, with *surrogate
> pairs* for non-BMP characters. This means that len() counts them
> twice, as does string indexing/slicing. That's a major bug, especially
> as your Python code will do different things on different platforms -
> most Linux builds of 3.2 are "wide" builds, storing characters in four
> bytes each.
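
A rough way to see the surrogate-pair counting described above on a
modern interpreter is to count UTF-16 code units by hand.  This is only
a sketch: utf16_code_units is a helper name made up for illustration,
and it approximates what len() would have reported on a 3.2 narrow
build rather than reproducing that build.

```python
def utf16_code_units(s):
    # Hypothetical helper: number of 16-bit code units needed to store s
    # in UTF-16, which is roughly what a narrow build counted.
    return len(s.encode("utf-16-le")) // 2

s = "a\U0001F600b"   # two BMP characters around one non-BMP character

print(len(s))               # 3 on Python 3.3+ (one per code point)
print(utf16_code_units(s))  # 4 (the non-BMP character is a surrogate pair)
```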

From what I've been able to discern, his actual complaint about PEP
393 stems from misguided moral concerns.  With PEP 393, strings that
can be fully represented in Latin-1 can be stored in half the space
(ignoring fixed overhead) compared to strings containing at least one
non-Latin-1 character.  jmf thinks this optimization is unfair to
non-English users and immoral; he wants Latin-1 strings to be treated
exactly like non-Latin-1 strings (I don't think he actually cares
about non-BMP strings at all; if narrow-build Unicode is good enough
for him, then it must be good enough for everybody).  Unfortunately
for him, the Latin-1 optimization is rather trivial in the wider
context of PEP 393, and simply removing that part alone clearly
wouldn't be doing anybody any favors.  So for him to get what he
wants, the entire PEP has to go.
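
For anyone who wants to see the storage widths for themselves, here is
a minimal sketch, assuming CPython 3.3 or later.  sys.getsizeof is
implementation-specific, and bytes_per_char is a helper name invented
for this example.  It subtracts the sizes of two strings of different
lengths so that the fixed overhead cancels out.

```python
import sys

def bytes_per_char(ch, n=1000):
    # Hypothetical helper: the difference between a 2n-character and an
    # n-character string removes the fixed per-object overhead.
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

print(bytes_per_char("a"))            # ASCII: 1 on CPython 3.3+
print(bytes_per_char("\xe9"))         # Latin-1: 1
print(bytes_per_char("\u20ac"))       # BMP, non-Latin-1: 2
print(bytes_per_char("\U0001F600"))   # non-BMP: 4

# A single wide character widens the storage of the whole string.
mostly_ascii = "a" * 999 + "\U0001F600"
print(sys.getsizeof(mostly_ascii) > sys.getsizeof("a" * 1000))
```

The last line is the case from the quoted text: one non-BMP character
in an otherwise narrow string makes every character in it take four
bytes.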

It's rather like trying to solve the problem of wealth disparity by
forcing everyone to dump their excess wealth into the ocean.



