| Path | csiph.com!usenet.pasdenom.info!aioe.org!news.stack.nl!newsfeed.xs4all.nl!newsfeed4.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <wxjmfauth@gmail.com> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| Newsgroups | comp.lang.python |
| Date | Wed, 19 Dec 2012 13:18:05 -0800 (PST) |
| In-Reply-To | <mailman.1068.1355941696.29569.python-list@python.org> |
| Complaints-To | groups-abuse@google.com |
| Injection-Info | glegroupsg2000goo.googlegroups.com; posting-host=178.198.163.217; posting-account=ung4FAoAAAC46zhHJ0Nsnuox7M5gDvs_ |
| References | <2adb4a25-8ea3-441f-b8c0-ee6c87e4b19f@googlegroups.com> <kaslsb$iue$1@news.albasani.net> <CAPTjJmrLAe0i9rW6sCYkYBvpiPk2O=FHB0PgSq1dqNqh9Y7Zqg@mail.gmail.com> <mailman.1068.1355941696.29569.python-list@python.org> |
| User-Agent | G2/1.0 |
| X-Google-Web-Client | true |
| X-Google-IP | 178.198.163.217 |
| MIME-Version | 1.0 |
| Subject | Re: Py 3.3, unicode / upper() |
| From | wxjmfauth@gmail.com |
| To | comp.lang.python@googlegroups.com |
| Content-Type | text/plain; charset=ISO-8859-1 |
| Content-Transfer-Encoding | quoted-printable |
| Cc | Python <python-list@python.org> |
| Message-ID | <mailman.1073.1355951888.29569.python-list@python.org> |
On Wednesday, 19 December 2012 at 19:27:38 UTC+1, Ian wrote:
> On Wed, Dec 19, 2012 at 8:40 AM, Chris Angelico <rosuav@gmail.com> wrote:
> > You may not be familiar with jmf. He's one of our resident trolls, and
> > he has a bee in his bonnet about PEP 393 strings, on the basis that
> > they take up more space in memory than a narrow build of Python 3.2
> > would, for a string with lots of BMP characters and one non-BMP. In
> > 3.2 narrow builds, strings were stored in UTF-16, with *surrogate
> > pairs* for non-BMP characters. This means that len() counts them
> > twice, as does string indexing/slicing. That's a major bug, especially
> > as your Python code will do different things on different platforms -
> > most Linux builds of 3.2 are "wide" builds, storing characters in four
> > bytes each.
>
> From what I've been able to discern, his actual complaint about PEP
> 393 stems from misguided moral concerns. With PEP-393, strings that
> can be fully represented in Latin-1 can be stored in half the space
> (ignoring fixed overhead) compared to strings containing at least one
> non-Latin-1 character. jmf thinks this optimization is unfair to
> non-English users and immoral; he wants Latin-1 strings to be treated
> exactly like non-Latin-1 strings (I don't think he actually cares
> about non-BMP strings at all; if narrow-build Unicode is good enough
> for him, then it must be good enough for everybody). Unfortunately
> for him, the Latin-1 optimization is rather trivial in the wider
> context of PEP-393, and simply removing that part alone clearly
> wouldn't be doing anybody any favors. So for him to get what he
> wants, the entire PEP has to go.
>
> It's rather like trying to solve the problem of wealth disparity by
> forcing everyone to dump their excess wealth into the ocean.
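The narrow-build behaviour described in the quoted text can be illustrated on a current Python 3 by counting UTF-16 code units by hand. This is only a sketch: `utf16_code_units` is a hypothetical helper written for this example, not something from the thread or from Python itself, and it emulates what `len()` would have reported on a 3.2 narrow build rather than running one.

```python
# Illustration only: emulate how a narrow (UTF-16) build would count a string.
def utf16_code_units(text):
    """Number of 16-bit code units needed to encode text as UTF-16."""
    # "utf-16-le" emits no byte-order mark, so every 2 bytes is one code unit.
    return len(text.encode("utf-16-le")) // 2

bmp_only = "abc\u00e9"          # every character is inside the BMP
with_astral = "abc\U0001F600"   # last character is outside the BMP

# Python 3.3+ counts code points, so len() is 4 for both strings.
print(len(bmp_only), utf16_code_units(bmp_only))        # 4 4
# The non-BMP character needs a surrogate pair, i.e. two code units.
print(len(with_astral), utf16_code_units(with_astral))  # 4 5
```

The mismatch on the second line (4 code points, 5 code units) is the double counting the quoted text refers to.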
----
Latin-1 (ISO-8859-1)? Are you sure?
>>> import sys
>>> sys.getsizeof('a')
26
>>> sys.getsizeof('ab')
27
>>> sys.getsizeof('aé')
39
Time to go to bed. More complete answer tomorrow.
jmf
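The sizes in the session above mix two things: the fixed per-object overhead and the per-character storage. A minimal sketch for separating them, assuming CPython 3.3 or later (where `sys.getsizeof` reflects the PEP 393 layout), is to compare two strings of different length built from the same character. `bytes_per_char` is a hypothetical helper introduced only for this example.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate per-character storage by differencing two string lengths.

    The fixed object overhead is the same for both strings, so it cancels.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n

samples = [
    ("ASCII", "a"),
    ("Latin-1", "\u00e9"),      # é
    ("BMP", "\u20ac"),          # euro sign
    ("non-BMP", "\U0001F600"),
]

for label, ch in samples:
    print(label, bytes_per_char(ch))
```

On CPython 3.3+ this prints 1.0 for the ASCII and Latin-1 samples, 2.0 for the BMP sample and 4.0 for the non-BMP sample. The jump from 27 to 39 for 'aé' in the session is therefore consistent with a larger fixed header for non-ASCII strings, not with a higher per-character cost; the absolute figures depend on the platform and build.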