Newsgroups: comp.lang.python
Date: Thu, 20 Dec 2012 11:40:21 -0800 (PST)
Subject: Re: Py 3.3, unicode / upper()
From: wxjmfauth@gmail.com
Message-ID: <mailman.1110.1356037281.29569.python-list@python.org>

On Wednesday, 19 December 2012 22:31:42 UTC+1, Ian wrote:
> On Wed, Dec 19, 2012 at 2:18 PM,  <wxjmfauth@gmail.com> wrote:
> > latin-1 (iso-8859-1) ? are you sure ?
>
> Yes.
>
> > >>> sys.getsizeof('a')
> > 26
> > >>> sys.getsizeof('ab')
> > 27
> > >>> sys.getsizeof('aé')
> > 39
>
> Compare to:
>
> >>> sys.getsizeof('a\u0100')
> 42
>
> The reason for the difference you posted is that pure ASCII strings
> have a further optimization, which I glossed over and which is purely
> a savings in overhead:
>
> >>> sys.getsizeof('abcde') - sys.getsizeof('a')
> 4
> >>> sys.getsizeof('ábçdê') - sys.getsizeof('á')
> 4
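
For anyone who wants to reproduce the measurements quoted above, here is a minimal sketch. It assumes CPython 3.3 or later (the flexible string representation); the helper names bytes_per_char and fixed_overhead are made up for this illustration, and the absolute numbers depend on the interpreter version and platform, so none are shown.

```python
import sys


def bytes_per_char(ch):
    """Estimate the storage cost per character for strings made only of `ch`.

    Two lengths are compared so that the fixed object overhead cancels out.
    (Hypothetical helper, CPython-specific behaviour assumed.)
    """
    return (sys.getsizeof(ch * 6) - sys.getsizeof(ch * 2)) // 4


def fixed_overhead(ch):
    """Estimate the per-object overhead (header plus terminator) for `ch` strings."""
    return sys.getsizeof(ch * 2) - 2 * bytes_per_char(ch)


samples = [
    ("ASCII", "a"),
    ("Latin-1", "\xe9"),          # é
    ("BMP", "\u0100"),            # Ā
    ("astral", "\U0001F600"),     # a code point above U+FFFF
]

for label, ch in samples:
    print(label, "bytes/char:", bytes_per_char(ch), "overhead:", fixed_overhead(ch))
```

On CPython the per-character cost should be the same for ASCII and Latin-1 text, while the fixed overhead is larger for the non-ASCII string, which is the difference under discussion.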

-----

I know all of this, and it is exactly what I explained.
I do not care about this optimization. I'm not an ASCII user.
For a non-ASCII user, this optimization is simply irrelevant.

What should a Python user think if he sees that his strings
are consuming more memory just because he uses non-ASCII
characters, or that his strings are changing just because
he "uppercases" them?
Unicode is here to serve everybody.
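
A small illustration of a string "changing" under upper(). It assumes the case in question is the full Unicode case mapping that Python 3.3 applies (for example U+00DF, the German sharp s); the helper name case_report is made up for this sketch.

```python
def case_report(text):
    """Return the uppercased text plus the lengths before and after.

    Hypothetical helper: it only wraps str.upper() to make length
    changes visible.
    """
    upper = text.upper()
    return upper, len(text), len(upper)


# "stra\xdfe" is "straße"; U+00DF has no single-character uppercase
# form in the default mapping, so the result grows by one character.
print(case_report("stra\xdfe"))

# The mapping is not reversible: lowercasing does not restore the sharp s.
print("stra\xdfe".upper().lower() == "stra\xdfe")
```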

jmf