Path: csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!feeder.erje.net!eu.feeder.erje.net!xlned.com!feeder7.xlned.com!newsfeed.xs4all.nl!newsfeed3.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
MIME-Version: 1.0
In-Reply-To: <87ior2zosv.fsf@elektro.pacujo.net>
References: <9daf0806-02de-4447-964c-c8f8953c23e5@googlegroups.com> <mailman.8377.1395456097.18130.python-list@python.org> <dcd72881-7ee2-4bc3-90ab-7912b4bea05b@googlegroups.com> <lgj4sp$7ed$1@speranza.aioe.org> <532d5bd9$0$29994$c3e8da3$5496439d@news.astraweb.com> <lgoh7j$otg$1@speranza.aioe.org> <mailman.8440.1395651801.18130.python-list@python.org> <lgq8th$nj4$1@speranza.aioe.org> <mailman.8472.1395705541.18130.python-list@python.org> <0b78649a-16b3-4410-8258-e859578d62be@googlegroups.com> <mailman.8483.1395717465.18130.python-list@python.org> <roy-66D138.23291924032014@news.panix.com> <mailman.8485.1395719463.18130.python-list@python.org> <lgquvt$b7t$1@speranza.aioe.org> <281c8ce1-4f03-4e93-b5cd-d45b85e89e7e@googlegroups.com> <mailman.8489.1395721791.18130.python-list@python.org> <feab8d4b-2f00-4f5f-ba57-2815439faecc@googlegroups.com> <mailman.8490.1395724126.18130.python-list@python.org> <lgr3c1$k9l$1@speranza.aioe.org> <mailman.8491.1395725262.18130.python-list@python.org> <87ior2zosv.fsf@elektro.pacujo.net>
Date: Tue, 25 Mar 2014 18:24:10 +0100
Subject: Re: Time we switched to unicode?
From: Chris “Kwpolska” Warrick <kwpolska@gmail.com>
To: Marko Rauhamaa <marko@pacujo.net>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Cc: python-list@python.org
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.8528.1395768259.18130.python-list@python.org>
Lines: 69
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:69044

On Tue, Mar 25, 2014 at 9:05 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Chris Angelico <rosuav@gmail.com>:
>
>> On Tue, Mar 25, 2014 at 4:14 PM, Mark H Harris <harrismh777@gmail.com> wrote:
>>>>>> Π¹ = pi
>>
>> That's good! (Although typing Π¹ quicker than pi is majorly pushing it.
>
> It don't think that's good. The lower-case letter π² should be used. The
> upper-case letter is used for a product, although unicode dedicates a
> separate character for the purpose: ∏³.
>
> I often see Americans, especially, confuse upper and lower-case letters
> in symbols ("KM" for "km", "L" for "l" etc).


“L” is actually valid, and so is “l”.  The capital is allowed mainly
because humans (and computers) tend to write “1 l” (one liter, one-ell)
in a way that makes it hard to distinguish (becoming eleven or ell-ell),
especially if you don’t include the space (which is invalid).

On Tue, Mar 25, 2014 at 9:23 AM, Chris Angelico <rosuav@gmail.com> wrote:
> If you can type a capital ∏³, you can type a lower-case π², unless
> there's something very weird going on.

Nitpick time!  (because we all love it so much!)

Π¹ = U+03A0 GREEK CAPITAL LETTER PI
π² = U+03C0 GREEK SMALL LETTER PI
∏³ = U+220F N-ARY PRODUCT

“If you can type an N-ARY PRODUCT, you can type a GREEK SMALL LETTER
PI, unless there’s something very weird going on.”

…like, the user is in the past and is using ISO 8859-7 (instead of a
21st-century encoding, like UTF-8).  An encoding which has support for
Π¹ and π², but not for ∏³… (of course, this assumes that, if we add
those new characters into Python, we allow any encoding, somehow.)

That’s not too weird, other than the ancient encoding being used.
(though that’s a bit less weird on Windows, but that’d be
Windows-1253.)
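
A rough sketch of that check in Python, assuming the standard
`iso8859_7` and `cp1253` codecs that ship with CPython.  `can_encode`
is a helper name made up for this example, and the characters are
written as escapes so the snippet does not depend on how this message
gets transcoded:

```python
def can_encode(ch, encoding):
    """Return True if `ch` can be represented in `encoding`.

    Hypothetical helper for illustration only.
    """
    try:
        ch.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# U+03A0 GREEK CAPITAL LETTER PI, U+03C0 GREEK SMALL LETTER PI,
# U+220F N-ARY PRODUCT
chars = ("\u03a0", "\u03c0", "\u220f")

for encoding in ("iso8859_7", "cp1253", "utf-8"):
    for ch in chars:
        print(encoding, "U+{:04X}".format(ord(ch)), can_encode(ch, encoding))
```

Both Greek legacy encodings should report True for the two letters
and False for the product sign; UTF-8 takes all three.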

Oh: and speaking of fancy Unicode characters that are worthless
~duplicates, spot the difference here:

µ μ

If you are lucky enough (and luckiness may involve reading this
e-mail in Helvetica (not Neue though) on a Mac), you can clearly see
that they are different.  If you are using a font that does not
differentiate them, you may think they’re the same.  If you ask some
intelligent software (like `unicodedata.name()` in Python), you’ll
quickly find out the first is MICRO SIGN, and the other is GREEK SMALL
LETTER MU.  Such craziness is what makes Unicode Unicode.
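
For the record, a small sketch of that lookup.  It uses escapes
rather than the literal characters, and `describe` is a made-up
helper, not anything from the standard library.  The last lines go
one step further than the text above: NFKC normalization folds MICRO
SIGN into GREEK SMALL LETTER MU, which is one way to treat the two as
equal if that is what you want.

```python
import unicodedata

micro = "\u00b5"  # MICRO SIGN
mu = "\u03bc"     # GREEK SMALL LETTER MU


def describe(ch):
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return "U+{:04X} {}".format(ord(ch), unicodedata.name(ch))


print(describe(micro))  # U+00B5 MICRO SIGN
print(describe(mu))     # U+03BC GREEK SMALL LETTER MU

# Different code points, so a plain comparison fails...
print(micro == mu)  # False

# ...but MICRO SIGN has a compatibility decomposition to GREEK SMALL
# LETTER MU, so the NFKC-normalized forms compare equal.
print(unicodedata.normalize("NFKC", micro) == mu)  # True
```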

-- 
Chris “Kwpolska” Warrick <http://kwpolska.tk>
PGP: 5EAAEA16
stop html mail | always bottom-post | only UTF-8 makes sense