Newsgroups: comp.lang.python
Date: Sun, 2 Sep 2012 00:36:50 -0700 (PDT)
Subject: Re: Flexible string representation, unicode, typography, ...
From: wxjmfauth@gmail.com

On Thursday, 30 August 2012 at 17:01:50 UTC+2, Antoine Pitrou wrote:
>
> I honestly suggest you shut up until you have a clue.
>

Sorry, Antoine,

I do not have the knowledge to dive into the Python code, but
I know what a character is.

The coding of characters is a domain in its own right,
independent of the OS and of programming languages.

Before spending time implementing a new algorithm,
it may be better to ask whether there is something better
than the current schemes.

I still remember my thoughts when I read the PEP 393
discussion: "this is not logical", "they do not understand
typography", "atomic character ???", ...

Real-world examples:

>>> import libfrancais
>>> li = ['noël', 'noir', 'nœud', 'noduleux',
...       'noétique', 'noèse', 'noirâtre']
>>> r = libfrancais.sortfr(li)
>>> r
['noduleux', 'noël', 'noèse', 'noétique', 'nœud', 'noir',
'noirâtre']

(cf. "Le Petit Robert")

or

the *letters* satisfying the requirements of the
"Imprimerie nationale".

jmf
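
For readers without libfrancais (it is not part of the standard library, and its internals are not shown in this post), the ordering in the session above can be approximated with the standard library alone. What follows is only a minimal sketch, not the libfrancais algorithm: the names french_key and sort_french are hypothetical, and the approach (expand the ligatures, compare base letters first, use the accents only as a tie-breaker) is an assumption about what a dictionary-style sort needs. It does not implement the full French collation rules.

```python
import unicodedata


def french_key(word):
    """Hypothetical sort key: base letters first, accents as tie-breaker."""
    # Expand ligatures so that 'œ' is compared as 'oe' and 'æ' as 'ae'.
    expanded = word.lower().replace("œ", "oe").replace("æ", "ae")
    # Decompose accented letters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", expanded)
    # Primary key: the base letters only, with combining marks dropped.
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Secondary key: the decomposed form, so 'cote' sorts before 'côte'.
    return (base, decomposed)


def sort_french(words):
    """Return a new list sorted with the sketch key above."""
    return sorted(words, key=french_key)


words = ['noël', 'noir', 'nœud', 'noduleux',
         'noétique', 'noèse', 'noirâtre']
print(sort_french(words))
```

On the seven words used above, this key yields the same order as the one shown for sortfr. Words that differ only by accents would need the complete dictionary rules, which this sketch does not attempt.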