Newsgroups: comp.lang.python
Date: Wed, 29 Aug 2012 08:43:05 -0700 (PDT)
Subject: Re: Flexible string representation, unicode, typography, ...
From: wxjmfauth@gmail.com

On Wednesday, 29 August 2012 14:01:57 UTC+2, Dave Angel wrote:
> On 08/29/2012 07:40 AM, wxjmfauth@gmail.com wrote:
> > Forget Python and all these benchmarks. The problem is on another
> > level. Coding schemes, typography, usage of characters, ... For a
> > given coding scheme, all code points/characters are equivalent.
> > Expecting to handle a sub-range in a coding scheme without shaking
> > that coding scheme is impossible. If a coding scheme does not give
> > satisfaction, the only valid solution is to create a new coding
> > scheme, cp1252, mac-roman, EBCDIC, ... or the interesting "TeX" case,
> > where the "internal" coding depends on the fonts! Unicode (utf***), as
> > just another coding scheme, does not escape this rule. This
> > "Flexible String Representation" fails.
> > Not only is it unable to stick
> > with a coding scheme, it is a mix of coding schemes, the worst of
> > all possible implementations. jmf
>
> Nonsense. The discussion was not about an encoding scheme, but an
> internal representation. That representation does not change the
> programmer's interface in any way other than performance (cpu and memory
> usage). Most of the rest of your babble is unsupported opinion.

I can hit the nail a little harder. I have an even better idea, and I'm serious.

If "Python" has found a new way to cover the set of Unicode characters, why not propose it to the Unicode Consortium? Unicode already has three schemes covering practically all cases: memory consumption, maximum flexibility, and an intermediate solution.

It would be too bad not to share it.

What do you think? ;-)

jmf
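
For reference, the "three schemes" are not named in the post; assuming they mean UTF-8, UTF-16 and UTF-32, a minimal Python 3 sketch of the size trade-off could look like this. The helper name `encoded_sizes` and the sample strings are made up for illustration.

```python
# Assumption: the "three schemes" are the Unicode encoding forms
# UTF-8, UTF-16 and UTF-32. encoded_sizes() is a hypothetical helper.

def encoded_sizes(text):
    """Return the number of bytes `text` occupies in each encoding form."""
    return {
        # The "-le" codec variants are used so no byte order mark is added.
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }

if __name__ == "__main__":
    # ASCII only, Latin-1 range, BMP beyond Latin-1, and a non-BMP code point.
    for sample in ["abc", "ao\u00fbt", "\u20acuro", "\U0001F600"]:
        # len() counts code points; the byte counts depend on the encoding.
        print(repr(sample), len(sample), encoded_sizes(sample))
```

The code point count from `len()` is the same whichever encoding is chosen for output; only the byte counts differ, with UTF-8 smallest for ASCII text and UTF-32 fixed at four bytes per code point.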