| Newsgroups | comp.lang.python |
|---|---|
| Date | 2013-09-06 02:11 -0700 |
| Message-ID | <4ce85ea8-4a4c-46cf-a546-ad999576a5f7@googlegroups.com> (permalink) |
| Subject | Chardet, file, ... and the Flexible String Representation |
| From | wxjmfauth@gmail.com |
A short comment about the "detection" tools from a previous discussion.

The tools that are supposed to detect the coding scheme all rely on a simple logical rule: (p ==> q) <==> (non q ==> non p). In short, and as a consequence, they do not detect *the* coding scheme; they only detect "a" possible coding scheme.

The Flexible String Representation conceptually has to face the same problem. It splits "unicode" into chunks and has to solve two problems at the same time: the coding and the handling of multiple "char sets". The problem? It fails. "This poor Flexible String Representation does not succeed in solving the problem it creates itself."

Workaround: add more flags (see PEP 3xx).

Still thinking "mathematics" (limit): for a given repertoire of characters, one can assume that every char has its own flag (because of the usage of multiple coding schemes). Conceptually, one quickly realizes that, in the end, there will be as many flags as characters, and that the only valid solution is to work with a unique set of encoded code points, where every element of this set *is* its own flag. Curiously, that's what the utf-* (and, by the way, other coding schemes in the byte string world) are doing (with plenty of other advantages).

As already said: a healthy coding scheme can only work with a unique set of encoded code points. That's why we have to live today with all these coding schemes.

jmf
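The contrapositive rule about detection tools can be illustrated with a small sketch. It is not how chardet or file actually work; the helper name `possible_encodings` and the candidate list are hypothetical, chosen only for illustration. A failed decode rules an encoding out, while a successful decode only means the encoding is possible, so several candidates can survive for the same bytes.

```python
# Bytes produced by encoding "é" as UTF-8: b'\xc3\xa9'.
data = "é".encode("utf-8")

# Hypothetical candidate list, for illustration only.
candidates = ["utf-8", "latin-1", "cp1252", "ascii"]


def possible_encodings(raw, names):
    """Return {name: decoded text} for every candidate that decodes without error.

    A decode error proves the bytes are NOT valid in that encoding
    (non q ==> non p). A successful decode proves nothing more than
    "this encoding is possible".
    """
    results = {}
    for name in names:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            # Ruled out: the bytes are not valid in this encoding.
            pass
    return results


matches = possible_encodings(data, candidates)
for name, text in matches.items():
    print(name, repr(text))
```

Here ascii is eliminated, but utf-8, latin-1, and cp1252 all decode the same two bytes without complaint, and only one of them gives the text that was originally encoded.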
Chardet, file, ... and the Flexible String Representation wxjmfauth@gmail.com - 2013-09-06 02:11 -0700
Re: Chardet, file, ... and the Flexible String Representation Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-09-06 10:57 +0000
Re: Chardet, file, ... and the Flexible String Representation Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-09-06 13:10 +0200
Re: Chardet, file, ... and the Flexible String Representation Ned Batchelder <ned@nedbatchelder.com> - 2013-09-06 07:02 -0400
Re: Chardet, file, ... and the Flexible String Representation Piet van Oostrum <piet@vanoostrum.org> - 2013-09-06 11:46 -0400
Re: Chardet, file, ... and the Flexible String Representation Chris Angelico <rosuav@gmail.com> - 2013-09-07 02:04 +1000
Re: Chardet, file, ... and the Flexible String Representation random832@fastmail.us - 2013-09-06 12:59 -0400
Re: Chardet, file, ... and the Flexible String Representation Chris Angelico <rosuav@gmail.com> - 2013-09-07 03:04 +1000
Re: Chardet, file, ... and the Flexible String Representation wxjmfauth@gmail.com - 2013-09-09 07:28 -0700
Re: Chardet, file, ... and the Flexible String Representation Ned Batchelder <ned@nedbatchelder.com> - 2013-09-09 12:38 -0400
Re: Chardet, file, ... and the Flexible String Representation Michael Torrie <torriem@gmail.com> - 2013-09-09 11:05 -0600
Re: Chardet, file, ... and the Flexible String Representation Steven D'Aprano <steve@pearwood.info> - 2013-09-10 04:58 +0000
Re: Chardet, file, ... and the Flexible String Representation Terry Reedy <tjreedy@udel.edu> - 2013-09-09 16:47 -0400
Re: Chardet, file, ... and the Flexible String Representation random832@fastmail.us - 2013-09-10 11:36 -0400
Re: Chardet, file, ... and the Flexible String Representation random832@fastmail.us - 2013-09-09 14:34 -0400
Re: Chardet, file, ... and the Flexible String Representation Ian Kelly <ian.g.kelly@gmail.com> - 2013-09-09 13:03 -0600
Re: Chardet, file, ... and the Flexible String Representation random832@fastmail.us - 2013-09-09 15:27 -0400
Re: Chardet, file, ... and the Flexible String Representation Serhiy Storchaka <storchaka@gmail.com> - 2013-09-12 00:11 +0300