Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #53771

Chardet, file, ... and the Flexible String Representation

Newsgroups comp.lang.python
Date 2013-09-06 02:11 -0700
Message-ID <4ce85ea8-4a4c-46cf-a546-ad999576a5f7@googlegroups.com> (permalink)
Subject Chardet, file, ... and the Flexible String Representation
From wxjmfauth@gmail.com

Show all headers | View raw


Short comment about the "detection" tools from a previous
discussion.

The tools supposed to detect the coding scheme are all
working with a simple logical mathematical rule:

p  ==> q    <==>   non q  ==> non p .

Shortly  -- and consequence  --  they do not detect a
coding scheme they only detect "a" possible coding schme.


The Flexible String Representation has conceptually to
face the same problem. It splits "unicode" in chunks and
it has to solve two problems at the same time, the coding
and the handling of multiple "char sets". The problem?
It fails.
"This poor Flexible String Representation does not succeed
to solve the problem it create itsself."

Workaround: add more flags (see PEP 3xx.)

Still thinking "mathematics" (limit). For a given repertoire
of characters one can assume that every char has its own
flag (because of the usage of multiple coding schemes).
Conceptually, one will quickly realize, at the end, that they
will be an equal amount of flags and an amount of characters
and the only valid solution it to work with a unique set of
encoded code points, where every element of this set *is*
its own flag.
Curiously, that's what the utf-* (and btw other coding schemes
in the byte string world) are doing (with plenty of other
advantages).

Already said. An healthy coding scheme can only work with
a unique set of encoded code points. That's why we have to
live today with all these coding schemes.

jmf

Back to comp.lang.python | Previous | NextNext in thread | Find similar | Unroll thread


Thread

Chardet, file, ... and the Flexible String Representation wxjmfauth@gmail.com - 2013-09-06 02:11 -0700
  Re: Chardet, file, ... and the Flexible String Representation Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-09-06 10:57 +0000
  Re: Chardet, file, ... and the Flexible String Representation Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-09-06 13:10 +0200
  Re: Chardet, file, ... and the Flexible String Representation Ned Batchelder <ned@nedbatchelder.com> - 2013-09-06 07:02 -0400
  Re: Chardet, file, ... and the Flexible String Representation Piet van Oostrum <piet@vanoostrum.org> - 2013-09-06 11:46 -0400
    Re: Chardet, file, ... and the Flexible String Representation Chris Angelico <rosuav@gmail.com> - 2013-09-07 02:04 +1000
    Re: Chardet, file, ... and the Flexible String Representation random832@fastmail.us - 2013-09-06 12:59 -0400
    Re: Chardet, file, ... and the Flexible String Representation Chris Angelico <rosuav@gmail.com> - 2013-09-07 03:04 +1000
    Re: Chardet, file, ... and the Flexible String Representation wxjmfauth@gmail.com - 2013-09-09 07:28 -0700
      Re: Chardet, file, ... and the Flexible String Representation Ned Batchelder <ned@nedbatchelder.com> - 2013-09-09 12:38 -0400
      Re: Chardet, file, ... and the Flexible String Representation Michael Torrie <torriem@gmail.com> - 2013-09-09 11:05 -0600
        Re: Chardet, file, ... and the Flexible String Representation Steven D'Aprano <steve@pearwood.info> - 2013-09-10 04:58 +0000
      Re: Chardet, file, ... and the Flexible String Representation Terry Reedy <tjreedy@udel.edu> - 2013-09-09 16:47 -0400
      Re: Chardet, file, ... and the Flexible String Representation random832@fastmail.us - 2013-09-10 11:36 -0400
    Re: Chardet, file, ... and the Flexible String Representation random832@fastmail.us - 2013-09-09 14:34 -0400
    Re: Chardet, file, ... and the Flexible String Representation Ian Kelly <ian.g.kelly@gmail.com> - 2013-09-09 13:03 -0600
    Re: Chardet, file, ... and the Flexible String Representation random832@fastmail.us - 2013-09-09 15:27 -0400
    Re: Chardet, file, ... and the Flexible String Representation Serhiy Storchaka <storchaka@gmail.com> - 2013-09-12 00:11 +0300

csiph-web