Re: Chardet, file, ... and the Flexible String Representation

From random832@fastmail.us
Subject Re: Chardet, file, ... and the Flexible String Representation
Date 2013-09-06 12:59 -0400
References <4ce85ea8-4a4c-46cf-a546-ad999576a5f7@googlegroups.com> <m2a9jqq7g9.fsf@cochabamba.vanoostrum.org>
Newsgroups comp.lang.python
Message-ID <mailman.128.1378486750.5461.python-list@python.org> (permalink)



On Fri, Sep 6, 2013, at 11:46, Piet van Oostrum wrote:
> The FSR does not split unicode in chunks. It does not create problems
> and therefore it doesn't have to solve this. 
> 
> The FSR simply stores a Unicode string as an array[*] of ints (the
> Unicode code points of the characters of the string). That's it. Then it
> uses a memory-efficient way to store this array of ints. But that has
> nothing to do with character sets. The same principle could be used for
> any array of ints.

I think the source of the confusion is that the FSR is described in
terms of UCS-2 and Latin-1, which people often think of (especially
Latin-1) as different encodings rather than as merely storing code
points in a narrower type.
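
To make the "narrower type" point concrete, here is a small sketch that
models the width choice the way PEP 393 describes it: look at the
highest code point in the string and pick 1, 2 or 4 bytes per element.
The helper name fsr_width is made up for illustration; it is a model of
the rule, not a call into CPython's internals.

```python
def fsr_width(s):
    """Bytes per code point a PEP 393-style representation would choose.

    Hypothetical helper: it only models the selection rule and does not
    inspect the interpreter's actual storage.
    """
    highest = max(map(ord, s), default=0)
    if highest < 0x100:
        return 1  # the range usually labelled "Latin-1"
    if highest < 0x10000:
        return 2  # the range usually labelled "UCS-2"
    return 4      # everything above the BMP ("UCS-4")


for text in ("abc", "caf\u00e9", "\u20ac100", "\U0001F600"):
    # The stored values are just the code points; only their width changes.
    print(repr(text), [hex(ord(c)) for c in text], fsr_width(text))
```

No encoding or decoding step appears anywhere in this: the same
integers are kept, just in a smaller or larger element type.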

----

Incidentally, how does all this interact with ctypes unicode_buffers,
which slice as strings and must be UTF-16 on Windows? This was fine
pre-FSR when unicode objects were UTF-16, but I'm not sure how it would
work now.
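
One way to poke at this empirically is a sketch like the following,
which puts a string containing a non-BMP code point into a buffer made
with ctypes.create_unicode_buffer and checks what comes back. It does
not settle the question; the element count depends on the platform's
wchar_t size, and results may vary with the Python version in use.

```python
import ctypes

text = "a\U0001F600b"  # three code points, one of them outside the BMP
buf = ctypes.create_unicode_buffer(text)

# wchar_t is 2 bytes on Windows and typically 4 bytes elsewhere.
wchar_bytes = ctypes.sizeof(ctypes.c_wchar)

print("wchar_t bytes:", wchar_bytes)
print("buffer elements (including the terminating NUL):", len(buf))
print("round trip ok:", buf.value == text)
```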


