Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #64026

Re: 'Straße' ('Strasse') and Python 2

From Terry Reedy <tjreedy@udel.edu>
Subject Re: 'Straße' ('Strasse') and Python 2
Date 2014-01-15 19:27 -0500
References (1 earlier) <52D67873.2010502@chamonix.reportlab.co.uk> <lb5u13$9hs$1@ger.gmane.org> <52D68402.6030407@chamonix.reportlab.co.uk> <570A55FE-6640-4480-BF9A-3C2F8B9C1DC5@gmail.com> <52D6BD83.9000804@chamonix.reportlab.co.uk>
Newsgroups comp.lang.python
Message-ID <mailman.5551.1389832076.18130.python-list@python.org> (permalink)

Show all headers | View raw


On 1/15/2014 11:55 AM, Robin Becker wrote:

> The fact that unicoders want to take over the meaning of encoding is not
> relevant.

I agree with you that 'encoding' should not be limited to 'byte encoding 
of a (subset of) unicode characters. For instance, .jpg and .png are 
byte encodings of images. In the other hand, it is common in human 
discourse to omit qualifiers in particular contexts. 'Computer virus' 
gets condensed to 'virus' in computer contexts.

The problem with graphemes is that there is no fixed set of unicode 
graphemes. Which is to say, the effective set of graphemes is 
context-specific. Just limiting ourselves to English, 'fi' is usually 2 
graphemes when printing to screen, but often just one when printing to 
paper. This is why the Unicode consortium punted 'graphemes' to 
'application' code.

> I'm not anti unicode, that's just an assignment of identity to some
> symbols. Coding the values of the ids is a separate issue. It's my
> belief that we don't need more than the byte level encoding to represent
> unicode. One of the claims made for python3 unicode is that it somehow
> eliminates the problems associated with other encodings eg utf8,

The claim is true for the following problems of the way-too-numerous 
unicode byte encodings.

Subseting: only a subset of characters can be encoded.

Shifting: the meaning of a byte depends on a preceding shift character, 
which might be back as the beginning of the sequence.

Varying size: the number of bytes to encode a character depends on the 
character.

Both of the last two problems can turn O(1) operations into O(n) 
operations. 3.3+ eliminates all these problems.

-- 
Terry Jan Reedy

Back to comp.lang.python | Previous | NextPrevious in thread | Find similar | Unroll thread


Thread

'Straße' ('Strasse') and Python 2 wxjmfauth@gmail.com - 2014-01-11 23:50 -0800
  Re: 'Straße' ('Strasse') and Python 2 Peter Otten <__peter__@web.de> - 2014-01-12 09:31 +0100
  Re: 'Straße' ('Strasse') and Python 2 Stefan Behnel <stefan_ml@behnel.de> - 2014-01-12 10:00 +0100
  Re: 'Straße' ('Strasse') and Python 2 Ned Batchelder <ned@nedbatchelder.com> - 2014-01-12 07:17 -0500
  Re: 'Straße' ('Strasse') and Python 2 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-01-12 12:33 +0000
  Re: 'Straße' ('Strasse') and Python 2 MRAB <python@mrabarnett.plus.com> - 2014-01-12 18:33 +0000
  Re: 'Straße' ('Strasse') and Python 2 Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> - 2014-01-13 09:27 +0100
    Re: 'Straße' ('Strasse') and Python 2 wxjmfauth@gmail.com - 2014-01-13 01:54 -0800
      Re: 'Straße' ('Strasse') and Python 2 Chris Angelico <rosuav@gmail.com> - 2014-01-13 21:26 +1100
      Re: 'Straße' ('Strasse') and Python 2 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-01-13 10:38 +0000
        Re: 'Straße' ('Strasse') and Python 2 Chris Angelico <rosuav@gmail.com> - 2014-01-13 21:57 +1100
          Re: 'Straße' ('Strasse') and Python 2 wxjmfauth@gmail.com - 2014-01-13 08:24 -0800
            Re: 'Straße' ('Strasse') and Python 2 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-01-13 17:02 +0000
      Re: 'Straße' ('Strasse') and Python 2 Michael Torrie <torriem@gmail.com> - 2014-01-13 08:58 -0700
      Re: 'Straße' ('Strasse') and Python 2 Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> - 2014-01-13 19:37 +0100
      Mistake or Troll (was Re: 'Straße' ('Strasse') and Python 2) Terry Reedy <tjreedy@udel.edu> - 2014-01-13 18:05 -0500
  Re: 'Straße' ('Strasse') and Python 2 Robin Becker <robin@reportlab.com> - 2014-01-15 12:00 +0000
    Re: 'Straße' ('Strasse') and Python 2 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-01-16 00:43 +0000
      Re: 'Straße' ('Strasse') and Python 2 Chris Angelico <rosuav@gmail.com> - 2014-01-16 12:26 +1100
  Re: 'Straße' ('Strasse') and Python 2 Ned Batchelder <ned@nedbatchelder.com> - 2014-01-15 07:13 -0500
    Re: 'Straße' ('Strasse') and Python 2 wxjmfauth@gmail.com - 2014-01-15 06:55 -0800
      Re: 'Straße' ('Strasse') and Python 2 Chris Angelico <rosuav@gmail.com> - 2014-01-16 02:14 +1100
        Re: 'Straße' ('Strasse') and Python 2 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-01-16 00:32 +0000
          Re: 'Straße' ('Strasse') and Python 2 Robin Becker <robin@reportlab.com> - 2014-01-16 10:51 +0000
            Re: 'Straße' ('Strasse') and Python 2 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-01-16 14:07 +0000
              Re: 'Straße' ('Strasse') and Python 2 Tim Chase <python.list@tim.thechases.com> - 2014-01-16 09:24 -0600
          Re: 'Straße' ('Strasse') and Python 2 Chris Angelico <rosuav@gmail.com> - 2014-01-16 21:58 +1100
          Re: 'StraÃYe' ('Strasse') and Python 2 "Frank Millman" <frank@chagford.com> - 2014-01-16 14:06 +0200
          Re: 'StraÃYe' ('Strasse') and Python 2 Robin Becker <robin@reportlab.com> - 2014-01-16 13:03 +0000
          Re: 'Straße' ('Strasse') and Python 2 Travis Griggs <travisgriggs@gmail.com> - 2014-01-16 13:30 -0800
  Re: 'Straße' ('Strasse') and Python 2 Robin Becker <robin@reportlab.com> - 2014-01-15 12:50 +0000
  Re: 'Straße' ('Strasse') and Python 2 Travis Griggs <travisgriggs@gmail.com> - 2014-01-15 08:28 -0800
  Re: 'Straße' ('Strasse') and Python 2 Robin Becker <robin@reportlab.com> - 2014-01-15 16:55 +0000
  Re: 'Straße' ('Strasse') and Python 2 Chris Angelico <rosuav@gmail.com> - 2014-01-16 04:14 +1100
  Re: 'Straße' ('Strasse') and Python 2 Robin Becker <robin@reportlab.com> - 2014-01-15 17:28 +0000
  Re: 'Straße' ('Strasse') and Python 2 Ian Kelly <ian.g.kelly@gmail.com> - 2014-01-15 11:32 -0700
  Re: 'Straße' ('Strasse') and Python 2 Terry Reedy <tjreedy@udel.edu> - 2014-01-15 19:27 -0500

csiph-web