Re: "More About Unicode in Python 2 and 3"

From Terry Reedy <tjreedy@udel.edu>
Subject Re: "More About Unicode in Python 2 and 3"
Date 2014-01-05 16:10 -0500
References <lablra$1mc$2@ger.gmane.org>
Newsgroups comp.lang.python
Message-ID <mailman.4968.1388956223.18130.python-list@python.org>


On 1/5/2014 8:14 AM, Mark Lawrence wrote:
> http://lucumr.pocoo.org/2014/1/5/unicode-in-2-and-3/

I disagree with the following claims:

"Looking at that you can see that Python 3 removed something: support 
for non Unicode data text."

I believe 2.7 str text methods like .upper only supported ASCII. General 
non-Unicode bytes text support would require an encoding as an attribute 
of the bytes text object. Python never had that.
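A minimal sketch of this point as it applies to Python 3 bytes (not 2.7 str): with no encoding attached to the object, .upper() can only act on ASCII letters, so handling other text means decoding first. The sample word and the choice of Latin-1 are assumptions made for illustration.

```python
# Latin-1 encoded bytes for the word "cafe" with an accented final e.
data = "caf\u00e9".encode("latin-1")        # b'caf\xe9'

# bytes.upper() has no encoding to consult, so only ASCII letters change.
upper_bytes = data.upper()                  # b'CAF\xe9'

# Upper-casing the accented letter requires decoding to str first.
upper_text = data.decode("latin-1").upper().encode("latin-1")  # b'CAF\xc9'

print(upper_bytes, upper_text)
```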

"Python 3 essentially removed the byte-string type which in 2.x was 
called str."

Python 3 renamed unicode as str and str as bytes. Bytes have essentially 
all the text methods of 2.7 str. Compare dir(str) in 2.7 and dir(bytes) 
in 3.x. The main change to the class itself is that indexing and 
iteration yield ints i, 0 <= i < 256.
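A short sketch of the indexing and iteration change, assuming Python 3; the sample value is arbitrary, and the set intersection is just one way to see which method names bytes and str have in common.

```python
data = b"abc"

first = data[0]              # an int (97), not a length-1 byte string
as_list = list(data)         # iteration yields ints: [97, 98, 99]
one_byte = data[0:1]         # slicing still yields bytes: b'a'

# Every element is an int i with 0 <= i < 256.
all_in_range = all(0 <= i < 256 for i in data)

# Method names that bytes shares with str in Python 3.
shared = set(dir(bytes)) & set(dir(str))

print(first, as_list, one_byte, all_in_range, "upper" in shared)
```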

"all text operations now are only defined for Unicode strings."

?? Text methods are still defined on (ASCII) bytes. It is true that one 
text operation -- string formatting -- no longer is (and there is an issue 
about that). But one is not all. There is also still discussion about 
within-class transforms, but they are still possible, even if not with 
the codecs module.
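As an illustrative sketch, assuming Python 3: the header line below is invented, and binascii and base64 are shown as one possible route to bytes-to-bytes transforms that does not go through the codecs module.

```python
import base64
import binascii

# Text methods working directly on ASCII bytes.
line = b"  Content-Type: text/plain  "
name, _, value = line.strip().partition(b": ")
lowered = name.lower()                 # b'content-type'
parts = value.split(b"/")              # [b'text', b'plain']

# Within-class (bytes -> bytes) transforms without the codecs module.
hexed = binascii.hexlify(b"abc")       # b'616263'
b64 = base64.b64encode(b"abc")         # b'YWJj'
round_trip = base64.b64decode(b64)     # b'abc'

print(lowered, parts, hexed, b64, round_trip)
```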

I suspect there are other basic errors, but I mostly quit reading at 
this point.

-- 
Terry Jan Reedy
