From: Marko Rauhamaa
Newsgroups: comp.lang.python
Subject: Re: Python 3.2 has some deadly infection
Date: Fri, 06 Jun 2014 20:11:02 +0300
Message-ID: <871tv255g9.fsf@elektro.pacujo.net>

Steven D'Aprano:

> On Fri, 06 Jun 2014 18:32:39 +0300, Marko Rauhamaa wrote:
>> Unicode, like ASCII, is a code. Representing text in Unicode is
>> encoding.
>
> A Unicode string as an abstract data type has no encoding.

Unicode itself is an encoding. See it in action here:

   72 101 108 108 111 44 32 119 111 114 108 100

> It is a Platonic ideal, a pure form like the real numbers.

Far from it. It is a mapping from symbols to integers. The symbols are
the Platonic ones.

The Unicode/ASCII encoding above represents the same "Platonic" string
as this EBCDIC one:

   200 133 147 147 150 107 64 166 150 153 147 132

> Unicode string like this:
>
> s = u"NOBODY expects the Spanish Inquisition!"
>
> should not be thought of as a bunch of bytes in some encoding,

Encoding is not tied to bytes or even computers. People can speak in
code, after all.


Marko
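
A minimal Python 3 sketch of the point about one string under two codes. It computes the integer codes for "Hello, world" under Unicode and under EBCDIC. The choice of `cp037` is an assumption: the post does not say which EBCDIC code page is meant, and `cp037` is simply one variant that ships with Python's standard codecs.

```python
# Sketch: one abstract string, two different symbol-to-integer mappings.
text = "Hello, world"

# Unicode code points (these coincide with ASCII for this string).
unicode_codes = [ord(ch) for ch in text]

# EBCDIC values. "cp037" is an assumed code page; the post names none.
ebcdic_codes = list(text.encode("cp037"))

print(unicode_codes)
print(ebcdic_codes)

# Mapping either sequence back through its own code gives the same string.
assert "".join(chr(n) for n in unicode_codes) == text
assert bytes(ebcdic_codes).decode("cp037") == text
```

The integers differ between the two lists, but both decode to the same sequence of symbols, which is the sense in which each mapping is a code.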