From: Neil Cerutti
Newsgroups: comp.lang.python
Subject: Re: python 3.3 repr
Date: 15 Nov 2013 17:47:01 GMT
Organization: Norwich University

On 2013-11-15, Chris Angelico wrote:
> Other languages _have_ gone for at least some sort of Unicode
> support. Unfortunately quite a few have done a half-way job and
> use UTF-16 as their internal representation. That means there's
> no difference between U+0012, U+0123, and U+1234, but U+12345
> suddenly gets handled differently. ECMAScript actually
> specifies the perverse behaviour of treating codepoints >U+FFFF
> as two elements in a string, because it's just too costly to
> change.

The unicode support I'm learning in Go is, "Everything is utf-8,
right? RIGHT?!?" It also has the interesting behavior that
indexing strings retrieves bytes, while iterating over them
results in a sequence of runes.

It comes with support for no encodings save utf-8 (natively) and
utf-16 (if you work at it). Is that really enough?

-- 
Neil Cerutti
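
The two behaviours discussed in the post can be sketched in Python 3.3 or later. This is only an illustration, not code from the thread: the helper `utf16_units` and the sample string `text` are made up for the example, and the second half is a rough Python analogue of the Go behaviour (indexing the UTF-8 bytes versus iterating over code points), not Go itself.

```python
# Illustrative sketch only; utf16_units and the sample strings are
# hypothetical names chosen for this example.

def utf16_units(s):
    """Return how many 16-bit code units s needs when stored as UTF-16."""
    return len(s.encode("utf-16-le")) // 2

# The code points mentioned in the quoted text.
samples = ["\u0012", "\u0123", "\u1234", "\U00012345"]
for ch in samples:
    # Python 3.3+ strings count code points, so len(ch) is 1 for all four.
    # In UTF-16 only U+12345 needs two code units (a surrogate pair).
    print("U+%04X" % ord(ch), "code points:", len(ch),
          "UTF-16 units:", utf16_units(ch))

# Rough analogue of the Go behaviour: bytes when indexing, code points
# ("runes" in Go) when iterating.
text = "h\u00e9llo"          # U+00E9 takes two bytes in UTF-8
raw = text.encode("utf-8")   # the UTF-8 byte sequence

second_byte = raw[1]                     # a single byte, not a character
code_points = [ord(c) for c in text]     # one entry per code point

print(len(raw), len(text))   # byte count versus code point count
print(hex(second_byte))
print([hex(cp) for cp in code_points])
```

Indexing `raw` lands in the middle of a multi-byte character, which is the pitfall that byte-indexed strings expose, while iterating over `text` never splits a character.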