| Path | csiph.com!usenet.pasdenom.info!weretis.net!feeder4.news.weretis.net!ecngs!feeder2.ecngs.de!newsfeed.freenet.ag!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <python@mrabarnett.plus.com> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| Date | Thu, 20 Jun 2013 12:43:28 +0100 |
| From | MRAB <python@mrabarnett.plus.com> |
| User-Agent | Mozilla/5.0 (Windows NT 5.1; rv:17.0) Gecko/20130509 Thunderbird/17.0.6 |
| MIME-Version | 1.0 |
| To | python-list@python.org |
| Subject | Re: A few questiosn about encoding |
| References | <6dfa3707-80f4-407a-a109-66dbb0130513@googlegroups.com> <mailman.2923.1370797972.3114.python-list@python.org> <kp9drh$1o0t$1@news.ntua.gr> <51b83e5a$0$29998$c3e8da3$5496439d@news.astraweb.com> <kp9lo6$9l5$2@news.ntua.gr> <51b90ead$0$29997$c3e8da3$5496439d@news.astraweb.com> <kpbnmg$qvk$2@news.ntua.gr> <51b9708b$0$29872$c3e8da3$5496439d@news.astraweb.com> <77ba6b16-4b1d-47a6-9b9b-5af45335c4fe@googlegroups.com> <51c2a089$0$29973$c3e8da3$5496439d@news.astraweb.com> |
| In-Reply-To | <51c2a089$0$29973$c3e8da3$5496439d@news.astraweb.com> |
| Content-Type | text/plain; charset=ISO-8859-1; format=flowed |
| Content-Transfer-Encoding | 7bit |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.15 |
| Precedence | list |
| Reply-To | python-list@python.org |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| List-Unsubscribe | <http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe> |
| List-Archive | <http://mail.python.org/pipermail/python-list/> |
| List-Post | <mailto:python-list@python.org> |
| List-Help | <mailto:python-list-request@python.org?subject=help> |
| List-Subscribe | <http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.3620.1371728614.3114.python-list@python.org> |
| Lines | 59 |
| NNTP-Posting-Host | 2001:888:2000:d::a6 |
| X-Trace | 1371728614 news.xs4all.nl 15920 [2001:888:2000:d::a6]:35683 |
| X-Complaints-To | abuse@xs4all.nl |
| Xref | csiph.com comp.lang.python:48785 |
On 20/06/2013 07:26, Steven D'Aprano wrote:
> On Wed, 19 Jun 2013 18:46:59 -0700, Rick Johnson wrote:
>
>> On Thursday, June 13, 2013 2:11:08 AM UTC-5, Steven D'Aprano wrote:
>>
>>> Gah! That's twice I've screwed that up. Sorry about that!
>>
>> Yeah, and your difficulty explaining the Unicode implementation reminds
>> me of a passage from the Python zen:
>>
>> "If the implementation is hard to explain, it's a bad idea."
>
> The *implementation* is easy to explain. It's the names of the encodings
> which I get tangled up in.
>

You're off by one below!

> ASCII: Supports exactly 127 code points, each of which takes up exactly 7
> bits. Each code point represents a character.
>

128 codepoints.

> Latin-1, Latin-2, MacRoman, MacGreek, ISO-8859-7, Big5, Windows-1251, and
> about a gazillion other legacy charsets, all of which are mutually
> incompatible: supports anything from 127 to 65535 different code points,
> usually under 256.
>

128 to 65536 codepoints.

> UCS-2: Supports exactly 65535 code points, each of which takes up exactly
> two bytes. That's fewer than required, so it is obsoleted by:
>

65536 codepoints.

etc.

> UTF-16: Supports all 1114111 code points in the Unicode charset, using a
> variable-width system where the most popular characters use exactly
> two-bytes and the remaining ones use a pair of characters.
>
> UCS-4: Supports exactly 4294967295 code points, each of which takes up
> exactly four bytes. That is more than needed for the Unicode charset, so
> this is obsoleted by:
>
> UTF-32: Supports all 1114111 code points, using exactly four bytes each.
> Code points outside of the range 0 through 1114111 inclusive are an error.
>
> UTF-8: Supports all 1114111 code points, using a variable-width system
> where popular ASCII characters require 1 byte, and others use 2, 3 or 4
> bytes as needed.
>
>
> Ignoring the legacy charsets, only UTF-16 is a terribly complicated
> implementation, due to the surrogate pairs. But even that is not too bad.
> The real complication comes from the interactions between systems which
> use different encodings, and that's nothing to do with Unicode.
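
The off-by-one is easy to check at the interpreter. What follows is only a rough sketch, assuming Python 3.3 or later (where sys.maxunicode is always 0x10FFFF); the helper name encoded_sizes and the sample characters are made up for illustration and do not come from the thread. The count of code points is one more than the highest code point, because numbering starts at 0.

```python
import sys

# Counts are "highest code point + 1" because code points start at 0.
ascii_count = 2 ** 7                 # 7 bits
ucs2_count = 2 ** 16                 # two bytes, fixed width
unicode_count = sys.maxunicode + 1   # 0x10FFFF + 1 (assumes Python 3.3+)
ucs4_count = 2 ** 32                 # four bytes, fixed width


def encoded_sizes(ch):
    """Return the byte length of ch in UTF-8, UTF-16 and UTF-32.

    The "-le" codec variants are used so that no byte order mark
    is added to the result.
    """
    return tuple(len(ch.encode(enc))
                 for enc in ("utf-8", "utf-16-le", "utf-32-le"))


print(ascii_count, ucs2_count, unicode_count, ucs4_count)
# prints: 128 65536 1114112 4294967296

# ASCII letter, Greek alpha, euro sign, and a code point above U+FFFF.
for ch in ("A", "\u03b1", "\u20ac", "\U0001F600"):
    print("U+%04X" % ord(ch), encoded_sizes(ch))
```

The last sample needs four bytes in UTF-16 because it lies above U+FFFF and is stored as a surrogate pair, which is the one complication mentioned for that encoding.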