Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #50167
| Path | csiph.com!usenet.pasdenom.info!weretis.net!feeder1.news.weretis.net!feeder.erje.net!eu.feeder.erje.net!newsfeed.freenet.ag!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <rosuav@gmail.com> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| X-Spam-Status | OK 0.005 |
| X-Spam-Evidence | '*H*': 0.99; '*S*': 0.00; 'programmer': 0.03; 'utf-8': 0.07; "'a'": 0.09; 'badly': 0.09; 'character,': 0.09; 'correspond': 0.09; 'differently.': 0.09; 'python': 0.11; '"a"': 0.16; 'any.': 0.16; 'assembler': 0.16; 'character.': 0.16; 'encodings': 0.16; 'from:addr:rosuav': 0.16; 'from:name:chris angelico': 0.16; 'interacting': 0.16; 'magic': 0.16; 'mapped': 0.16; 'on)': 0.16; 'wrote:': 0.18; 'hacking': 0.19; '>>>': 0.22; 'code,': 0.22; 'bytes': 0.24; 'subject:/': 0.26; 'defined': 0.27; 'header:In-Reply-To:1': 0.27; 'point': 0.28; 'am,': 0.29; 'characters': 0.30; 'message-id:@mail.gmail.com': 0.30; 'becoming': 0.31; 'pascal': 0.31; 'languages': 0.32; 'stuff': 0.32; 'beginning': 0.33; 'equal': 0.35; 'received:google.com': 0.35; 'there': 0.35; 'science,': 0.36; 'transition': 0.36; 'arrange': 0.38; 'to:addr:python-list': 0.38; 'to:addr:python.org': 0.39; 'even': 0.60; 'then,': 0.60; 'world.': 0.61; "you're": 0.61; 'back': 0.62; 'more': 0.64; 'between': 0.67; 'jul': 0.74; 'special': 0.74; 'characters,': 0.84; 'different.': 0.84; 'treatment': 0.95; '2013': 0.98 |
| DKIM-Signature | v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; bh=IvZc2WNZIM1o70veV1QxC3bUiEHz8cM/cw6+lb3Un5k=; b=MRa8fAstREKbNJ8KNDvYl/cVZIrmyvQFIFN/MroIBf1XT8ej6QcoI+sepStsvY82al yIISH9QKlgik9ZlG4iDPrsbX476tkjRR0AdhbWz9d4/CuyNBrFGG7IU/DVk5BmxY/8Lv M5ojPfvFLNN25YrslOvYTKfsz5x/QPeryh4ppeRjWUv05jsBmAf0+7jfVax3Ay8EuWdU YmMHgJtM5due5ek5M7ZOtXdZaCbAf2wevujpvZtM6/jrv/aQ72Sx0Lbwc3XAP+Aaqqp+ gopxzYU48Xj/eI/GEm4cAOXegSDoAWRAQVUDWVCwJrRnMo7itqc5fCZSgGFMb5sJx52P ZZVg== |
| MIME-Version | 1.0 |
| X-Received | by 10.58.223.238 with SMTP id qx14mr14237718vec.98.1373306842465; Mon, 08 Jul 2013 11:07:22 -0700 (PDT) |
| In-Reply-To | <7b6fc645-8bf3-4681-821c-38fb1fa1d191@googlegroups.com> |
| References | <a35609c1-e56f-4180-8176-4405264da0a2@googlegroups.com> <7b6fc645-8bf3-4681-821c-38fb1fa1d191@googlegroups.com> |
| Date | Tue, 9 Jul 2013 04:07:22 +1000 |
| Subject | Re: hex dump w/ or w/out utf-8 chars |
| From | Chris Angelico <rosuav@gmail.com> |
| To | python-list@python.org |
| Content-Type | text/plain; charset=UTF-8 |
| Content-Transfer-Encoding | quoted-printable |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.15 |
| Precedence | list |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| List-Unsubscribe | <http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe> |
| List-Archive | <http://mail.python.org/pipermail/python-list/> |
| List-Post | <mailto:python-list@python.org> |
| List-Help | <mailto:python-list-request@python.org?subject=help> |
| List-Subscribe | <http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.4393.1373306845.3114.python-list@python.org> (permalink) |
| Lines | 23 |
| NNTP-Posting-Host | 2001:888:2000:d::a6 |
| X-Trace | 1373306845 news.xs4all.nl 15957 [2001:888:2000:d::a6]:44223 |
| X-Complaints-To | abuse@xs4all.nl |
| Xref | csiph.com comp.lang.python:50167 |
Show key headers only | View raw
On Tue, Jul 9, 2013 at 3:53 AM, <ferdy.blatsco@gmail.com> wrote: >>> All characters are UTF-8, characters. "a" is a UTF-8 character. So is "ă". > Not using python 3, for me (a programmer which was present at the beginning of > computer science, badly interacting with many languages from assembler to > Fortran and from c to Pascal and so on) it was an hard job to arrange the > abrupt transition from characters only equal to bytes to some special > characters defined with 2, 3 bytes and even more. Even back then, bytes and characters were different. 'A' is a character, 0x41 is a byte. And they correspond 1:1 if and only if you know that your characters are represented in ASCII. Other encodings (eg EBCDIC) mapped things differently. The only difference now is that more people are becoming aware that there are more than 256 characters in the world. Like Magic 2014 and its treatment of Slivers, at some point you're going to have to master the difference between bytes and characters, or else be eternally hacking around stuff in your code, so now is as good a time as any. ChrisA
Back to comp.lang.python | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
hex dump w/ or w/out utf-8 chars blatt <ferdy.blatsco@gmail.com> - 2013-07-07 17:22 -0700
Re: hex dump w/ or w/out utf-8 chars Chris Angelico <rosuav@gmail.com> - 2013-07-08 11:17 +1000
Re: hex dump w/ or w/out utf-8 chars Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-08 05:48 +0000
Re: hex dump w/ or w/out utf-8 chars ferdy.blatsco@gmail.com - 2013-07-08 10:31 -0700
Re: hex dump w/ or w/out utf-8 chars Chris Angelico <rosuav@gmail.com> - 2013-07-09 03:52 +1000
Re: hex dump w/ or w/out utf-8 chars wxjmfauth@gmail.com - 2013-07-11 06:18 -0700
Re: hex dump w/ or w/out utf-8 chars Chris Angelico <rosuav@gmail.com> - 2013-07-11 23:32 +1000
Re: hex dump w/ or w/out utf-8 chars wxjmfauth@gmail.com - 2013-07-11 11:42 -0700
Re: hex dump w/ or w/out utf-8 chars wxjmfauth@gmail.com - 2013-07-11 11:44 -0700
Re: hex dump w/ or w/out utf-8 chars Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-12 03:18 +0000
Re: hex dump w/ or w/out utf-8 chars wxjmfauth@gmail.com - 2013-07-12 14:42 -0700
Re: hex dump w/ or w/out utf-8 chars Chris Angelico <rosuav@gmail.com> - 2013-07-12 12:16 +1000
Re: hex dump w/ or w/out utf-8 chars wxjmfauth@gmail.com - 2013-07-13 00:56 -0700
Re: hex dump w/ or w/out utf-8 chars Lele Gaifax <lele@metapensiero.it> - 2013-07-13 10:24 +0200
Re: hex dump w/ or w/out utf-8 chars Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-13 09:36 +0000
Re: hex dump w/ or w/out utf-8 chars Chris Angelico <rosuav@gmail.com> - 2013-07-13 19:46 +1000
Re: hex dump w/ or w/out utf-8 chars Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-13 09:49 +0000
Re: hex dump w/ or w/out utf-8 chars Chris Angelico <rosuav@gmail.com> - 2013-07-13 20:09 +1000
Re: hex dump w/ or w/out utf-8 chars wxjmfauth@gmail.com - 2013-07-13 07:37 -0700
Re: hex dump w/ or w/out utf-8 chars Dave Angel <davea@davea.name> - 2013-07-13 15:02 -0400
Re: hex dump w/ or w/out utf-8 chars wxjmfauth@gmail.com - 2013-07-14 01:20 -0700
Re: hex dump w/ or w/out utf-8 chars Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-14 10:44 +0000
Re: hex dump w/ or w/out utf-8 chars wxjmfauth@gmail.com - 2013-07-14 06:44 -0700
Re: hex dump w/ or w/out utf-8 chars wxjmfauth@gmail.com - 2013-07-24 06:28 -0700
Re: hex dump w/ or w/out utf-8 chars Neil Hodgson <nhodgson@iinet.net.au> - 2013-07-14 09:17 +1000
Re: hex dump w/ or w/out utf-8 chars ferdy.blatsco@gmail.com - 2013-07-08 10:53 -0700
Re: hex dump w/ or w/out utf-8 chars Chris Angelico <rosuav@gmail.com> - 2013-07-09 04:07 +1000
Re: hex dump w/ or w/out utf-8 chars Dave Angel <davea@davea.name> - 2013-07-08 16:56 -0400
Re: hex dump w/ or w/out utf-8 chars Neil Cerutti <neilc@norwich.edu> - 2013-07-09 12:22 +0000
Re: hex dump w/ or w/out utf-8 chars Dave Angel <davea@davea.name> - 2013-07-09 08:54 -0400
Re: hex dump w/ or w/out utf-8 chars Neil Cerutti <neilc@norwich.edu> - 2013-07-09 13:00 +0000
Re: hex dump w/ or w/out utf-8 chars Skip Montanaro <skip@pobox.com> - 2013-07-09 08:18 -0500
Re: hex dump w/ or w/out utf-8 chars Dave Angel <davea@davea.name> - 2013-07-09 09:23 -0400
Re: hex dump w/ or w/out utf-8 chars MRAB <python@mrabarnett.plus.com> - 2013-07-08 22:38 +0100
Re: hex dump w/ or w/out utf-8 chars Chris Angelico <rosuav@gmail.com> - 2013-07-09 07:49 +1000
Re: hex dump w/ or w/out utf-8 chars Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-09 06:53 +0000
Re: hex dump w/ or w/out utf-8 chars Joshua Landau <joshua.landau.ws@gmail.com> - 2013-07-08 23:02 +0100
Re: hex dump w/ or w/out utf-8 chars Dave Angel <davea@davea.name> - 2013-07-08 18:45 -0400
Re: hex dump w/ or w/out utf-8 chars Chris Angelico <rosuav@gmail.com> - 2013-07-09 08:51 +1000
Re: hex dump w/ or w/out utf-8 chars MRAB <python@mrabarnett.plus.com> - 2013-07-09 00:32 +0100
Re: hex dump w/ or w/out utf-8 chars Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-09 06:46 +0000
Re: hex dump w/ or w/out utf-8 chars Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-09 07:00 +0000
Re: hex dump w/ or w/out utf-8 chars wxjmfauth@gmail.com - 2013-07-09 02:34 -0700
Re: hex dump w/ or w/out utf-8 chars Chris “Kwpolska” Warrick <kwpolska@gmail.com> - 2013-07-09 12:15 +0200
Re: hex dump w/ or w/out utf-8 chars Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-09 16:32 +0000
Re: hex dump w/ or w/out utf-8 chars wxjmfauth@gmail.com - 2013-07-10 01:52 -0700
Re: hex dump w/ or w/out utf-8 chars Joshua Landau <joshua@landau.ws> - 2013-07-12 23:01 +0100
Re: hex dump w/ or w/out utf-8 chars Tim Roberts <timr@probo.com> - 2013-07-12 20:42 -0700
Re: hex dump w/ or w/out utf-8 chars Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-13 04:51 +0000
csiph-web