
Re: hex dump w/ or w/out utf-8 chars

Date Tue, 9 Jul 2013 03:52:17 +1000
Subject Re: hex dump w/ or w/out utf-8 chars
From Chris Angelico <rosuav@gmail.com>
To python-list@python.org
Newsgroups comp.lang.python



On Tue, Jul 9, 2013 at 3:31 AM,  <ferdy.blatsco@gmail.com> wrote:
> Unfortunately (as I have probably told you before) I will never move to
> Python 3...  Guido should not listen only to gurus like himself...
> I don't like Python as I did before... starting from OOP and ending with
> codecs like utf-8. Regarding OOP, much appreciated especially by experts, he
> could use Python 2 to hide the complexities of OOP (improving, as a side
> effect, the hiding of object code) by moving classes and objects into
> imported methods, leaving the programming style as the
> well-known old style: sequential programming and functions.
> About utf-8... the same solution: keep utf-8 but, for non-experts, add
> methods to convert to encodings that use only the range 128-255 of a single
> byte (I do not give a damn about Chinese and "similia"!...)
> I know that this is a lost battle (in Italian "una battaglia persa")!

Well, there won't be a Python 2.8, so you really should consider
moving at some point. Python 3.3 is already way better than 2.7 in
many ways, 3.4 will improve on 3.3, and the future is pretty clear.
But nobody's forcing you, and 2.7.x will continue to get
bugfix/security releases for a while. (Personally, I'd be happy if
everyone moved off the 2.3/2.4 releases. It's not too hard supporting
2.6+ or 2.7+.)

The thing is, you're thinking about UTF-8, but you should be thinking
about Unicode. I recommend you read these articles:

http://www.joelonsoftware.com/articles/Unicode.html
http://unspecified.wordpress.com/2012/04/19/the-importance-of-language-level-abstract-unicode-strings/
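
To make the Unicode-versus-UTF-8 distinction concrete, here is a minimal sketch (not part of the original post), assuming Python 3. The sample string and the `hexdump` helper are made up for illustration. The text is a sequence of characters; UTF-8 and Latin-1 are just two ways of turning that text into bytes.

```python
# Unicode text: 10 characters, two of them outside ASCII.
text = "na\u00efve caf\u00e9"

# Two different byte encodings of the same text.
utf8 = text.encode("utf-8")       # non-ASCII characters take 2 bytes each here
latin1 = text.encode("latin-1")   # every character here fits in 1 byte


def hexdump(data):
    """Hypothetical helper: space-separated hex for a bytes object."""
    return " ".join("%02x" % b for b in data)


print(len(text), len(utf8), len(latin1))
print(hexdump(utf8))
print(hexdump(latin1))

# Decoding with the matching codec gives back the same text either way.
same_text = utf8.decode("utf-8") == latin1.decode("latin-1") == text
```

The byte counts differ between the two encodings, but the text itself, and its length in characters, does not.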

So long as you are thinking about different groups of characters as
different, and wanting a solution that maps characters down into the
<256 range, you will never be able to cleanly internationalize. With
Python 3.3+, you can ignore the differences between ASCII, BMP, and
SMP characters; they're all just "characters". Everything works
perfectly with Unicode.
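
As a rough illustration (a sketch added for this point, assuming Python 3.3 or later; the sample characters and the `describe` helper are my own choices), one character from each range is a string of length 1, whatever it costs in bytes once encoded. The last part shows what goes wrong with a one-byte-per-character codec.

```python
# One character from each range mentioned above.
samples = {
    "ASCII": "A",            # U+0041
    "BMP": "\u20ac",         # U+20AC EURO SIGN
    "SMP": "\U0001F600",     # U+1F600, outside the BMP
}


def describe(ch):
    """Return (code point, length in characters, length in UTF-8 bytes)."""
    return ord(ch), len(ch), len(ch.encode("utf-8"))


results = {name: describe(ch) for name, ch in samples.items()}
for name, (cp, nchars, nbytes) in results.items():
    print("%-5s U+%04X chars=%d utf8_bytes=%d" % (name, cp, nchars, nbytes))

# A single-byte codec such as Latin-1 cannot represent the last two.
try:
    samples["BMP"].encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
```

Only code points are printed, not the characters themselves, so the output does not depend on what the terminal can display.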

ChrisA



