Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #71451

Re: Everything you did not want to know about Unicode in Python 3

Path csiph.com!usenet.pasdenom.info!dedibox.gegeweb.org!gegeweb.eu!nntpfeed.proxad.net!proxad.net!feeder1-2.proxad.net!usenet-fr.net!nerim.net!novso.com!newsfeed.xs4all.nl!newsfeed3.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
Return-Path <rosuav@gmail.com>
X-Original-To python-list@python.org
Delivered-To python-list@mail.python.org
X-Spam-Status OK 0.013
X-Spam-Evidence '*H*': 0.97; '*S*': 0.00; 'subject:not': 0.03; 'encoding': 0.05; 'subject:Python': 0.06; 'binary': 0.07; 'utf-8': 0.07; 'ambiguity': 0.09; 'strings.': 0.09; 'cc:addr:python-list': 0.11; 'python': 0.11; 'from:addr:rosuav': 0.16; 'from:name:chris angelico': 0.16; 'hell.': 0.16; 'subject:Unicode': 0.16; 'text?': 0.16; 'files.': 0.16; 'wrote:': 0.18; 'bit': 0.19; 'app': 0.19; 'meant': 0.20; 'saying': 0.22; 'cc:addr:python.org': 0.22; 'bytes': 0.24; 'text.': 0.24; 'unicode': 0.24; 'cc:2**0': 0.24; 'header:In-Reply-To:1': 0.27; 'idea': 0.28; 'xml': 0.29; "doesn't": 0.30; 'message-id:@mail.gmail.com': 0.30; "i'm": 0.30; 'work.': 0.31; 'went': 0.31; '13,': 0.31; 'convenience': 0.31; 'safely': 0.31; 'allows': 0.31; 'file': 0.32; 'stuff': 0.32; 'text': 0.33; 'standards': 0.33; 'trouble': 0.34; "can't": 0.35; 'something': 0.35; 'computing': 0.35; 'but': 0.35; 'received:google.com': 0.35; 'there': 0.35; 'should': 0.36; 'example,': 0.37; 'problems': 0.38; 'pm,': 0.38; 'does': 0.39; 'sure': 0.39; 'then,': 0.60; 'full': 0.61; 'back': 0.62; 'production': 0.68; 'obvious': 0.74; '(network': 0.84; 'freaky': 0.84; 'subject:know': 0.84; 'subject:you': 0.87; 'dealt': 0.91; 'subject:want': 0.91; 'suspicious': 0.91; 'thing,': 0.91; 'to:none': 0.92; 'tough': 0.93
DKIM-Signature v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:cc :content-type; bh=CW8SufNxlonlx2ohukHJoJ7LXpFgTKtTvZK0j9/c7Qo=; b=HI/fIr4aVHv1tNAmEO8gjj1S1Tcj/pLIbBMlV3T+Fi4dqcaOoW01Gcxvx/KE5LwbkN inNrvtDJvhwi3rfCy0qE/t2QYY8iyEF/s+lkKYVCiLDh2xWqVxm4vi35r8kp+WDW37TC P9NAUYZhawCOi+Aqqxo9XOc3SWjQIvgJnCTy3o29tP/zhBCYyefa2bqAyEwX4y56hwjr NYv3REeD6ZaPzuR/Wclx6AOxFyg/FqPRrudpCvZRksmL0tUmUGeyaeWQ+29exD9hgUB8 4eqsm/FZGgaDWcyr3GWd1kMcjnyRK5hR0TAtNJoUv2MoXe1jCykQmgPWXDasSEH7xmJv JtuA==
MIME-Version 1.0
X-Received by 10.52.11.230 with SMTP id t6mr883258vdb.27.1399970283204; Tue, 13 May 2014 01:38:03 -0700 (PDT)
In-Reply-To <87tx8uccgd.fsf@elektro.pacujo.net>
References <mailman.9915.1399907977.18130.python-list@python.org> <8P7cv.78617$Sp6.8377@fx15.am4> <537172eb$0$29980$c3e8da3$5496439d@news.astraweb.com> <lksju6$amr$1@news.albasani.net> <87tx8uccgd.fsf@elektro.pacujo.net>
Date Tue, 13 May 2014 18:38:03 +1000
Subject Re: Everything you did not want to know about Unicode in Python 3
From Chris Angelico <rosuav@gmail.com>
Cc "python-list@python.org" <python-list@python.org>
Content-Type text/plain; charset=UTF-8
X-BeenThere python-list@python.org
X-Mailman-Version 2.1.15
Precedence list
List-Id General discussion list for the Python programming language <python-list.python.org>
List-Unsubscribe <https://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive <http://mail.python.org/pipermail/python-list/>
List-Post <mailto:python-list@python.org>
List-Help <mailto:python-list-request@python.org?subject=help>
List-Subscribe <https://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe>
Newsgroups comp.lang.python
Message-ID <mailman.9951.1399970292.18130.python-list@python.org> (permalink)
Lines 31
NNTP-Posting-Host 2001:888:2000:d::a6
X-Trace 1399970292 news.xs4all.nl 2839 [2001:888:2000:d::a6]:51919
X-Complaints-To abuse@xs4all.nl
Xref csiph.com comp.lang.python:71451

Show key headers only | View raw


On Tue, May 13, 2014 at 6:25 PM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Johannes Bauer <dfnsonfsduifb@gmx.de>:
>
>> Having dealt with the UTF-8 problems on Python2 I can safely say that
>> I never, never ever want to go back to that freaky hell. If I deal
>> with strings, I want to be able to sanely manipulate them and I want
>> to be sure that after manipulation they're still valid strings.
>> Manipulating the bytes representation of unicode data just doesn't
>> work.
>
> Based on my background (network and system programming), I'm a bit
> suspicious of strings, that is, text. For example, is the stuff that
> goes to syslog bytes or text? Does an XML file contain bytes or
> (encoded) text? The answers are not obvious to me. Modern computing is
> full of ASCII-esque binary communication standards and formats.

These are problems that Unicode can't solve. In theory, XML should
contain text in a known encoding (defaulting to UTF-8). With syslog,
it's problematic - I don't remember what it's meant to be, but I know
there are issues. Same with other log files.

> Python 2's ambiguity allows me not to answer the tough philosophical
> questions. I'm not saying it's necessarily a good thing, but it has its
> benefits.

It's not a good thing. It means that you have the convenience of
pretending there's no problem, which means you don't notice trouble
until something happens... and then, in all probability, your app is
in production and you have no idea why stuff went wrong.

ChrisA

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Everything you did not want to know about Unicode in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-05-12 16:19 +0100
  Re: Everything you did not want to know about Unicode in Python 3 alister <alister.nospam.ware@ntlworld.com> - 2014-05-12 17:47 +0000
    Re: Everything you did not want to know about Unicode in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2014-05-12 12:31 -0600
    Re: Everything you did not want to know about Unicode in Python 3 MRAB <python@mrabarnett.plus.com> - 2014-05-12 20:42 +0100
    Re: Everything you did not want to know about Unicode in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2014-05-12 16:16 -0600
    Re: Everything you did not want to know about Unicode in Python 3 Chris Angelico <rosuav@gmail.com> - 2014-05-13 09:42 +1000
    Re: Everything you did not want to know about Unicode in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-13 01:18 +0000
      Re: Everything you did not want to know about Unicode in Python 3 Chris Angelico <rosuav@gmail.com> - 2014-05-13 11:39 +1000
        Re: Everything you did not want to know about Unicode in Python 3 alex23 <wuwei23@gmail.com> - 2014-05-13 16:25 +1000
          Re: Everything you did not want to know about Unicode in Python 3 Chris Angelico <rosuav@gmail.com> - 2014-05-13 16:32 +1000
      Re: Everything you did not want to know about Unicode in Python 3 Mark H Harris <harrismh777@gmail.com> - 2014-05-12 20:58 -0500
      Re: Everything you did not want to know about Unicode in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-05-13 03:33 +0100
      Re: Everything you did not want to know about Unicode in Python 3 Rustom Mody <rustompmody@gmail.com> - 2014-05-12 22:10 -0700
        Re: Everything you did not want to know about Unicode in Python 3 Mark H Harris <harrismh777@gmail.com> - 2014-05-13 00:39 -0500
          Re: Everything you did not want to know about Unicode in Python 3 Gene Heskett <gheskett@wdtv.com> - 2014-05-13 01:45 -0400
          Re: Everything you did not want to know about Unicode in Python 3 Ben Finney <ben@benfinney.id.au> - 2014-05-13 16:03 +1000
          Re: Everything you did not want to know about Unicode in Python 3 Rustom Mody <rustompmody@gmail.com> - 2014-05-12 23:09 -0700
          Re: Everything you did not want to know about Unicode in Python 3 Chris Angelico <rosuav@gmail.com> - 2014-05-13 16:18 +1000
            Re: Everything you did not want to know about Unicode in Python 3 Mark H Harris <harrismh777@gmail.com> - 2014-05-13 01:32 -0500
            Re: Everything you did not want to know about Unicode in Python 3 Mark H Harris <harrismh777@gmail.com> - 2014-05-13 01:32 -0500
            Re: Everything you did not want to know about Unicode in Python 3 Roy Smith <roy@panix.com> - 2014-05-13 07:20 -0400
              Re: Everything you did not want to know about Unicode in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-13 13:39 +0000
                Re: Everything you did not want to know about Unicode in Python 3 Chris Angelico <rosuav@gmail.com> - 2014-05-13 23:43 +1000
                Re: Everything you did not want to know about Unicode in Python 3 Rustom Mody <rustompmody@gmail.com> - 2014-05-13 07:30 -0700
                Re: Everything you did not want to know about Unicode in Python 3 Chris Angelico <rosuav@gmail.com> - 2014-05-14 00:36 +1000
                Re: Everything you did not want to know about Unicode in Python 3 Grant Edwards <invalid@invalid.invalid> - 2014-05-13 13:51 +0000
                Re: Everything you did not want to know about Unicode in Python 3 alister <alister.nospam.ware@ntlworld.com> - 2014-05-13 14:42 +0000
                Re: Everything you did not want to know about Unicode in Python 3 Grant Edwards <invalid@invalid.invalid> - 2014-05-13 15:21 +0000
                Re: Everything you did not want to know about Unicode in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-13 23:53 +0000
                Re: Everything you did not want to know about Unicode in Python 3 Chris Angelico <rosuav@gmail.com> - 2014-05-14 10:08 +1000
                Re: Everything you did not want to know about Unicode in Python 3 alister <alister.nospam.ware@ntlworld.com> - 2014-05-14 12:42 +0000
                Re: Everything you did not want to know about Unicode in Python 3 Chris Angelico <rosuav@gmail.com> - 2014-05-14 22:52 +1000
                Re: Everything you did not want to know about Unicode in Python 3 Grant Edwards <invalid@invalid.invalid> - 2014-05-16 14:46 +0000
                Re: Everything you did not want to know about Unicode in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-17 01:07 +0000
                Re: Everything you did not want to know about Unicode in Python 3 Marko Rauhamaa <marko@pacujo.net> - 2014-05-17 07:19 +0300
                Re: Everything you did not want to know about Unicode in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-05-17 09:35 +0100
                Re: Everything you did not want to know about Unicode in Python 3 Robert Kern <robert.kern@gmail.com> - 2014-05-17 10:29 +0100
                Re: Everything you did not want to know about Unicode in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-17 14:15 +0000
                Re: Everything you did not want to know about Unicode in Python 3 Robert Kern <robert.kern@gmail.com> - 2014-05-17 22:01 +0100
                Re: Everything you did not want to know about Unicode in Python 3 Robert Kern <robert.kern@gmail.com> - 2014-05-17 09:57 +0100
                Re: Everything you did not want to know about Unicode in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-17 12:07 +0000
                Re: Everything you did not want to know about Unicode in Python 3 Robert Kern <robert.kern@gmail.com> - 2014-05-17 22:07 +0100
                Re: Everything you did not want to know about Unicode in Python 3 Chris Angelico <rosuav@gmail.com> - 2014-05-17 19:18 +1000
                Re: Everything you did not want to know about Unicode in Python 3 Ben Finney <ben@benfinney.id.au> - 2014-05-17 21:05 +1000
                [OT] Copyright statements and why they can be useful (was: Everything you did not want to know about Unicode in Python 3) Ben Finney <ben@benfinney.id.au> - 2014-05-14 11:01 +1000
                Re: Everything you did not want to know about Unicode in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2014-05-14 09:07 -0600
                Re: Everything you did not want to know about Unicode in Python 3 Dave Angel <davea@davea.name> - 2014-05-13 21:56 -0400
            Re: Everything you did not want to know about Unicode in Python 3 Grant Edwards <invalid@invalid.invalid> - 2014-05-13 13:49 +0000
      Re: Everything you did not want to know about Unicode in Python 3 gregor <gregor@ediwo.com> - 2014-05-13 09:27 +0200
      Re: Everything you did not want to know about Unicode in Python 3 Johannes Bauer <dfnsonfsduifb@gmx.de> - 2014-05-13 10:08 +0200
        Re: Everything you did not want to know about Unicode in Python 3 Marko Rauhamaa <marko@pacujo.net> - 2014-05-13 11:25 +0300
          Re: Everything you did not want to know about Unicode in Python 3 Chris Angelico <rosuav@gmail.com> - 2014-05-13 18:38 +1000
            Re: Everything you did not want to know about Unicode in Python 3 Marko Rauhamaa <marko@pacujo.net> - 2014-05-13 12:06 +0300
              Re: Everything you did not want to know about Unicode in Python 3 Chris Angelico <rosuav@gmail.com> - 2014-05-13 19:29 +1000
              Re: Everything you did not want to know about Unicode in Python 3 Steven D'Aprano <steve@pearwood.info> - 2014-05-13 09:44 +0000
            Re: Everything you did not want to know about Unicode in Python 3 Johannes Bauer <dfnsonfsduifb@gmx.de> - 2014-05-13 11:38 +0200
          Re: Everything you did not want to know about Unicode in Python 3 Johannes Bauer <dfnsonfsduifb@gmx.de> - 2014-05-13 11:46 +0200
            Re: Everything you did not want to know about Unicode in Python 3 Marko Rauhamaa <marko@pacujo.net> - 2014-05-13 12:59 +0300
          Re: Everything you did not want to know about Unicode in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-05-13 14:30 +0100
          Re: Everything you did not want to know about Unicode in Python 3 Chris Angelico <rosuav@gmail.com> - 2014-05-13 23:37 +1000
          Re: Everything you did not want to know about Unicode in Python 3 Skip Montanaro <skip@pobox.com> - 2014-05-13 09:02 -0500
        Re: Everything you did not want to know about Unicode in Python 3 wxjmfauth@gmail.com - 2014-05-14 00:00 -0700
      Re: Everything you did not want to know about Unicode in Python 3 alister <alister.nospam.ware@ntlworld.com> - 2014-05-13 11:19 +0000
        Re: Everything you did not want to know about Unicode in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2014-05-13 10:08 -0600
          Re: Everything you did not want to know about Unicode in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-05-14 00:10 +0000
            Re: Everything you did not want to know about Unicode in Python 3 Ethan Furman <ethan@stoneleaf.us> - 2014-05-13 17:53 -0700
            Re: Everything you did not want to know about Unicode in Python 3 Terry Reedy <tjreedy@udel.edu> - 2014-05-14 17:47 -0400
            Re: Everything you did not want to know about Unicode in Python 3 Antoine Pitrou <antoine@python.org> - 2014-05-16 11:50 +0000
              Re: Everything you did not want to know about Unicode in Python 3 wxjmfauth@gmail.com - 2014-05-16 06:20 -0700
          Re: Everything you did not want to know about Unicode in Python 3 alister <alister.nospam.ware@ntlworld.com> - 2014-05-14 12:38 +0000
        Re: Everything you did not want to know about Unicode in Python 3 Robin Becker <robin@reportlab.com> - 2014-05-14 16:30 +0100
        Re: Everything you did not want to know about Unicode in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2014-05-14 09:56 -0600

csiph-web