Re: "More About Unicode in Python 2 and 3"

Date Thu, 9 Jan 2014 10:45:37 +1100
Subject Re: "More About Unicode in Python 2 and 3"
From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.5208.1389224741.18130.python-list@python.org> (permalink)



On Thu, Jan 9, 2014 at 10:34 AM,  <rdsteph@mac.com> wrote:
> I just meant to say that internet programming using ASCII urls is so common and important that it hurts that Python 3 makes it so much harder. It sure would be great if Python 3 could be improved to allow such programming to be done using ASCII urls without requiring all the unicode overhead.
>
> Armin is right. Calling his post a rant doesn't help.

There's one big problem with that theory. We've been looking, on this
list and on python-ideas, at some practical suggestions for adding
something to Py3 that will help. So far, lots of people have suggested
things, and the complainers haven't attempted to explain what they
actually need. Hard facts and examples would help enormously.

Incidentally, before referring to "all the Unicode overhead", it would
help to actually measure the overhead of encoding and decoding.

Python 2.7:
>>> import timeit
>>> timeit.timeit("a.encode().decode()","a=u'a'*1000",number=500000)
8.787162614242874

Python 3.4:
>>> import timeit
>>> timeit.timeit("a.encode().decode()","a=u'a'*1000",number=500000)
1.7354552045022515
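
Dividing the totals above by the 500000 iterations gives roughly 17.6 microseconds per round trip on 2.7 and about 3.5 microseconds on 3.4. To get a per-call figure on your own interpreter, something like the following sketch might do. The function name roundtrip_cost_us and its default arguments are made up for illustration, and the codec is spelled out as UTF-8 here, whereas the sessions above rely on each version's default.

```python
import timeit


def roundtrip_cost_us(length=1000, number=20000):
    """Return the mean cost, in microseconds, of one encode/decode round trip.

    Hypothetical helper for illustration; 'length' is the number of ASCII
    characters in the test string and 'number' is the iteration count.
    """
    total_seconds = timeit.timeit(
        "a.encode('utf-8').decode('utf-8')",
        setup="a = u'a' * %d" % length,
        number=number,
    )
    # timeit returns the total for all iterations, so scale to one call.
    return total_seconds / number * 1e6


if __name__ == "__main__":
    print("%.3f microseconds per round trip" % roundtrip_cost_us())
```

The absolute numbers will vary by machine and build, so only the ratio between interpreters on the same hardware is really comparable.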

Since 3.3, the cost of UTF-8 encoding/decoding an all-ASCII string is
extremely low. So the real cost isn't in run-time performance but in
code complexity. Would it be easier to work with ASCII URLs with a
one-letter-name helper function? I never got an answer to that
question.
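
For concreteness, here is a rough sketch of what such a helper might look like in Python 3. Nothing like this was actually specified in the thread; the name a and its pass-through behaviour for str are assumptions made purely for illustration.

```python
from urllib.parse import urlsplit


def a(data):
    """Hypothetical one-letter helper: ASCII bytes in, str out.

    str input is returned unchanged, so the helper can be applied
    without first checking which type arrived.
    """
    if isinstance(data, (bytes, bytearray)):
        # Strict ASCII: non-ASCII bytes raise UnicodeDecodeError
        # instead of being silently mangled.
        return data.decode("ascii")
    return data


# Example: a URL that arrived as bytes (say, off a socket).
raw_url = b"http://example.com/index.html?q=1"
parts = urlsplit(a(raw_url))
print(parts.netloc)
print(parts.path)
```

Going the other way is a plain text.encode("ascii") at the point where bytes are written back out. Whether one extra call per boundary is acceptable is exactly the question that needs an answer.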

ChrisA



Thread

"More About Unicode in Python 2 and 3" Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-01-05 13:14 +0000
  Re: "More About Unicode in Python 2 and 3" rdsteph@mac.com - 2014-01-08 15:34 -0800
    Re: "More About Unicode in Python 2 and 3" Chris Angelico <rosuav@gmail.com> - 2014-01-09 10:45 +1100
    Re: "More About Unicode in Python 2 and 3" Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-01-08 23:53 +0000
