Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #5177

Re: unicode by default

Path csiph.com!x330-a1.tempe.blueboxinc.net!usenet.pasdenom.info!aioe.org!feeder.news-service.com!newsfeed.xs4all.nl!newsfeed5.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail
Return-Path <lis01361@izone.net.au>
X-Original-To python-list@python.org
Delivered-To python-list@mail.python.org
X-Spam-Status OK 0.011
X-Spam-Evidence '*H*': 0.98; '*S*': 0.00; 'escape': 0.04; 'bytes.': 0.07; 'encoding.': 0.09; 'output': 0.12; 'am,': 0.14; 'wrote:': 0.14; 'contrary.': 0.16; 'received:203.24': 0.16; 'subject:unicode': 0.16; 'input': 0.18; 'bytes': 0.19; '(or': 0.22; 'code': 0.22; 'header:In-Reply-To:1': 0.22; 'e.g.': 0.22; 'thu,': 0.22; 'sequences.': 0.23; 'byte': 0.25; 'extract': 0.25; 'specify': 0.25; 'assume': 0.25; 'unicode': 0.29; 'character.': 0.31; 'to:addr:python-list': 0.32; 'using': 0.34; 'header:User- Agent:1': 0.35; 'quite': 0.36; 'sequence': 0.38; 'files': 0.38; 'unless': 0.38; 'to:addr:python.org': 0.39; '2011': 0.62; 'care': 0.67; 'reply-to:no real name:2**0': 0.72; 'header:Reply-To:1': 0.72; 'consumer': 0.80; 'you).': 0.91
In-Reply-To <vpEyp.981$dL5.736@newsfe08.iad>
References <OkDyp.2983$M61.450@newsfe07.iad> <mailman.1433.1305151801.9059.python-list@python.org> <vpEyp.981$dL5.736@newsfe08.iad>
Date Thu, 12 May 2011 09:32:06 +1000
Subject Re: unicode by default
From "John Machin" <sjmachin@lexicon.net>
To python-list@python.org
User-Agent SquirrelMail/1.4.21
MIME-Version 1.0
Content-Type text/plain;charset=iso-8859-1
Content-Transfer-Encoding 8bit
X-Priority 3 (Normal)
Importance Normal
X-BeenThere python-list@python.org
X-Mailman-Version 2.1.12
Precedence list
Reply-To sjmachin@lexicon.net
List-Id General discussion list for the Python programming language <python-list.python.org>
List-Unsubscribe <http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive <http://mail.python.org/pipermail/python-list>
List-Post <mailto:python-list@python.org>
List-Help <mailto:python-list-request@python.org?subject=help>
List-Subscribe <http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe>
Newsgroups comp.lang.python
Message-ID <mailman.1435.1305157329.9059.python-list@python.org> (permalink)
Lines 19
NNTP-Posting-Host 82.94.164.166
X-Trace 1305157330 news.xs4all.nl 41102 [::ffff:82.94.164.166]:34099
X-Complaints-To abuse@xs4all.nl
Xref x330-a1.tempe.blueboxinc.net comp.lang.python:5177

Show key headers only | View raw


On Thu, May 12, 2011 8:51 am, harrismh777 wrote:
> Is it true that if I am
> working without using bytes sequences that I will not need to care about
> the encoding anyway, unless of course I need to specify a unicode code
> point?

Quite the contrary.

(1) You cannot work without using bytes sequences. Files are byte
sequences. Web communication is in bytes. You need to (know / assume / be
able to extract / guess) the input encoding. You need to encode your
output using an encoding that is expected by the consumer (or use an
output method that will do it for you).

(2) You don't need to use bytes to specify a Unicode code point. Just use
an escape sequence e.g. "\u0404" is a Cyrillic character.


Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

unicode by default harrismh777 <harrismh777@charter.net> - 2011-05-11 16:37 -0500
  Re: unicode by default Ian Kelly <ian.g.kelly@gmail.com> - 2011-05-11 16:09 -0600
    Re: unicode by default harrismh777 <harrismh777@charter.net> - 2011-05-11 17:51 -0500
      Re: unicode by default "John Machin" <sjmachin@lexicon.net> - 2011-05-12 09:32 +1000
        Re: unicode by default harrismh777 <harrismh777@charter.net> - 2011-05-11 20:22 -0500
          Re: unicode by default MRAB <python@mrabarnett.plus.com> - 2011-05-12 03:31 +0100
            Re: unicode by default Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-05-12 03:16 +0000
              Re: unicode by default harrismh777 <harrismh777@charter.net> - 2011-05-11 22:44 -0500
                Re: unicode by default Terry Reedy <tjreedy@udel.edu> - 2011-05-12 00:12 -0400
                Re: unicode by default harrismh777 <harrismh777@charter.net> - 2011-05-12 01:43 -0500
                Re: unicode by default "John Machin" <sjmachin@lexicon.net> - 2011-05-12 14:14 +1000
                Re: unicode by default Benjamin Kaplan <benjamin.kaplan@case.edu> - 2011-05-11 21:14 -0700
                Re: unicode by default "John Machin" <sjmachin@lexicon.net> - 2011-05-12 14:41 +1000
                Re: unicode by default harrismh777 <harrismh777@charter.net> - 2011-05-12 01:14 -0500
                Re: unicode by default TheSaint <nobody@nowhere.net.no> - 2011-05-12 20:40 +0800
            Re: unicode by default Ben Finney <ben+python@benfinney.id.au> - 2011-05-12 14:07 +1000
              Re: unicode by default harrismh777 <harrismh777@charter.net> - 2011-05-12 01:31 -0500
                Re: unicode by default "John Machin" <sjmachin@lexicon.net> - 2011-05-12 17:58 +1000
                Re: unicode by default Ian Kelly <ian.g.kelly@gmail.com> - 2011-05-12 10:17 -0600
                Re: unicode by default jmfauth <wxjmfauth@gmail.com> - 2011-05-12 23:28 -0700
                Re: unicode by default harrismh777 <harrismh777@charter.net> - 2011-05-13 14:53 -0500
                Re: unicode by default Robert Kern <robert.kern@gmail.com> - 2011-05-13 15:18 -0500
                Re: unicode by default Terry Reedy <tjreedy@udel.edu> - 2011-05-13 21:41 -0400
                Re: unicode by default harrismh777 <harrismh777@charter.net> - 2011-05-14 02:41 -0500
                Re: unicode by default jmfauth <wxjmfauth@gmail.com> - 2011-05-14 03:26 -0700
                Re: unicode by default Terry Reedy <tjreedy@udel.edu> - 2011-05-14 16:26 -0400
                Re: unicode by default Ben Finney <ben+python@benfinney.id.au> - 2011-05-15 09:47 +1000
                Re: unicode by default Nobody <nobody@nowhere.com> - 2011-05-14 09:34 +0100
                Re: unicode by default Terry Reedy <tjreedy@udel.edu> - 2011-05-12 16:42 -0400
                Re: unicode by default Ian Kelly <ian.g.kelly@gmail.com> - 2011-05-12 16:25 -0600
          Re: unicode by default "John Machin" <sjmachin@lexicon.net> - 2011-05-12 13:54 +1000
  Re: unicode by default Benjamin Kaplan <benjamin.kaplan@case.edu> - 2011-05-11 15:34 -0700

csiph-web