Re: Encoding of Python 2 string literals

Return-Path <lac@openend.se>
X-Original-To python-list@python.org
Delivered-To python-list@mail.python.org
To Chris Angelico <rosuav@gmail.com>
cc "python-list@python.org" <python-list@python.org>, lac@openend.se
From Laura Creighton <lac@openend.se>
Subject Re: Encoding of Python 2 string literals
In-Reply-To Message from Chris Angelico <rosuav@gmail.com> of "Wed, 22 Jul 2015 22:39:56 +1000." <CAPTjJmrZTXjz9FaBRgxR+AKcLK8DtdcrwnWZ=PnS-+tpyT2WPA@mail.gmail.com>
References <CAPkN8xK674+ruL=2gU9xHsuDAY0H3D_CBux8mY78ZYzo55gdHw@mail.gmail.com><CAPTjJmrZTXjz9FaBRgxR+AKcLK8DtdcrwnWZ=PnS-+tpyT2WPA@mail.gmail.com>
MIME-Version 1.0
Content-Type text/plain; charset="UTF-8"
Content-ID <24263.1437574369.1@fido>
Content-Transfer-Encoding quoted-printable
Date Wed, 22 Jul 2015 16:12:49 +0200
Newsgroups comp.lang.python
Message-ID <mailman.866.1437574380.3674.python-list@python.org> (permalink)



In a message of Wed, 22 Jul 2015 22:39:56 +1000, Chris Angelico writes:
>On Wed, Jul 22, 2015 at 8:17 PM, anatoly techtonik <techtonik@gmail.com> wrote:
>> Is there a way to know encoding of string (bytes) literal
>> defined in source file? For example, given that source:
>>
>>     # -*- coding: utf-8 -*-
>>     from library import Entry
>>     Entry("текст")
>>
>> Is there any way for Entry() constructor to know that
>> string "текст" passed into it is the utf-8 string?
>
>I don't think so. However, if you declare that to be a Unicode string,
>the parser will decode it using the declared encoding, and it'll be a
>five-character string. At that point, it doesn't matter what your
>source encoding was, because the characters entered will match the
>characters seen.
>
>Entry(u"текст")
>
>ChrisA
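
To make Chris's point concrete, here is a minimal sketch. It is written for Python 3 and only models what a Python 2 byte-string literal would hold in a UTF-8 source file; describe() is a hypothetical helper of mine, not part of any library.

```python
# -*- coding: utf-8 -*-
text = u"текст"              # Unicode literal: five characters
raw = text.encode("utf-8")   # the bytes a Python 2 "текст" literal would hold


def describe(value):
    """Hypothetical helper: report whether a value is bytes or text, and its length."""
    kind = "bytes" if isinstance(value, bytes) else "text"
    return kind, len(value)


print(describe(text))  # ('text', 5)
print(describe(raw))   # ('bytes', 10)

# The bytes carry no record of their encoding, so a wrong guess
# decodes without raising an error and silently yields different text.
right = raw.decode("utf-8")
wrong = raw.decode("latin-1")
print(right == text, wrong == text)  # True False
```

This is why Entry() cannot tell how a byte string was encoded: by the time it receives the value, only the bytes are left.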

Since you are porting to 3.x, anatoly, this will be of interest to you:
https://www.python.org/dev/peps/pep-0414/

Having stuck all the u"" prefixes into your codebase, you won't have to
rip them all out again as long as you target Python 3.3 or above, where
PEP 414 made the u"" prefix legal again.
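
A quick sketch of what that means in practice, assuming the file is run under Python 3.3 or later (the variable names are just for illustration):

```python
# -*- coding: utf-8 -*-
import sys

# PEP 414 applies from Python 3.3 onwards; Python 3.0 to 3.2 reject the u prefix.
assert sys.version_info >= (3, 3)

prefixed = u"текст"   # accepted on 3.3+, the prefix has no effect
plain = "текст"       # ordinary Python 3 string literal

print(prefixed == plain)              # True
print(type(prefixed) is type(plain))  # True
print(len(prefixed))                  # 5
```

So the same u"текст" literal can sit in code shared between Python 2 and Python 3.3+ and be a five-character text string on both.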

Laura
