Re: Encoding of Python 2 string literals

From	Laura Creighton <lac@openend.se>
Subject	Re: Encoding of Python 2 string literals
References	<CAPkN8xK674+ruL=2gU9xHsuDAY0H3D_CBux8mY78ZYzo55gdHw@mail.gmail.com> <CAPTjJmrZTXjz9FaBRgxR+AKcLK8DtdcrwnWZ=PnS-+tpyT2WPA@mail.gmail.com>
Date	2015-07-22 16:12 +0200
Newsgroups	comp.lang.python
Message-ID	<mailman.866.1437574380.3674.python-list@python.org>

In a message of Wed, 22 Jul 2015 22:39:56 +1000, Chris Angelico writes:
>On Wed, Jul 22, 2015 at 8:17 PM, anatoly techtonik <techtonik@gmail.com> wrote:
>> Is there a way to know encoding of string (bytes) literal
>> defined in source file? For example, given that source:
>>
>>     # -*- coding: utf-8 -*-
>>     from library import Entry
>>     Entry("текст")
>>
>> Is there any way for Entry() constructor to know that
>> string "текст" passed into it is the utf-8 string?
>
>I don't think so. However, if you declare that to be a Unicode string,
>the parser will decode it using the declared encoding, and it'll be a
>five-character string. At that point, it doesn't matter what your
>source encoding was, because the characters entered will match the
>characters seen.
>
>Entry(u"текст")
>
>ChrisA

Since you are porting to 3.x, anatoly, this will be of interest to you:
https://www.python.org/dev/peps/pep-0414/

Having stuck all the u"" prefixes into your codebase, you won't
immediately have to rip them all out again, as long as you use
Python 3.3 or above, where PEP 414 made the u prefix legal again.
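
A minimal sketch of both points, assuming Python 3.3 or later and a
source file saved as UTF-8. The variable names are made up for
illustration and do not come from anatoly's code. It shows that the
u"" literal is five characters, and that the bytes you get from it
depend on the codec and carry no record of which one was used:

```python
# -*- coding: utf-8 -*-
# Sketch only: the variable names are illustrative, not from the thread.

# The u prefix is accepted again since Python 3.3 (PEP 414).
text = u"текст"

# The parser decoded the literal using the declared source encoding,
# so this counts characters, not bytes.
char_count = len(text)

# Bytes carry no record of their encoding: the same text yields
# different byte sequences under different codecs.
utf8_bytes = text.encode("utf-8")
cp1251_bytes = text.encode("cp1251")

print(char_count, len(utf8_bytes), len(cp1251_bytes))
```

A function that is handed only utf8_bytes or cp1251_bytes has to be
told the encoding separately, which is why passing the Unicode string
is the simpler choice.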

Laura
