| Path | csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!goblin2!goblin.stu.neva.ru!newsfeed.xs4all.nl!newsfeed4.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail |
|---|---|
| MIME-Version | 1.0 |
| In-Reply-To | <CAKhY55P3x9-WS52D5i+E+rJ7y2osGHnqTZwB2TpBK4zUSe0ouw@mail.gmail.com> |
| References | <mailman.941.1355692240.29569.python-list@python.org> <50ceb674$0$29868$c3e8da3$5496439d@news.astraweb.com> <CAKhY55MLBeT-xLwqy59gusU3H2o_pceLKDxY-8XVifU_Ns2yrg@mail.gmail.com> <CAMuTYXis2vH5xjmHAgrESquPDQsAYWkzFnWGfDeyE9K5-Nwiww@mail.gmail.com> <CAKhY55OE+FdjR-EyXToE5TuWEMLwAi4n-NeuFhtKnOtZ=ey2DA@mail.gmail.com> <CAHzaPEM_sp=0aEtbxVPYYvea=_DuE36P9ZwtNGAVjnXCaykNaw@mail.gmail.com> <CAKhY55P3x9-WS52D5i+E+rJ7y2osGHnqTZwB2TpBK4zUSe0ouw@mail.gmail.com> |
| Date | Mon, 17 Dec 2012 11:55:10 +0100 |
| Subject | Re: Unicode |
| From | Vlastimil Brom <vlastimil.brom@gmail.com> |
| To | python-list@python.org |
| Content-Type | text/plain; charset=ISO-8859-1 |
| Content-Transfer-Encoding | quoted-printable |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.955.1355741714.29569.python-list@python.org> |
2012/12/17 Anatoli Hristov <tolidtm@gmail.com>:
>> If you only see encoding problems when printing results to your
>> terminal, its settings or Unicode capability might be the cause.
>> However, if you also get badly encoded items in the database, you are
>> likely using an inappropriate encoding in some step.
>
> I get badly encoded text in my DB
>
>> you seem to be doing something like the following (explicitly or
>> partly implicitly, based on your system defaults):
>>
>> >>> print u"étroits, en utilisant un portable extrêmement puissant".encode("utf-8").decode("windows-1252")
>> étroits, en utilisant un portable extrêmement puissant
>> >>>
>>
>> i.e. encoding a text using utf-8 and handling it as windows-1252
>> afterwards (or taking an already encoded text and decoding it with the
>> inappropriate ANSI encoding).
>
> Thank you Vlastimil,
>
> I tried to print it as you showed me, but I receive an error:
> >>> print u"étroits, en utilisant un portable extrêmement puissant".encode("utf-8").decode("windows-1252")
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> UnicodeEncodeError: 'latin-1' codec can't encode character u'\u0192'
> in position 1: ordinal not in range(256)
> >>>
Hi,
this seems to be an encoding error of your terminal on printing.
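
A minimal sketch of why the print step itself can fail, written for Python 3 (the session quoted above is Python 2). It assumes the terminal's output encoding is latin-1, as the traceback suggests, and it also assumes the literal had already been mis-decoded once when it was typed, which is one way U+0192 can end up at position 1:

```python
# Assumption: the text was mis-decoded once (UTF-8 bytes read as latin-1)
# before the encode/decode round trip from the quoted example was applied.
original = "\u00e9troits"                              # 'étroits'
once = original.encode("utf-8").decode("latin-1")      # first mis-decoding
twice = once.encode("utf-8").decode("windows-1252")    # round trip from the post

error = None
try:
    # Printing to a latin-1 terminal has to perform this encoding step.
    twice.encode("latin-1")
except UnicodeEncodeError as exc:
    error = exc
    print(exc)
```

The failure here comes from the output encoding, not from the database, so it says nothing yet about what is actually stored.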
You may need to describe (or better, post the respective parts of the
source) where the text is coming from (external text file, database
entry, hardcoded in the Python source ...), and how it is stored, retrieved
and possibly manipulated before you insert it into the database.
You may try to print a repr(...) of the string to be inserted into the
database to see whether it is already mangled in some previous
part of the processing.
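
A rough sketch of that check in Python 3 (the thread itself uses Python 2, where repr() of a unicode string shows the escapes directly). The helper name looks_like_utf8_mojibake is hypothetical and windows-1252 as the wrong codec is an assumption taken from the earlier example; ascii() is used so the output does not depend on the terminal:

```python
def looks_like_utf8_mojibake(text, wrong_codec="windows-1252"):
    """Return True if text looks like UTF-8 bytes decoded with wrong_codec."""
    try:
        # Undo the suspected wrong decoding, then decode properly as UTF-8.
        repaired = text.encode(wrong_codec).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False  # the round trip is impossible, so not this kind of damage
    return repaired != text

good = "extr\u00eamement"                              # 'extrêmement'
bad = good.encode("utf-8").decode("windows-1252")      # mangled as in the example

# ascii() escapes every non-ASCII character, so it is safe to print anywhere.
print(ascii(good), looks_like_utf8_mojibake(good))
print(ascii(bad), looks_like_utf8_mojibake(bad))
```

Running such a check right before the database insert, and again on the value read back, narrows down which step introduces the damage.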
hth,
vbr