| Header | Value |
|---|---|
| From | Dave Angel <d@davea.name> |
| Date | Mon, 17 Dec 2012 13:07:46 -0500 |
| Subject | Re: Unicode |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.979.1355767688.29569.python-list@python.org> |
On 12/17/2012 12:43 PM, Anatoli Hristov wrote:
>> Hi,
>> I don't know what the product ID would look like for this page, but
>> assuming the catalog pages are UTF-8 encoded like the error page I
>> get, it should work OK; cf.:
> You are right, I got it to work on Windows too, but not on Linux. I
> changed the codec on Linux, but I still don't get it.
>
> Here is what I get from Linux:
>
> >>> import urllib
> >>> opener = urllib.FancyURLopener({})
> >>> ffr = opener.open("http://prf.icecat.biz/index.cgi?product_id=%s;mi=start;smi=product;shopname=openICEcat-url;lang=fr" % (14688538))
> >>> src = ffr.read()
> >>> print src.decode("utf-8")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeEncodeError: 'latin-1' codec can't encode character u'\u2122' in position 17167: ordinal not in range(256)

I can tell you what's happening, but maybe not how to fix it.

src.decode() is creating a unicode string, and the error is not
happening there. But when print is used with a unicode string, it has to
encode the data for output. For whatever reason, yours is using latin-1,
and the page contains a character (u'\u2122', the trademark sign) which
is not in the latin-1 encoding.

My Python 2.7 uses UTF-8 everywhere (on Ubuntu Linux 11.04).
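
To make the two steps visible, here is a minimal sketch that separates the
decode from the encode that print performs implicitly. The byte string
`raw` is a made-up stand-in for the downloaded page, not the real ICEcat
data, and the workarounds shown are only options, not a confirmed fix
for the setup in question.

```python
import sys

# Hypothetical stand-in for the downloaded page: UTF-8 bytes that contain
# U+2122 (TRADE MARK SIGN), the character named in the traceback.
raw = b"openICEcat\xe2\x84\xa2"

# Step 1: decoding the UTF-8 bytes succeeds and yields a unicode string.
text = raw.decode("utf-8")

# Step 2: this is what print has to do when the output codec is latin-1.
# U+2122 has no latin-1 representation, so the encode step fails.
try:
    text.encode("latin-1")
    error = None
except UnicodeEncodeError as exc:
    error = exc

# Possible workarounds: encode explicitly to a codec that can represent
# the character, or stay with latin-1 and substitute what does not fit.
as_utf8 = text.encode("utf-8")
as_latin1 = text.encode("latin-1", "replace")

# The codec print would pick for this process; it can be None when the
# output is not a terminal.
stdout_encoding = getattr(sys.stdout, "encoding", None)
```

Comparing `stdout_encoding` on the Windows and Linux machines should show
which codec each interpreter chose. Setting the PYTHONIOENCODING
environment variable to utf-8 before starting Python is another way to
override that choice, assuming the terminal itself can display UTF-8.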
--
DaveA