From: Fabio Zadrozny
Date: Tue, 12 Feb 2013 18:04:54 -0200
Subject: Re: UnicodeEncodeError when not running script from IDE
To: Magnus Pettersson
Cc: python list
Newsgroups: comp.lang.python

Just to note, PyDev does something behind the scenes (it sets the encoding for the console).

You may specify which encoding you want in your launch configuration (in the 'common' tab you can set the encoding you want for the shell).
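
To see which encoding the console is actually using in each environment, a minimal sketch like the one below can be run both from PyDev and from a plain terminal. The helper names describe_stdout() and safe_text() are hypothetical, made up for this illustration. Outside the IDE, setting the PYTHONIOENCODING environment variable (for example to utf-8) before launching the script is one way to get a similar effect.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import sys


def describe_stdout():
    """Return the encoding Python reports for stdout (may be None when piped)."""
    return getattr(sys.stdout, "encoding", None)


def safe_text(text, encoding=None):
    """Round-trip text through an encoding, replacing unsupported characters.

    Hypothetical helper: falls back to the console encoding, then to ASCII.
    """
    encoding = encoding or describe_stdout() or "ascii"
    return text.encode(encoding, "replace").decode(encoding)


print("stdout encoding:", describe_stdout())
# With an ASCII-only console the kanji cannot be represented,
# so it is replaced instead of raising UnicodeEncodeError.
print(safe_text(u"\u79c1", "ascii"))
```

If the reported encoding differs between the two environments, that difference is a likely reason the same script behaves differently.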

Cheers,

Fabio


On Tue, Feb 12, 2013 at 3:12 PM, Magnus Pettersson <magpettersson@gmail.com> wrote:
> > What encoding is this file? Since you're appending to it, you really
> > need to match the pre-existing encoding, or the next program to deal
> > with it is in big trouble. So using the io.open() without the encoding=
> > keyword is probably a mistake.
>
> The .txt file is in UTF-8.
>
> I have got it to work now in the terminal, but I don't understand what I'm
> doing, or why I didn't need all the unicode strings and encode mumbo jumbo
> in Eclipse.
>
> # Here
> kanji = u"私"
> baseurl = u"http://www.romajidesu.com/kanji/"
> url = baseurl + kanji
> # This test works now.
> # uses: io.open(filepath, "a", encoding="UTF-8") as f:
> savefile([url])
> # This made the fetching of the website work. Why did I have to write
> # url.encode("UTF-8") when url is already unicode? I feel I don't have a
> # good understanding of this.
> page = urllib2.urlopen(url.encode("UTF-8"))
>
> ....
> --
> http://mail.python.org/mailman/listinfo/python-list
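
On the url.encode("UTF-8") question: a unicode string is a sequence of code points, while a URL sent over the network has to be bytes, so something must choose an encoding. When nothing chooses explicitly, Python 2 falls back to ASCII, which cannot represent the kanji. A hedged sketch of one explicit alternative is below: encode the non-ASCII part as UTF-8 and percent-encode it, so the final URL is plain ASCII. Whether this particular site expects UTF-8 percent-encoding is an assumption, and the try/except import is only there so the snippet runs on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2

kanji = u"\u79c1"  # the character 私 from the quoted example
baseurl = u"http://www.romajidesu.com/kanji/"

# Encode the kanji to UTF-8 bytes, then percent-encode those bytes.
# The result contains only ASCII characters.
url = baseurl + quote(kanji.encode("utf-8"))
print(url)  # http://www.romajidesu.com/kanji/%E7%A7%81
```

The resulting string can be handed to urlopen() without any further encode() call, because it no longer contains non-ASCII characters.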
