| Path | csiph.com!usenet.pasdenom.info!news.albasani.net!newsfeed.freenet.ag!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <davea@davea.name> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| Date | Tue, 12 Feb 2013 22:51:57 -0500 |
| From | Dave Angel <davea@davea.name> |
| User-Agent | Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130106 Thunderbird/17.0.2 |
| MIME-Version | 1.0 |
| To | python-list@python.org |
| Subject | Re: UnicodeEncodeError when not running script from IDE |
| References | <650d144e-da3d-4ca7-ad3a-49f44ce9cbaa@googlegroups.com> <mailman.1696.1360666894.2939.python-list@python.org> <0d6d513d-fa12-4d51-a33d-7bb38f1ee6b2@googlegroups.com> <mailman.1700.1360680572.2939.python-list@python.org> <780d353a-de5c-4d04-8f51-11d81802351b@googlegroups.com> <mailman.1711.1360684727.2939.python-list@python.org> <a80a49be-b3c4-4549-bf94-523605dbbeec@googlegroups.com> <mailman.1725.1360702303.2939.python-list@python.org> <d7da5405-7de9-4eb4-935e-fafc131194d9@googlegroups.com> |
| In-Reply-To | <d7da5405-7de9-4eb4-935e-fafc131194d9@googlegroups.com> |
| Content-Type | text/plain; charset=ISO-8859-1; format=flowed |
| Content-Transfer-Encoding | 7bit |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.15 |
| Precedence | list |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.1734.1360727537.2939.python-list@python.org> (permalink) |
| Lines | 58 |
| NNTP-Posting-Host | 2001:888:2000:d::a6 |
| X-Trace | 1360727537 news.xs4all.nl 6888 [2001:888:2000:d::a6]:33898 |
| X-Complaints-To | abuse@xs4all.nl |
| Xref | csiph.com comp.lang.python:38803 |
On 02/12/2013 07:20 PM, Magnus Pettersson wrote:
>
>> You don't show the code that actually does the io.open(), nor the
>> url.encode, so I'm not going to guess what you're actually doing.
>
> Hmm im not sure what you mean but I wrote all code needed in a previous post so maybe you missed that one :)
> In short I basically just have:
> import io
> io.open(myfile,"a",encode="UTF-8") as f:
>     f.write(my_ustring_with_kanji)
>
> the url.encode() is my unicode string variable named "url" using the type built in function .encode() which was the thing i wondered why i needed to use, which you explained very well, thank you!
>
> Just one more question since all this is still a little fuzzy in my head.
>
> When do i need to use .decode() in my code? is it when i read lines from f.ex a UTF-8 file? And why didn't I have to use .encode() on my unicode string when running from within eclipse pydev? someone wrote that it has a default codec setting so maybe that handles it for me there (which is kinda dangerous since my programs wont work running outside of eclipse since i didnt do any encoding or using of unicode strings before in my script and it still worked)
>

decode goes from bytes to unicode, the exact reverse. And you're right, you'd need it on input from a file, and theoretically on input from a keyboard.

Conceptually, the easiest (not necessarily the fastest) thing to do is to always convert any input that comes in byte form to unicode, immediately on getting it. Then all processing in the code should be done in unicode form. And you encode any output just before it goes out to a byte-device.

Python 3 makes that a natural, as the string type is already unicode, and it's byte strings that are the exception. But all that really changes is the syntax you use.

There are defaults all over the place on these conversions. And apparently, your IDE sets those defaults for you, which is a nasty thing, since it means things that run in the IDE will run differently outside of it.
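
The decode-on-input, encode-on-output idea described above can be sketched roughly as follows. This is only a minimal illustration, not the code from the earlier post: the helper names `append_text` and `read_text`, the temporary file, and the sample bytes are all made up here. It passes the codec explicitly through `io.open`'s `encoding` keyword so that nothing depends on an IDE or platform default.

```python
import io
import os
import tempfile


def append_text(path, text):
    """Append unicode text to path; io.open encodes it as UTF-8 on write."""
    with io.open(path, "a", encoding="utf-8") as f:
        f.write(text)


def read_text(path):
    """Read the whole file; io.open decodes the UTF-8 bytes to unicode."""
    with io.open(path, "r", encoding="utf-8") as f:
        return f.read()


# Hypothetical input: the UTF-8 bytes for two kanji, as they might arrive
# from a socket or a file opened in binary mode.
raw = b"\xe6\xbc\xa2\xe5\xad\x97"

# Decode immediately on input, then work only with unicode.
text = raw.decode("utf-8")
line = text + u"\n"

# Encode only at the output boundary (io.open does it for us here).
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "kanji.txt")
append_text(demo_path, line)
round_trip = read_text(demo_path)

# Looking at the file in binary mode shows the bytes that were written.
with io.open(demo_path, "rb") as f:
    on_disk = f.read()
```

Because the encoding is named at every boundary, the same script should behave the same way whether it is started from an IDE or from a plain terminal.
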
You're just lucky the difference was an error. If there weren't an error, you might have merrily been creating files with a mixture of encodings, which is a real disaster.

One other place where decoding happens is in your source file. There is an optional encoding line you can place at the top of the file (immediately after the shebang line) to change how unicode literals with non-ASCII characters are interpreted. Remember your source file is a byte file edited with some text editor, and it has been encoded, deliberately or accidentally by that editor. You can avoid the issue by always using escape sequences, but if for example, you copy/paste some unicode string from an email message into your source code, you'd like it to be equivalent. If your email program, your text editor, and your Python compiler are all on the same page, it works amazingly simply.

(That encoding line may affect other things; I know in Python 3, it makes non-ASCII attribute names possible, but I'm not sure if it matters in Python 2.x other than for unicode literal strings)

-- 
DaveA
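
As a rough illustration of the source-file point above, the sketch below takes the bytes an editor would store for one kanji when saving as UTF-8 and decodes them two ways. The byte values and variable names are assumptions chosen for the example; the first line is the usual PEP 263 form of the optional encoding declaration.

```python
# -*- coding: utf-8 -*-
# The line above is the optional encoding declaration; it must be the first
# or second line of the file (so, directly after a shebang line if present).

# Hypothetical bytes as an editor saving in UTF-8 would store one kanji.
source_bytes = b"\xe6\xbc\xa2"

# Decoded with the codec the editor actually used: one character.
as_utf8 = source_bytes.decode("utf-8")

# Decoded with a different codec: three unrelated characters (mojibake).
as_latin1 = source_bytes.decode("latin-1")

# An escape sequence is pure ASCII in the source file, so it means the same
# thing regardless of how the file was saved or declared.
escaped = u"\u6f22"

same_character = (as_utf8 == escaped)
mismatch = (as_latin1 != escaped)
```

The same bytes give the intended character only when the declared encoding matches what the editor wrote, which is why the escape form is the safer choice when the tools may disagree.
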
UnicodeEncodeError when not running script from IDE Magnus Pettersson <magpettersson@gmail.com> - 2013-02-12 02:43 -0800
Re: UnicodeEncodeError when not running script from IDE Andrew Berg <bahamutzero8825@gmail.com> - 2013-02-12 05:01 -0600
Re: UnicodeEncodeError when not running script from IDE Magnus Pettersson <magpettersson@gmail.com> - 2013-02-12 06:24 -0800
Re: UnicodeEncodeError when not running script from IDE Peter Otten <__peter__@web.de> - 2013-02-12 15:49 +0100
Re: UnicodeEncodeError when not running script from IDE Magnus Pettersson <magpettersson@gmail.com> - 2013-02-12 07:29 -0800
Re: UnicodeEncodeError when not running script from IDE Peter Otten <__peter__@web.de> - 2013-02-12 16:48 +0100
Re: UnicodeEncodeError when not running script from IDE Dave Angel <davea@davea.name> - 2013-02-12 10:58 -0500
Re: UnicodeEncodeError when not running script from IDE Magnus Pettersson <magpettersson@gmail.com> - 2013-02-12 09:12 -0800
Re: UnicodeEncodeError when not running script from IDE Fabio Zadrozny <fabiofz@gmail.com> - 2013-02-12 18:04 -0200
Re: UnicodeEncodeError when not running script from IDE Dave Angel <davea@davea.name> - 2013-02-12 15:51 -0500
Re: UnicodeEncodeError when not running script from IDE Magnus Pettersson <magpettersson@gmail.com> - 2013-02-12 16:20 -0800
Re: UnicodeEncodeError when not running script from IDE Dave Angel <davea@davea.name> - 2013-02-12 22:51 -0500
Re: UnicodeEncodeError when not running script from IDE Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-02-13 11:21 +1100
Re: UnicodeEncodeError when not running script from IDE Magnus Pettersson <magpettersson@gmail.com> - 2013-02-12 16:40 -0800
Re: UnicodeEncodeError when not running script from IDE Magnus Pettersson <magpettersson@gmail.com> - 2013-02-12 07:29 -0800
Re: UnicodeEncodeError when not running script from IDE MRAB <python@mrabarnett.plus.com> - 2013-02-12 21:03 +0000
Re: UnicodeEncodeError when not running script from IDE Magnus Pettersson <magpettersson@gmail.com> - 2013-02-12 06:24 -0800
Re: UnicodeEncodeError when not running script from IDE Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-02-12 22:43 +1100
Re: UnicodeEncodeError when not running script from IDE Magnus Pettersson <magpettersson@gmail.com> - 2013-02-12 04:34 -0800
Re: UnicodeEncodeError when not running script from IDE Terry Reedy <tjreedy@udel.edu> - 2013-02-12 11:07 -0500