


Re: UnicodeEncodeError when not running script from IDE

Newsgroups comp.lang.python
Date Tue, 12 Feb 2013 07:29:00 -0800 (PST)
From Magnus Pettersson <magpettersson@gmail.com>
Message-ID <mailman.1706.1360682950.2939.python-list@python.org>



> Are you sure you are writing the same data? That would mean that pydev
> changes the default encoding -- which is evil.
>
> A portable approach would be to use codecs.open() or io.open() instead of
> the built-in:
>
> import io
> with io.open(filepath, "a") as f:
>     ...
>
> io.open() uses UTF-8 by default, but you can specify other encodings with
> io.open(filepath, mode, encoding=whatever).


Interesting. PyDev must be doing something behind the scenes, because when I changed open() to io.open() I now get an error inside Eclipse as well:

f.write(card+"\n")
  File "C:\python27\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character u'\u53c8' in position 32: character maps to <undefined>
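
The traceback goes through cp1252.py, which suggests that io.open() without an encoding argument did not pick UTF-8 here. As documented, the fallback is whatever locale.getpreferredencoding() returns, and on a Western European Windows install that is typically cp1252. A minimal sketch of the failure, assuming only the character from the traceback (the helper can_encode is a hypothetical name, not part of the original code):

```python
import locale

# Encoding that io.open() falls back to when no encoding= is passed.
# On the poster's machine this was presumably cp1252 (an assumption
# based on the traceback; it differs from system to system).
default_encoding = locale.getpreferredencoding(False)
print("default text encoding: " + default_encoding)

kanji = u"\u53c8"  # the character named in the traceback


def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# cp1252 has no mapping for CJK characters, UTF-8 covers all of Unicode.
print("cp1252 can encode it: " + str(can_encode(kanji, "cp1252")))
print("utf-8 can encode it: " + str(can_encode(kanji, "utf-8")))
```

The first printed line varies by platform, which is why the same script can behave differently in different environments.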

....

with io.open(filepath, "a", encoding="UTF-8") as f:

Then it works in Eclipse. But I seem to have encoding problems all over the place: code that works in Eclipse doesn't work outside of Eclipse/PyDev.
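
Passing the encoding explicitly removes the dependency on the environment. A small round-trip sketch of that idea follows. It reuses the savefile name from this post, but the temporary file and the sample card text are made up for illustration:

```python
import io
import os
import tempfile


def savefile(cardlist, filepath):
    """Append each card as one line, always encoded as UTF-8."""
    with io.open(filepath, "a", encoding="utf-8") as f:
        for card in cardlist:
            f.write(card + u"\n")
    return True


# Hypothetical temporary file standing in for D:/iknow_kanji.txt.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    savefile([u"\u53c8 again"], path)
    # Read the file back with the same explicit encoding.
    with io.open(path, "r", encoding="utf-8") as f:
        lines = f.read().splitlines()
finally:
    os.remove(path)

print(len(lines))
```

Because both the write and the read name the codec, the result should not depend on whether the script is started from Eclipse or from a terminal.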

Here is the flow of my data. I'm terrible at using unicode/encode/decode, so I could use some help here:

kanji_anki_gui.py:

def on_addButton_clicked(self):
    #code
    # self.kanji.text() is a kanji character typed into a PyQt4 QLineEdit
    kanji = unicode(self.kanji.text())
    card = kanji_anki.scrapeKanji(kanji,tags)
    #more code

kanji_anki.py:

def scrapeKanji(kanji, tags="", onlymeaning=False):
    baseurl = unicode("http://www.romajidesu.com/kanji/")
    url = unicode(baseurl+kanji)
    #test to write out url to disk, works outside of eclipse now
    savefile([url])
    
    #getting webpage works fine in eclipse, prints "Oh no..." in terminal
    try:
        page = urllib2.urlopen(url)
    except:
        print "OH no website dont work"
        return None

    #Code that does some scraping and returns a string containing kanji letters
    return card

def savefile(cardlist,filepath="D:/iknow_kanji.txt"):
    with io.open(filepath, "a") as f:
        for card in cardlist:
            f.write(card+"\n")
    return True
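
One plausible reason urlopen() fails outside Eclipse is that the URL is a unicode string containing a non-ASCII kanji, and somewhere it is implicitly encoded as ASCII. This is an assumption and is not confirmed in this post. The bare except: in scrapeKanji also catches UnicodeEncodeError, so the real error never gets printed. A hedged sketch of building an ASCII-only URL by percent-encoding the kanji as UTF-8 is below. Whether the site accepts that form is not verified, and build_url and ascii_failure are hypothetical helper names:

```python
try:
    from urllib import quote          # Python 2
except ImportError:
    from urllib.parse import quote    # Python 3

BASE_URL = u"http://www.romajidesu.com/kanji/"


def build_url(kanji):
    """Return an ASCII-only URL with the kanji percent-encoded as UTF-8."""
    return BASE_URL + quote(kanji.encode("utf-8"))


def ascii_failure(text):
    """Return the error that a bare except: would hide, or None."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        return exc
    return None


raw_url = BASE_URL + u"\u53c8"
safe_url = build_url(u"\u53c8")

# The raw URL cannot be turned into ASCII bytes; the quoted one can.
print(ascii_failure(raw_url) is not None)
print(ascii_failure(safe_url) is None)
print(safe_url)
```

Catching a specific exception and printing it, instead of a bare except:, would show which of these steps actually fails in the terminal.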


