From: Dave Angel
Date: Thu, 07 Mar 2013 08:10:17 -0500
Newsgroups: comp.lang.python
To: python-list@python.org
Subject: Re: Unhelpful traceback

On 03/07/2013 01:33 AM, John Nagle wrote:
> Here's a traceback that's not helping:
>

A bit more context would be helpful.  Starting with Python version.

> Traceback (most recent call last):
>    File "InfoCompaniesHouse.py", line 255, in <module>
>      main()
>    File "InfoCompaniesHouse.py", line 251, in main
>      loader.dofile(infile) # load this file
>    File "InfoCompaniesHouse.py", line 213, in dofile
>      self.dofilezip(infilename) # do ZIP file
>    File "InfoCompaniesHouse.py", line 198, in dofilezip
>      self.dofilecsv(infile, infd) # as a CSV file
>    File "InfoCompaniesHouse.py", line 182, in dofilecsv
>      for fields in reader : # read entire CSV file
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in
> position 14: ordinal not in range(128)
>
> This is weird, because "for fields in reader" isn't directly
> doing a decode.  That's further down somewhere, and the backtrace
> didn't tell me where.
>
> The program is converting some .CSV files that come packaged in .ZIP
> files.  The files are big, so rather than expanding them, they're
> read directly from the ZIP files and processed through the ZIP
> and CSV modules.
>
> Here's the code that's causing the error above:
>
>     decoder = codecs.getreader('utf-8')
>     with decoder(infdraw,errors="replace") as infd :
>         with codecs.open(outfilename, encoding='utf-8', mode='w') as outfd :
>             headerline = infd.readline()
>             self.doheaderline(headerline)
>             reader = csv.reader(infd, delimiter=',', quotechar='"')
>             for fields in reader :
>                 pass
>
> Normally, the "pass" is a call to something that
> uses the data, but for test purposes, I put a "pass" in there.  It still
> fails.  With that "pass", nothing is ever written to the
> output file, and no "encoding" should be taking place.
>
> "infdraw" is a stream from the zip module, created like this:
>
>     with inzip.open(zipelt.filename,"r") as infd :

You probably need a 'rb' rather than 'r', since the file is not ASCII.

>         self.dofilecsv(infile, infd)
>
> This works for data records that are pure ASCII, but as soon as some
> non-ASCII character comes through, it fails.
>
> Where is the error being generated?  I'm not seeing any place
> where there's a conversion to ASCII.  Not even a print.
>
>                 John Nagle
>

If that isn't enough, then please give the whole context, such as where
zipelt and filename came from.  And don't forget to specify the Python
version.  Version 3.x treats nonbinary files very differently than 2.x.

-- 
DaveA
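
For illustration, here is a minimal sketch of reading a CSV member straight
out of a ZIP archive, assuming Python 3.  The thread never states the
version, and the u'\xa3' in the traceback suggests 2.x, where the csv
module does not accept Unicode input, so treat this as one possible shape
of the code rather than a fix for the program above.  The names
iter_csv_rows_from_zip, build_sample_zip and sample.csv are made up for the
example, and io.TextIOWrapper stands in for the codecs.getreader() wrapper
used in the quoted code.

```python
import csv
import io
import zipfile


def iter_csv_rows_from_zip(zip_file, member_name, encoding="utf-8"):
    """Yield CSV rows from one member of a ZIP archive without unpacking it.

    Hypothetical helper, written for Python 3 only.
    """
    with zipfile.ZipFile(zip_file) as archive:
        # ZipFile.open() yields a binary (bytes) stream, so decode explicitly.
        with archive.open(member_name, "r") as raw:
            # newline="" lets the csv module handle line endings itself;
            # errors="replace" mirrors the quoted code's error handling.
            text = io.TextIOWrapper(raw, encoding=encoding,
                                    errors="replace", newline="")
            for fields in csv.reader(text, delimiter=",", quotechar='"'):
                yield fields


def build_sample_zip():
    """Build a small in-memory ZIP holding one CSV with a non-ASCII character."""
    buffer = io.BytesIO()
    payload = "name,capital\r\nExample Ltd,\u00a3100\r\n".encode("utf-8")
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("sample.csv", payload)
    buffer.seek(0)
    return buffer


# Sample data only: one header row and one row containing a pound sign.
rows = list(iter_csv_rows_from_zip(build_sample_zip(), "sample.csv"))
print(rows)
```

On Python 2.x the same loop would likely have to hand the undecoded byte
stream to csv.reader and decode each field afterwards, since that version
of the csv module works on byte strings.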