| From | jaroslav.dobrek@gmail.com |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | catch UnicodeDecodeError |
| Date | 2012-07-25 04:05 -0700 |
| Organization | http://groups.google.com |
| Message-ID | <04f7ff8d-9881-4a04-ab2e-b5573b5f3cd1@googlegroups.com> |
Hello,
Very often I have the following problem: I write a program that processes many files, which it assumes to be encoded in utf-8. Then, some day, there is a non-utf-8 character in one of several hundred or thousand (new) files. The program exits with an error message like this:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe4 in position 60: invalid continuation byte
I usually solve the problem by moving files around and by recoding them.
What I really want to do is use something like
    try:
        # open file, read line, or do something else, I don't care
    except UnicodeDecodeError:
        sys.exit("Found a bad char in file " + file + " line " + str(line_number))
Yet, no matter where I put this try-except, it doesn't work.
How should I use try-except with UnicodeDecodeError?
Jaroslav
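One possible reason the try-except seems to have no effect, assuming Python 3 and files opened in text mode: open() itself does not decode anything. The bytes are decoded later, when the file is read or iterated over, so the exception is raised at that point, and a try block that only wraps the open() call never sees it. The sketch below sidesteps this by reading bytes and decoding each line explicitly, which also keeps the reported line number exact. The names first_bad_line and check_files are made up for this illustration and do not come from the post.

```python
import sys


def first_bad_line(path, encoding="utf-8"):
    """Return (line_number, error) for the first undecodable line, or None."""
    # Read raw bytes and decode one line at a time, so the exception is
    # raised inside this try block and the line number is exact.
    with open(path, "rb") as handle:
        for line_number, raw in enumerate(handle, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError as err:
                return line_number, err
    return None


def check_files(paths):
    """Exit with a message naming the first file and line that fail to decode."""
    for path in paths:
        bad = first_bad_line(path)
        if bad is not None:
            line_number, err = bad
            sys.exit(
                f"Found a bad char in file {path} line {line_number}: {err.reason}"
            )
```

Calling check_files(sys.argv[1:]) from a script would stop at the first offending file. If the program should carry on instead, the same loop could collect the (path, line_number) pairs and report them all at the end.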