Re: encoding error

From Terry Reedy <tjreedy@udel.edu>
Subject Re: encoding error
Date Wed, 20 Feb 2013 01:13:09 -0500
Newsgroups comp.lang.python
Message-ID <mailman.2086.1361340815.2939.python-list@python.org>


On 2/19/2013 8:07 PM, halagamal2009@gmail.com wrote:
> UnicodeEncodeError: 'decimal' codec can't encode character u'\ufeff'
> in position 0: invalid decimal Unicode string

I believe that is a byte-order mark (U+FEFF), which should appear only at
the very start of the file (2 bytes in UTF-16, 3 bytes in UTF-8) and which
is removed if you use the proper decoder (for example 'utf-16' or
'utf-8-sig') when reading the file, before parsing it.
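
A minimal sketch of that idea, assuming the file is UTF-8 with a leading
BOM (the original post does not say which encoding it uses). The helper
name read_text_without_bom and the demo file contents are hypothetical,
made up for illustration. The 'utf-8-sig' codec is a standard Python
codec that drops a leading BOM if one is present and otherwise behaves
like 'utf-8':

```python
import codecs
import os
import tempfile


def read_text_without_bom(path):
    """Read a UTF-8 text file, dropping a leading byte-order mark if present.

    Hypothetical helper for illustration; assumes the file is UTF-8.
    """
    with open(path, "r", encoding="utf-8-sig") as f:
        return f.read()


def demo():
    """Write a small BOM-prefixed file, read it back, and return the text."""
    fd, demo_path = tempfile.mkstemp(suffix=".txt")
    try:
        # Simulate a file saved by an editor that prepends a UTF-8 BOM.
        with os.fdopen(fd, "wb") as f:
            f.write(codecs.BOM_UTF8 + b"12345\n")
        return read_text_without_bom(demo_path)
    finally:
        os.remove(demo_path)


text = demo()
# The BOM is gone, so a numeric conversion no longer sees u'\ufeff'.
number = int(text)
```

If the file is actually UTF-16, passing encoding="utf-16" to open() should
consume the BOM in the same way. On Python 2, io.open() accepts the same
encoding argument.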

You did not say which version of Python you used, but I would use 3.3 or,
failing that, 3.2 if possible.
http://pypi.python.org/pypi/Whoosh/
claims that Whoosh works with Python 3.

Also, read about the basics of Unicode if you have not done so yet.

-- 
Terry Jan Reedy


