Path: csiph.com!usenet.pasdenom.info!goblin2!goblin.stu.neva.ru!newsfeed.xs4all.nl!newsfeed4.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail
To: python-list@python.org
From: Peter Otten <__peter__@web.de>
Subject: Re: encoding error in python 27
Date: Fri, 22 Feb 2013 16:40:47 +0100
Organization: None
References: <a3d3d352-c170-4165-9552-741869106830@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7Bit
User-Agent: KNode/4.7.3
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.2276.1361547628.2939.python-list@python.org>
Lines: 19
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:39581

Hala Gamal wrote:

> My code works well with an English file, but when I use a text file
> encoded as "utf-8" (my file contains some Arabic letters) it doesn't work.
> My code:

>   with codecs.open("tt.txt",encoding='utf-8') as txtfile:

Try encoding="utf-8-sig" in the above call. That codec strips a leading
byte order mark (BOM), if one is present, when decoding. See

http://docs.python.org/2.7/library/codecs.html#module-encodings.utf_8_sig

That should prevent

> UnicodeEncodeError: 'decimal' codec can't encode character u'\ufeff' in
> position 0: invalid decimal Unicode string
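
A minimal sketch of the difference between the two codecs. The temporary
file and its contents ("123") are made up for illustration, and the int()
call is only my guess at where the original code trips over the BOM. I use
io.open here because it behaves the same on Python 2.7 and 3; codecs.open
takes the same encoding argument.

```python
import codecs
import io
import os
import tempfile

# Hypothetical sample file: UTF-8 text preceded by a BOM, as some
# editors write it.
path = os.path.join(tempfile.mkdtemp(), "tt.txt")
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF8 + u"123\n".encode("utf-8"))

# Plain "utf-8" decodes the BOM as an ordinary character, u'\ufeff',
# which ends up at position 0 of the first line.
with io.open(path, encoding="utf-8") as txtfile:
    with_bom = txtfile.readline()

# "utf-8-sig" drops a leading BOM and otherwise decodes like "utf-8".
with io.open(path, encoding="utf-8-sig") as txtfile:
    without_bom = txtfile.readline()

# Assumption: the failing code converts the text to a number.
# UnicodeEncodeError is a subclass of ValueError, so this catches the
# failure on Python 2 and Python 3 alike.
try:
    int(with_bom)
    bom_breaks_int = False
except ValueError:
    bom_breaks_int = True

number = int(without_bom)
```

A file without a BOM decodes the same way under both codecs, so switching
to "utf-8-sig" should be safe for input that does not have one.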