codec for UTF-8 with BOM

From: Ulrich Eckhardt <ulrich.eckhardt@dominolaser.com>
Newsgroups: comp.lang.python
Subject: codec for UTF-8 with BOM
Followup-To: comp.lang.python
Date: 2011-05-02 10:34 +0200
Message-ID: <cn6298-us4.ln1@satorlaser.homedns.org>


Hi!

I want to write a file starting with the BOM and using UTF-8, and stumbled 
across some problems:

1. I would have expected one of the codecs to be 'UTF-8 with BOM' or 
something like that, but I can't find the correct name. Also, I can't find a 
way to get a list of the supported codecs at all, which strikes me as odd.


2. I couldn't find a way to write the BOM either. Writing codecs.BOM doesn't 
work, as it is an already encoded byte string. Of course, I can write 
u'\ufeff', but I'd rather avoid such magic numbers in my code.


3. The docs mention encodings.utf_8_sig, available since 2.5, but I can't 
locate that thing there either. What's going on here?


What would you do?

Uli

-- 
Domino Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
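
For reference, the three points above can be sketched roughly as follows. This is a minimal sketch, not a definitive answer: it assumes the codec the docs refer to is registered under the name 'utf-8-sig' (module encodings.utf_8_sig), and that a Python version with io.open is available (2.6 or later, including 3.x). The helper name write_with_bom and the temporary demo file are made up for illustration.

```python
import codecs
import io
import os
import tempfile

# Point 1: the codec is looked up by name; "utf-8-sig" is UTF-8 with a
# leading BOM (signature). codecs.lookup() raises LookupError for unknown names.
info = codecs.lookup("utf-8-sig")

# Point 2: the BOM as text, derived from the codecs constant instead of
# spelling out the code point as a magic number.
BOM_TEXT = codecs.BOM_UTF8.decode("utf-8")


def write_with_bom(path, text):
    """Hypothetical helper: write text as UTF-8 preceded by the BOM."""
    # The "utf-8-sig" encoder emits the BOM once, at the start of the stream.
    with io.open(path, "w", encoding="utf-8-sig") as f:
        f.write(text)


# Small demonstration using a temporary file.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    write_with_bom(demo_path, u"hello")
    with open(demo_path, "rb") as f:
        raw = f.read()  # raw bytes: BOM followed by the encoded text
finally:
    os.remove(demo_path)
```

Reading such a file back with encoding "utf-8-sig" strips the BOM again, whereas plain "utf-8" leaves it in the text as the first character. As far as I know there is no single function that lists every supported codec; the "Standard Encodings" table in the codecs documentation is the usual place to look.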

Thread

codec for UTF-8 with BOM Ulrich Eckhardt <ulrich.eckhardt@dominolaser.com> - 2011-05-02 10:34 +0200
  Re: codec for UTF-8 with BOM Chris Rebert <clp2@rebertia.com> - 2011-05-02 02:47 -0700
    Re: codec for UTF-8 with BOM Ulrich Eckhardt <ulrich.eckhardt@dominolaser.com> - 2011-05-02 12:30 +0200
      Re: codec for UTF-8 with BOM Peter Otten <__peter__@web.de> - 2011-05-02 13:42 +0200
