


Re: codec for UTF-8 with BOM

From Peter Otten <__peter__@web.de>
Newsgroups comp.lang.python
Subject Re: codec for UTF-8 with BOM
Followup-To comp.lang.python
Date 2011-05-02 13:42 +0200
Organization None
Message-ID <ipm5bm$c51$1@solani.org> (permalink)
References <cn6298-us4.ln1@satorlaser.homedns.org> <mailman.1068.1304329675.9059.python-list@python.org> <vhd298-568.ln1@satorlaser.homedns.org>




Ulrich Eckhardt wrote:

> Chris Rebert wrote:
>>> 3. The docs mention encodings.utf_8_sig, available since 2.5, but I
>>> can't locate that thing there either. What's going on here?
>> 
>> Works for me™:
>> Python 2.6.6 (r266:84292, Jan 12 2011, 13:35:00)
>> [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> from encodings import utf_8_sig
>> >>>
> 
> This works for me, too. What I tried and what failed was
> 
>   import encodings
>   encodings.utf_8_sig
> 
> which raises an AttributeError or dir(encodings), which doesn't show the
> according element. If I do it your way, the encoding then shows up in the
> content of the module.
> 
> Apart from the encoding issue, I don't understand this behaviour. Is the
> module behaving badly or are my expectations simply flawed?

This is standard Python package behaviour; importing a package does not
automatically import its submodules:

>>> import logging
>>> logging.handlers
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'handlers'
>>> import logging.handlers
>>> logging.handlers
<module 'logging.handlers' from '/usr/lib/python2.6/logging/handlers.pyc'>
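
The same effect can be reproduced without relying on what logging happens to
have loaded already. The following is only a sketch for Python 3; the package
name demo_pkg, its submodule sub and the constant VALUE are hypothetical names
made up for the illustration and are written to a temporary directory:

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk (hypothetical names):
#   demo_pkg/__init__.py  (empty, so nothing is imported eagerly)
#   demo_pkg/sub.py
tmp_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(tmp_dir, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "sub.py"), "w") as f:
    f.write("VALUE = 42\n")

# Make the temporary directory importable.
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import demo_pkg
# Only the package itself has been imported; the submodule is not bound yet.
before = hasattr(demo_pkg, "sub")

import demo_pkg.sub
# Importing the submodule binds it as an attribute of the parent package.
after = hasattr(demo_pkg, "sub")

print(before, after)  # prints: False True
```

This matches what Ulrich saw: once "from encodings import utf_8_sig" has run,
the submodule shows up in dir(encodings).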

You would avoid the AttributeError only if encodings/__init__.py contained
a line

from . import utf_8_sig

or similar. The best-known case of such an eager import is probably os
(strictly a module rather than a package), which imports a suitable path
module for the platform and exposes it as os.path.
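
For comparison, here is a rough sketch of the eager variant for Python 3.
Again the package name eager_pkg, the submodule sub and the constant VALUE
are hypothetical; the only difference from a lazy package is the
"from . import sub" line in __init__.py. The last lines look at what os.path
really is on the running platform:

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package (hypothetical names) whose __init__.py
# imports its submodule eagerly.
tmp_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(tmp_dir, "eager_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("from . import sub\n")
with open(os.path.join(pkg_dir, "sub.py"), "w") as f:
    f.write("VALUE = 42\n")

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import eager_pkg
# The submodule is available right after importing the package.
eager_has_sub = hasattr(eager_pkg, "sub")
print(eager_has_sub)  # prints: True

# os does something comparable: os.path is an ordinary module chosen
# for the platform (posixpath or ntpath) and registered as 'os.path'.
path_module_name = os.path.__name__
print(path_module_name)
print(sys.modules["os.path"] is os.path)  # prints: True
```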

As you cannot foresee which encodings a script will actually need, it
makes sense to omit such just-in-case imports. The codec modules are
imported on demand when an encoding is first looked up.
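
In practice you rarely need to touch encodings.utf_8_sig directly, because
asking for the codec by name loads the module for you. A small sketch for
Python 3 (the sample bytes are made up for the illustration):

```python
import codecs
import encodings
import sys

# Looking the codec up by name makes the encodings package import
# the matching submodule on demand.
info = codecs.lookup("utf-8-sig")
loaded = "encodings.utf_8_sig" in sys.modules
bound = hasattr(encodings, "utf_8_sig")
print(info.name, loaded, bound)  # prints: utf-8-sig True True

# Sample data: the UTF-8 encoded BOM followed by ASCII text.
data = b"\xef\xbb\xbfhello"

# utf-8-sig strips a leading BOM; plain utf-8 keeps it as U+FEFF.
text_sig = data.decode("utf-8-sig")
text_plain = data.decode("utf-8")
print(repr(text_sig))    # prints: 'hello'
print(repr(text_plain))  # prints: '\ufeffhello'
```

The same name works wherever an encoding argument is accepted, for example
in open() or codecs.open().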



Thread

codec for UTF-8 with BOM Ulrich Eckhardt <ulrich.eckhardt@dominolaser.com> - 2011-05-02 10:34 +0200
  Re: codec for UTF-8 with BOM Chris Rebert <clp2@rebertia.com> - 2011-05-02 02:47 -0700
    Re: codec for UTF-8 with BOM Ulrich Eckhardt <ulrich.eckhardt@dominolaser.com> - 2011-05-02 12:30 +0200
      Re: codec for UTF-8 with BOM Peter Otten <__peter__@web.de> - 2011-05-02 13:42 +0200
