


Re: codec for UTF-8 with BOM

From Peter Otten <__peter__@web.de>
Newsgroups comp.lang.python
Subject Re: codec for UTF-8 with BOM
Followup-To comp.lang.python
Date Mon, 02 May 2011 13:42:55 +0200
Organization None
Lines 46
Message-ID <ipm5bm$c51$1@solani.org>
References <cn6298-us4.ln1@satorlaser.homedns.org> <mailman.1068.1304329675.9059.python-list@python.org> <vhd298-568.ln1@satorlaser.homedns.org>
Mime-Version 1.0
Content-Type text/plain; charset="UTF-8"
Content-Transfer-Encoding 8Bit



Ulrich Eckhardt wrote:

> Chris Rebert wrote:
>>> 3. The docs mention encodings.utf_8_sig, available since 2.5, but I
>>> can't locate that thing there either. What's going on here?
>> 
>> Works for me™:
>> Python 2.6.6 (r266:84292, Jan 12 2011, 13:35:00)
>> [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> from encodings import utf_8_sig
>> >>>
> 
> This works for me, too. What I tried and what failed was
> 
>   import encodings
>   encodings.utf_8_sig
> 
> which raises an AttributeError or dir(encodings), which doesn't show the
> according element. If I do it your way, the encoding then shows up in the
> content of the module.
> 
> Apart from the encoding issue, I don't understand this behaviour. Is the
> module behaving badly or are my expectations simply flawed?

This is standard Python package behaviour. Importing a package does not
automatically import its submodules:

>>> import logging
>>> logging.handlers
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'handlers'
>>> import logging.handlers
>>> logging.handlers
<module 'logging.handlers' from '/usr/lib/python2.6/logging/handlers.pyc'>

You would avoid the AttributeError only if encodings/__init__.py contained a
line

from . import utf_8_sig

or similar. The best-known module that acts this way is probably os, which
eagerly imports a suitable path module (posixpath or ntpath) for the platform
and binds it as os.path.
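To make the difference concrete, here is a minimal sketch that builds two throwaway packages in a temporary directory. The names demo_lazy, demo_eager and sub are made up for illustration, and the code is written for a current Python 3 interpreter rather than the 2.6 used in the sessions above. One package has an empty __init__.py, the other contains the relative import.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, name, init_source):
    """Create a throwaway package containing one submodule named 'sub'."""
    pkg_dir = os.path.join(root, name)
    os.mkdir(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(init_source)
    with open(os.path.join(pkg_dir, "sub.py"), "w") as f:
        f.write("VALUE = 42\n")


# Hypothetical package names, chosen only for this demonstration.
root = tempfile.mkdtemp()
make_package(root, "demo_lazy", "")                      # empty __init__.py
make_package(root, "demo_eager", "from . import sub\n")  # eager import
sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_lazy
import demo_eager

# Importing the package alone does not bind the submodule...
lazy_before = hasattr(demo_lazy, "sub")
# ...unless __init__.py imports it itself.
eager_before = hasattr(demo_eager, "sub")

# An explicit submodule import binds it as an attribute of the package.
import demo_lazy.sub
lazy_after = hasattr(demo_lazy, "sub")

# os needs no explicit "import os.path": it picks posixpath or ntpath itself.
os_path_name = os.path.__name__

print(lazy_before, eager_before, lazy_after, os_path_name)
```

The first value printed is False and the next two are True; the last one is the platform-specific module name behind os.path.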

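Since there is no way to foresee which encodings a script will actually need,
it makes sense to omit a just-in-case import; the codec machinery imports the
matching submodule when an encoding is first looked up by name.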
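For the original question this means the codec can simply be used by name, with no need to touch encodings.utf_8_sig directly. A minimal sketch, again assuming a current Python 3 interpreter (the sample text is made up):

```python
import codecs
import encodings
import sys

# Looking the codec up by name makes the encodings package import the
# submodule on demand; no explicit "import encodings.utf_8_sig" is needed.
info = codecs.lookup("utf-8-sig")
loaded = "encodings.utf_8_sig" in sys.modules

# Sample bytes: a UTF-8 BOM followed by UTF-8 encoded text.
data = codecs.BOM_UTF8 + "h\u00e9llo".encode("utf-8")

# The utf-8-sig codec strips a leading BOM when decoding...
text = data.decode("utf-8-sig")
# ...and writes one when encoding.
roundtrip = text.encode("utf-8-sig")

# The plain utf-8 codec keeps the BOM as a U+FEFF character.
plain = data.decode("utf-8")

print(info.name, loaded, repr(text), repr(plain))
```

After the lookup the submodule is also visible as an attribute of the encodings package, which matches what Ulrich observed after Chris's explicit import.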



Thread

codec for UTF-8 with BOM Ulrich Eckhardt <ulrich.eckhardt@dominolaser.com> - 2011-05-02 10:34 +0200
  Re: codec for UTF-8 with BOM Chris Rebert <clp2@rebertia.com> - 2011-05-02 02:47 -0700
    Re: codec for UTF-8 with BOM Ulrich Eckhardt <ulrich.eckhardt@dominolaser.com> - 2011-05-02 12:30 +0200
      Re: codec for UTF-8 with BOM Peter Otten <__peter__@web.de> - 2011-05-02 13:42 +0200
