
Re: Guessing the encoding from a BOM

Date Wed, 15 Jan 2014 23:29:15 -0800
From Ethan Furman <ethan@stoneleaf.us>
Subject Re: Guessing the encoding from a BOM
Newsgroups comp.lang.python
Message-ID <mailman.5576.1389858692.18130.python-list@python.org>


On 01/15/2014 10:55 PM, Steven D'Aprano wrote:
> On Thu, 16 Jan 2014 14:47:00 +1100, Ben Finney wrote:
>>
>> +1. I'd like a custom exception class, sub-classed from ValueError.
>
> Why ValueError? It's not really a "invalid value" error, it's more "my
> heuristic isn't good enough" failure. (Maybe the file starts with another
> sort of BOM which I don't know about.)
>
> If I go with an exception, I'd choose RuntimeError, or a custom error
> that inherits directly from Exception.

From the docs [1]:
============================

     exception RuntimeError

         Raised when an error is detected that doesn’t fall in any
         of the other categories. The associated value is a string
         indicating what precisely went wrong.

It doesn't sound like RuntimeError is any more informative than Exception
or AssertionError, and to my mind at least it is usually close to
catastrophic in nature [2].

I'd say a ValueError subclass because, while not strictly an error, it is
a value you don't know how to deal with. But either that or plain
Exception, just not RuntimeError.
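
For illustration, here is a rough sketch of what the ValueError-subclass
option could look like. The names UnknownBOMError and
guess_encoding_from_bom are made up for this example and are not taken
from anyone's actual code in this thread; the BOM constants are the real
ones from the stdlib codecs module.

```python
import codecs


class UnknownBOMError(ValueError):
    """Hypothetical exception: the data starts with no BOM we recognise."""


# Check the 4-byte BOMs first: BOM_UTF32_LE begins with the same two
# bytes as BOM_UTF16_LE, so the order of this table matters.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding_from_bom(data):
    """Return a codec name for bytes that start with a known BOM."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # Not catastrophic, just a value the heuristic cannot handle.
    raise UnknownBOMError("no recognised BOM in %r" % data[:4])
```

Callers that already catch ValueError keep working, while anyone who
cares can catch UnknownBOMError specifically and fall back to a default
encoding.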

--
~Ethan~


[1] http://docs.python.org/3/library/exceptions.html#RuntimeError
[2] verified by a (very) brief grep of the sources
