Re: Movie (MPAA) ratings and Python?

From Ned Batchelder <ned@nedbatchelder.com>
Subject Re: Movie (MPAA) ratings and Python?
Date Wed, 11 Dec 2013 20:01:44 -0500

On 12/11/13 6:39 PM, Dan Stromberg wrote:
>
> On Wed, Dec 11, 2013 at 3:24 PM, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info
> <mailto:steve+comp.lang.python@pearwood.info>> wrote:
>
>     On Wed, 11 Dec 2013 15:07:35 -0800, Dan Stromberg wrote:
>
>      >  $ chardet mpaa-ratings-reasons.list
>      > mpaa-ratings-reasons.list: windows-1255 (confidence: 0.97)
>      >
>      > I'm aware that chardet is playing guessing games, though one
>     would hope
>      > it would guess well most of the time, and give a reasonable
>     confidence
>      > rating.
>
>     What reason do you have for thinking that Windows-1255 isn't a
>     reasonable
>     guess? If the bulk of the text is Latin-1 except perhaps for one or two
>     Hebrew characters (or what chardet thinks are Hebrew characters), it may
>     actually be a reasonable guess.
>
>
> I get a traceback if I try to read the file as Windows-1255.  I don't
> get a traceback if I read it as ISO-8859-1.
>
>     If it is a poor guess, perhaps you ought to report it to the chardet
>     maintainers as a good example of a poor guess.
>
> I was considering that, and may do so.
>
> I've also been wondering if ISO-8859-1 is just an octet-oriented codec,
> so it'll read about anything.  There are clearly non-7-bit-ASCII
> characters in the file that look like line noise in an mrxvt.

Both ISO-8859-1 and Windows-1255 are octet-oriented, so I don't see why
one would raise an exception when the other didn't, unless the exception
isn't on the decode but on your attempt to output the result.
Can you show the full traceback you're seeing?
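
One way to tell the two cases apart without the original file is a small sketch like the one below. It assumes Python 3 and CPython's bundled codecs, and the helper name `undecodable_bytes` is made up for illustration. It tries every single byte value against each codec, then shows the other failure mode, where the decode succeeds and the error only appears on output. In CPython the cp1255 table leaves some byte values unassigned (0x81 is one of them), so a strict decode can fail there, while ISO-8859-1 assigns all 256 values.

```python
# Sketch: which single bytes does each codec accept? (Behaviour of CPython's
# bundled codecs; the file from the thread is not used here.)

def undecodable_bytes(encoding):
    """Return the byte values that `encoding` refuses to decode (hypothetical helper)."""
    bad = []
    for value in range(256):
        try:
            bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            bad.append(value)
    return bad

latin1_bad = undecodable_bytes("iso-8859-1")
cp1255_bad = undecodable_bytes("windows-1255")

print("iso-8859-1 rejects:", [hex(b) for b in latin1_bad])
print("windows-1255 rejects:", [hex(b) for b in cp1255_bad])

# The other possibility: decoding succeeds, but writing the text out fails
# because the output encoding cannot represent one of the characters.
text = b"caf\xe9".decode("iso-8859-1")
try:
    text.encode("ascii")
    output_failed = False
except UnicodeEncodeError:
    output_failed = True
print("output to an ASCII-only stream fails:", output_failed)
```

The exception class in the traceback is the quickest hint: UnicodeDecodeError points at the read, UnicodeEncodeError at the output.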

-- 
Ned Batchelder, http://nedbatchelder.com
