From: Ian Kelly
Date: Wed, 11 Dec 2013 18:37:34 -0700
Subject: Re: Movie (MPAA) ratings and Python?
To: Python
Newsgroups: comp.lang.python

On Wed, Dec 11, 2013 at 6:01 PM, Ned Batchelder wrote:
>> I've also been wondering if ISO-8859-1 is just an octet-oriented codec,
>> so it'll read about anything. There are clearly non-7-bit-ASCII
>> characters in the file that look like line noise in an mrxvt.
>
> Both ISO-8859-1 and Windows-1255 are octet-oriented, I don't see why one
> would raise an exception when the other didn't. Unless the exception isn't
> on the decode, but instead on your attempt to output the result. Can you
> show the full traceback you're seeing?

There are gaps in CP 1255 (see http://en.wikipedia.org/wiki/Code_page_1255),
so I presume the file contains one or more of those octets that don't map to
anything at all.
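
For anyone who wants to check the gaps for themselves, here is a minimal sketch, assuming Python 3 and the standard library codecs named "iso-8859-1" and "cp1255". The helper name undecodable_bytes is made up for illustration. It tries to decode each of the 256 possible byte values on its own and collects the ones the codec rejects:

```python
def undecodable_bytes(encoding):
    """Return the byte values (0-255) that `encoding` refuses to decode.

    Hypothetical helper for illustration; it only makes sense for
    single-byte codecs such as iso-8859-1 or cp1255.
    """
    bad = []
    for value in range(256):
        try:
            bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            # This octet has no character assigned in the codec's table.
            bad.append(value)
    return bad


latin1_gaps = undecodable_bytes("iso-8859-1")
cp1255_gaps = undecodable_bytes("cp1255")

print("iso-8859-1 unassigned octets:", [hex(b) for b in latin1_gaps])
print("cp1255 unassigned octets:", [hex(b) for b in cp1255_gaps])
```

If the file contains any octet from the second list, decoding it as cp1255 with the default strict error handling should raise UnicodeDecodeError, while decoding as iso-8859-1 would go through (possibly producing the wrong characters). The full traceback would still be needed to confirm that this is what is happening here.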