Subject: Re: Movie (MPAA) ratings and Python?
From: Petite Abeille
Date: Tue, 10 Dec 2013 22:07:44 +0100
To: Python List
Newsgroups: comp.lang.python

On Dec 10, 2013, at 6:25 AM, Dan Stromberg wrote:

> The IMDB flat text file probably came the closest, but it appears to have encoding issues; it's apparently nearly windows-1255, but not quite.

It's ISO-8859-1.

Both certificates.list.gz and mpaa-ratings-reasons.list.gz are rather straightforward to parse.

For the US, you will get something along these lines out of certificates.list.gz:

USA:(Banned)
USA:12
USA:AO
USA:Approved
USA:C
USA:E
USA:E10+
USA:G
USA:GP
USA:K-A
USA:M
USA:M/PG
USA:NC-17
USA:Not Rated
USA:Open
USA:PG
USA:PG-13
USA:Passed
USA:R
USA:T
USA:TV-14
USA:TV-G
USA:TV-MA
USA:TV-PG
USA:TV-Y
USA:TV-Y7
USA:Unrated
USA:X

And as mentioned, imdbpy handles all this out-of-the-box if you don't feel like doing it yourself.
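If you do want to do it yourself, here is a rough sketch of reading certificates.list.gz as ISO-8859-1 and pulling out the USA entries. The line layout is an assumption on my part (title, one or more tabs, "Country:Rating", optionally a tab and a note), and the function names parse_certificates and count_ratings are made up for illustration, so check both against the actual file before relying on this.

```python
import gzip
from collections import Counter


def parse_certificates(lines, country="USA"):
    """Yield (title, rating) pairs for one country.

    Assumed layout per data line (not verified against the real file):
    title, one or more tabs, 'Country:Rating', optional tab and note.
    """
    prefix = country + ":"
    for line in lines:
        # Drop the empty fields produced by runs of consecutive tabs.
        fields = [f for f in line.rstrip("\n").split("\t") if f]
        if len(fields) < 2:
            continue  # preamble, separator, or blank line
        title, cert = fields[0], fields[1]
        if cert.startswith(prefix):
            yield title, cert[len(prefix):]


def count_ratings(path, country="USA"):
    """Count how often each rating occurs for the given country."""
    # Decode as ISO-8859-1, as stated above; gzip.open in text mode
    # handles the decompression and the decoding in one step.
    with gzip.open(path, "rt", encoding="iso-8859-1") as handle:
        return Counter(r for _, r in parse_certificates(handle, country))
```

Something like count_ratings("certificates.list.gz") should then give you a tally of the ratings listed above. Keeping the parser separate from the file handling means you can feed it a handful of sample lines to check the assumed layout first.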