From: Thomas Jollans
Date: Tue, 19 Jul 2011 20:07:14 +0200
Newsgroups: comp.lang.python
To: python-list@python.org
Subject: Re: a little parsing challenge ☺

On 19/07/11 18:54, Xah Lee wrote:
> On Sunday, July 17, 2011 2:48:42 AM UTC-7, Raymond Hettinger wrote:
>> On Jul 17, 12:47 am, Xah Lee wrote:
>>> i hope you'll participate. Just post solution here. Thanks.
>>
>> http://pastebin.com/7hU20NNL
>
> just installed py3.
> there seems to be a bug.
> in this file
>
> http://xahlee.org/p/time_machine/tm-ch04.html
>
> there's a mismatched double curly quote. at position 28319.
>
> the python code above doesn't seem to spot it?
>
> here's the elisp script output when run on that dir:
>
> Error file: c:/Users/h3/web/xahlee_org/p/time_machine/tm-ch04.html
> ["“" 28319]
> Done deal!

That script doesn't check that the balance is zero at the end of file.

Patch:

--- ../xah-raymond-old.py 2011-07-19 20:05:13.000000000 +0200
+++ ../xah-raymond.py 2011-07-19 20:03:14.000000000 +0200
@@ -16,6 +16,8 @@
         elif c in closers:
             if not stack or c != stack.pop():
                 return i
+    if stack:
+        return i
     return -1
 
 def scan(directory, encoding='utf-8'):
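
For anyone reading along without the pastebin open, here is a minimal, self-contained sketch of what the patched check amounts to. It is not Raymond's actual script: the bracket table `PAIRS`, the function names, and the 1-based positions are all assumptions made for illustration. The second function is a hypothetical variant that remembers where each opener was seen, so an unclosed opener is reported at its own position rather than at the end of the text.

```python
# Hypothetical bracket table; the original script's table is not shown in this thread.
PAIRS = {'(': ')', '[': ']', '{': '}', '\u201c': '\u201d'}
CLOSERS = set(PAIRS.values())


def check(text):
    """Return the 1-based position of the first problem, or -1 if balanced."""
    stack = []  # closers we still expect, innermost last
    i = 0
    for i, c in enumerate(text, start=1):
        if c in PAIRS:
            stack.append(PAIRS[c])
        elif c in CLOSERS:
            # A closer with nothing open, or the wrong closer, is a mismatch.
            if not stack or c != stack.pop():
                return i
    if stack:
        # Something was opened and never closed: the case the patch adds.
        # Like the patch, this reports the last position scanned.
        return i
    return -1


def check_opener_position(text):
    """Variant (assumption, not in the patch): report the unclosed opener itself."""
    stack = []  # (expected closer, position of its opener)
    for i, c in enumerate(text, start=1):
        if c in PAIRS:
            stack.append((PAIRS[c], i))
        elif c in CLOSERS:
            if not stack or c != stack.pop()[0]:
                return i
    if stack:
        # Position of the innermost opener that was never closed.
        return stack[-1][1]
    return -1
```

With the end-of-text test in place, `check('\u201cabc')` no longer returns -1. Note that the patched behaviour points at the end of the text, whereas the elisp output quoted above points at the opener; the variant is one way to get the latter.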