Date: Wed, 6 Mar 2013 14:07:52 +0100
From: Matej Cepl
To: python-list@python.org, python-devel@python.org
Subject: Re: Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial)
Newsgroups: comp.lang.python

On 2013-02-26, 16:25 GMT, Terry Reedy wrote:
> On 2/21/2013 4:22 PM, Matej Cepl wrote:
>> as my method to commemorate Aaron Swartz, I have decided to port his
>> html2text to work fully with the latest python 3.3. After some time
>> dealing with various bugs, I have now in my repo
>> https://github.com/mcepl/html2text (branch python3) working solution
>> which works all the way to python 3.2 (inclusive;
>> https://travis-ci.org/mcepl/html2text). However, the last problem
>> remains. This
>>
>>   • Run this command:
>>     ls -l *.html
>>   • ?
>>
>> should lead to
>>
>> * Run this command:
>>
>>     ls -l *.html
>>
>> * ?
>>
>> but it doesn’t. It leads to this (with python 3.3 only)
>>
>> * Run this command:
>>     ls -l *.html
>>
>> * ?
>>
>> Does anybody know about something which changed in modules re or
>> http://docs.python.org/3.3/whatsnew/changelog.html between 3.2 and
>> 3.3, which could influence this script?
>
> Search the changelog or 3.3 misc/News for items affecting those two
> modules. There are at least 4.
> http://docs.python.org/3.3/whatsnew/changelog.html
>
> It is faintly possible that the switch from narrow/wide builds to
> unified builds somehow affected that. Have you tested with 2.7/3.2 on
> both narrow and wide unicode builds?

So, in the end, I went the long way and bisected cpython to
find the commit which broke my tests, and it seems that the
culprit is http://hg.python.org/cpython/rev/123f2dc08b3e so it is
clearly something Unicode related.

Unfortunately, it really doesn't tell me what exactly is broken
(is it a known regression?) or whether there is a known workaround.
Could anybody suggest a way to find bugs on
http://bugs.python.org related to some particular commit (a plain
search for 123f2dc0 didn’t find anything)?

Any thoughts?

Matěj

P.S.: Crossposting to python-devel in the hope that there is
somebody who understands more about that particular commit. For
that reason I have also intentionally not trimmed the original
messages, to preserve context.
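
For anyone who wants to repeat the bisection, the check that decides whether a revision is "good" or "bad" could look roughly like the sketch below. It is not the script actually used for the bisection described above. The names `EXPECTED`, `normalize` and `bisect_status` are hypothetical, and the indentation inside `EXPECTED` is an assumption, since the whitespace of the expected output is not fully preserved in this message. The return values follow the exit-status convention of `hg bisect --command`: 0 marks a revision as good, 125 skips it, and other non-zero values mark it as bad.

```python
# Expected Markdown for the failing case from the message above.
# The exact indentation is an assumption; only the blank line between
# the list item and the command is known to matter.
EXPECTED = "* Run this command:\n\n    ls -l *.html\n\n* ?\n"


def normalize(text):
    """Drop trailing whitespace on each line and surrounding blank lines.

    Internal blank lines are kept, because the regression in question
    is precisely a missing blank line.
    """
    return "\n".join(line.rstrip() for line in text.strip().splitlines())


def bisect_status(convert, source, expected):
    """Map the result of convert(source) to a bisect exit status.

    convert  -- any callable taking the HTML source and returning text
    source   -- the HTML input that triggers the problem
    expected -- the output the converter should produce

    Returns 0 (good) on a match, 1 (bad) on a mismatch, and 125 (skip)
    if the converter raises, so that unrelated breakage at some
    revision does not get blamed for the regression.
    """
    try:
        actual = convert(source)
    except Exception:
        return 125
    return 0 if normalize(actual) == normalize(expected) else 1
```

One possible way to use it: save it as a hypothetical `check.py`, import the converter under test, end the script with `sys.exit(bisect_status(converter, html_source, EXPECTED))`, and pass a command that builds the interpreter and runs that script to `hg bisect --command`. Treating a converter crash as "skip" rather than "bad" is a design choice and may not suit every bisection.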