
Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial)

Started by: Matej Cepl <mcepl@redhat.com>
First post: 2013-02-21 22:22 +0100
Last post: 2013-03-06 14:07 +0100
Articles: 3 (2 participants)



Contents

  Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial) Matej Cepl <mcepl@redhat.com> - 2013-02-21 22:22 +0100
    Re: Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial) Terry Reedy <tjreedy@udel.edu> - 2013-02-26 11:25 -0500
      Re: Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial) Matej Cepl <mcepl@redhat.com> - 2013-03-06 14:07 +0100

#39958 — Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial)

From: Matej Cepl <mcepl@redhat.com>
Date: 2013-02-21 22:22 +0100
Subject: Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial)
Message-ID: <slrnkid41e.u21.mcepl@wycliff.ceplovi.cz>

Hi,

as my way of commemorating Aaron Swartz, I have decided to port his 
html2text to work fully with the latest Python 3.3. After some time 
dealing with various bugs, I now have in my repo 
https://github.com/mcepl/html2text (branch python3) a working solution 
that works on every version up to and including Python 3.2 (see 
https://travis-ci.org/mcepl/html2text). However, one last problem 
remains. This input

<li>Run this command:
<pre>ls -l *.html</pre></li>
<li>?</li>

should lead to

  * Run this command: 
    
        ls -l *.html

  * ?

but it doesn’t. With Python 3.3 only, it produces this instead:

    * Run this command: 
          ls -l *.html
  
    * ?

Does anybody know of a change in the re or html.parser modules 
between 3.2 and 3.3 that could affect this script?

Thanks,

Matěj Cepl
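
One way to narrow down whether html.parser itself behaves differently is to log the events it emits for the snippet above and compare the output under 3.2 and 3.3. The following is only a minimal sketch: `EventLogger` and `parse_events` are hypothetical names, not part of html2text, and it exercises only the standard-library parser, not the html2text code path.

```python
from html.parser import HTMLParser

# The input from the post, joined without a trailing newline.
SNIPPET = "\n".join([
    "<li>Run this command:",
    "<pre>ls -l *.html</pre></li>",
    "<li>?</li>",
])


class EventLogger(HTMLParser):
    """Hypothetical helper: records parser callbacks in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def parse_events(html):
    """Feed the whole document and return the recorded events."""
    parser = EventLogger()
    parser.feed(html)
    parser.close()
    return parser.events


for event in parse_events(SNIPPET):
    print(event)
```

If the printed event stream is identical under both interpreters, the difference more likely comes from the text post-processing (for example the regular expressions) than from the parser.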



#39965

From: Terry Reedy <tjreedy@udel.edu>
Date: 2013-02-26 11:25 -0500
Message-ID: <mailman.2549.1361895946.2939.python-list@python.org>
In reply to: #39958

On 2/21/2013 4:22 PM, Matej Cepl wrote:
> as my method to commemorate Aaron Swartz, I have decided to port his
> html2text to work fully with the latest python 3.3. After some time
> dealing with various bugs, I have now in my repo
> https://github.com/mcepl/html2text (branch python3) working solution
> which works all the way to python 3.2 (inclusive;
> https://travis-ci.org/mcepl/html2text). However, the last problem
> remains. This
>
> <li>Run this command:
> <pre>ls -l *.html</pre></li>
> <li>?</li>
>
> should lead to
>
>    * Run this command:
>
>          ls -l *.html
>
>    * ?
>
> but it doesn’t. It leads to this (with python 3.3 only)
>
>      * Run this command:
>            ls -l *.html
>
>      * ?
>
> Does anybody know about something which changed in modules re or
> html.parser between 3.2 and 3.3, which could influence this script?

Search the changelog or the 3.3 Misc/NEWS for items affecting those two 
modules. There are at least 4.
http://docs.python.org/3.3/whatsnew/changelog.html

It is faintly possible that the switch from narrow/wide builds to 
unified builds somehow affected that. Have you tested with 2.7/3.2 on 
both narrow and wide unicode builds?

-- 
Terry Jan Reedy
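
To tell which kind of build a given interpreter is, `sys.maxunicode` can be inspected: it is 0xFFFF on narrow builds and 0x10FFFF on wide builds and on every build from 3.3 onward. A minimal sketch (the helper name `unicode_build_kind` is made up for illustration):

```python
import sys


def unicode_build_kind(maxunicode=sys.maxunicode):
    """Classify an interpreter by its largest supported code point."""
    if maxunicode == 0xFFFF:
        return "narrow"            # pre-3.3 build using UTF-16 code units
    if maxunicode == 0x10FFFF:
        return "wide or flexible"  # pre-3.3 wide build, or any 3.3+ build
    return "unknown"


print(sys.version_info[:2], unicode_build_kind())
```

Running this under each interpreter used for testing shows whether the passing 2.7/3.2 runs covered both narrow and wide builds.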



#40632

From: Matej Cepl <mcepl@redhat.com>
Date: 2013-03-06 14:07 +0100
Message-ID: <mailman.2942.1362575282.2939.python-list@python.org>
In reply to: #39965

On 2013-02-26, 16:25 GMT, Terry Reedy wrote:
> On 2/21/2013 4:22 PM, Matej Cepl wrote:
>> as my method to commemorate Aaron Swartz, I have decided to port his
>> html2text to work fully with the latest python 3.3. After some time
>> dealing with various bugs, I have now in my repo
>> https://github.com/mcepl/html2text (branch python3) working solution
>> which works all the way to python 3.2 (inclusive;
>> https://travis-ci.org/mcepl/html2text). However, the last problem
>> remains. This
>>
>> <li>Run this command:
>> <pre>ls -l *.html</pre></li>
>> <li>?</li>
>>
>> should lead to
>>
>>    * Run this command:
>>
>>          ls -l *.html
>>
>>    * ?
>>
>> but it doesn’t. It leads to this (with python 3.3 only)
>>
>>      * Run this command:
>>            ls -l *.html
>>
>>      * ?
>>
>> Does anybody know about something which changed in modules re or
>> html.parser between 3.2 and 3.3, which could influence this script?
>
> Search the changelog or the 3.3 Misc/NEWS for items affecting those two 
> modules. There are at least 4.
> http://docs.python.org/3.3/whatsnew/changelog.html
>
> It is faintly possible that the switch from narrow/wide builds to 
> unified builds somehow affected that. Have you tested with 2.7/3.2 on 
> both narrow and wide unicode builds?

So, in the end, I went the long way and bisected cpython to 
find the commit which broke my tests, and it seems that the 
culprit is http://hg.python.org/cpython/rev/123f2dc08b3e so it is 
clearly something Unicode-related.

Unfortunately, that doesn't tell me what exactly is broken 
(is it a known regression?) or whether there is a known workaround. 
Could anybody suggest how to find bugs on 
http://bugs.python.org related to a particular commit? (A plain 
search for 123f2dc0 didn’t find anything.)
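
For readers unfamiliar with bisection, the idea is a binary search over the ordered history between a known-good and a known-bad revision. Mercurial's `hg bisect` command automates it; the sketch below only illustrates the principle. The function name, the list of revisions, and the `is_bad` predicate are all hypothetical stand-ins for "build this revision and run the failing test".

```python
def first_bad(revisions, is_bad):
    """Return the earliest revision for which is_bad() is true.

    Assumes revisions[0] is good, revisions[-1] is bad, and that
    every revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1   # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid                 # the breakage is at mid or earlier
        else:
            lo = mid                 # the breakage is after mid
    return revisions[hi]


# Toy history: integers stand in for changesets, breakage starts at 37.
history = list(range(100))
print(first_bad(history, lambda rev: rev >= 37))
```

Each step halves the remaining range, so even a long history needs only a handful of builds.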

Any thoughts?

Matěj

P.S.: Crossposting to python-devel in the hope that somebody there 
understands more about that particular commit. For that reason I 
have also intentionally not trimmed the original messages, to 
preserve context.
