| Started by | Robin Becker <robin@reportlab.com> |
|---|---|
| First post | 2014-04-28 10:47 +0100 |
| Last post | 2014-04-28 14:06 +0100 |
| Articles | 3 — 2 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: possible bug in re expression? Robin Becker <robin@reportlab.com> - 2014-04-28 10:47 +0100
Re: possible bug in re expression? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-04-28 11:49 +0000
Re: possible bug in re expression? Robin Becker <robin@reportlab.com> - 2014-04-28 14:06 +0100
| From | Robin Becker <robin@reportlab.com> |
|---|---|
| Date | 2014-04-28 10:47 +0100 |
| Subject | Re: possible bug in re expression? |
| Message-ID | <mailman.9545.1398678492.18130.python-list@python.org> |
On 25/04/2014 19:32, Terry Reedy wrote:
..........
> I suppose that one could argue that '{' alone should be treated as special
> immediately, and not just when a matching '}' is found, and should disable other
> special meanings. I wonder what JS does if there is no matching '}'?
>
well in fact I suspect this is my mistranslation of the original
new RegExp('.{1,' + (+size) + '}', 'g')
my hacked up translator doesn't know what that means. I suspect that (+size) is
an attempt to force size to an integer prior to it being forced to a string. I
used to believe that conversion was always written 0-x, but experimentally
(+"3") ends up as 3 not "3".
Naively, I imagined that re would complain about ambiguous regular expressions,
but in the regexp world n problems --> n+1 problems almost surely so I should
have anticipated it :)
Does this in fact mean that almost any broken regexp specification will silently
fail because re will reset and consider any metacharacter as literal?
--
Robin Becker
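
A small sketch of the behaviour asked about above, in Python 3: when the text after `{` does not form a complete `{m,n}` quantifier, `re` appears to treat the brace as an ordinary literal rather than raising an error. The patterns and sample strings here are made up for illustration and are not the ones from the original translation.

```python
import re

# Well-formed quantifier: splits the string into pieces of at most 3 characters.
good = re.findall(r'.{1,3}', 'abcdefg')

# Malformed quantifier ('k' is not a number): this still compiles, and
# '{', '1', ',', 'k', '}' are each matched as literal characters.
bad = re.compile(r'.{1,k}')
literal_hit = bad.fullmatch('a{1,k}') is not None   # matches the literal text
chunk_hit = bad.search('abcdefg') is not None       # no '{' in the input, so no match

print(good, literal_hit, chunk_hit)
```

So a pattern built from a non-numeric size would not fail loudly; it would simply stop matching the intended input.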
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-04-28 11:49 +0000 |
| Message-ID | <535e4037$0$29965$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #70672 |
On Mon, 28 Apr 2014 10:47:54 +0100, Robin Becker wrote:
> Does this in fact mean that almost any broken regexp specification will
> silently fail because re will reset and consider any metacharacter as
> literal?
Well, I don't know about "almost any", but at least some broken regexes
will explicitly fail:
py> import re
py> re.search('*', "123*4")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.3/re.py", line 161, in search
    return _compile(pattern, flags).search(string)
[...]
  File "/usr/local/lib/python3.3/sre_parse.py", line 552, in _parse
    raise error("nothing to repeat")
sre_constants.error: nothing to repeat
(For brevity I have abbreviated the traceback.)
--
Steven D'Aprano
http://import-that.dreamwidth.org/
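
One way to see which broken patterns are rejected and which are silently accepted is to try compiling them and catch `re.error`. This is only a sketch: `compiles` is a hypothetical helper, and the sample patterns are chosen for illustration.

```python
import re

def compiles(pattern):
    """Return True if `pattern` compiles, False if re rejects it."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True

# '*' and '(' are rejected outright; an unfinished '{1,' is accepted
# because the brace falls back to being a literal character.
results = {p: compiles(p) for p in ['*', '(', 'a{1,', 'a{1,2}']}
print(results)
```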
| From | Robin Becker <robin@reportlab.com> |
|---|---|
| Date | 2014-04-28 14:06 +0100 |
| Message-ID | <mailman.9548.1398690389.18130.python-list@python.org> |
| In reply to | #70675 |
On 28/04/2014 12:49, Steven D'Aprano wrote:
......
>
> Well, I don't know about "almost any", but at least some broken regexes
> will explicitly fail:
>
> py> import re
........
> sre_constants.error: nothing to repeat
>
> (For brevity I have abbreviated the traceback.)
>
so there is intent to catch some specification errors. I've abandoned this
translation anyhow as all that was intended was to split the string into
non-overlapping strings of size at most k.
I find this works faster than the regexp even if the regexp is pre-compiled.
[p[i:i+k] for i in xrange(0,len(p),k)]
--
Robin Becker
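
For comparison, here is a sketch of the slicing approach next to a regex version, written for Python 3 (where `xrange` is spelled `range`). The function names `chunks` and `chunks_re` are hypothetical, and `re.DOTALL` is an added assumption so that `.` also matches newlines and the two functions agree on any input.

```python
import re

def chunks(p, k):
    """Split p into consecutive, non-overlapping pieces of length at most k."""
    return [p[i:i + k] for i in range(0, len(p), k)]

def chunks_re(p, k):
    """Regex version; int(k) plays the role of (+size) in the JavaScript."""
    return re.findall('.{1,%d}' % int(k), p, flags=re.DOTALL)

sliced = chunks('abcdefg', 3)
matched = chunks_re('abcdefg', 3)
print(sliced, matched)
```

Forcing `k` through `int()` before formatting means a non-numeric size raises immediately instead of producing a pattern whose braces are read as literals.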