Re: Regular expression negative look-ahead

References <CANy1k1iGTYjAVzTnNegXuW9FaoqZsXe1z8TMhzAN1VYeyDUnSQ@mail.gmail.com> <CANy1k1jGBgLj45ngn6qNd8Okx_tGnXdWQo0+4yTsNr=XdxUSXg@mail.gmail.com> <CALwzidnpEdA3wdXYbOjQrK_VOUDvEwCVbfPyHKBMich9w7bs8g@mail.gmail.com>
Date 2013-07-03 20:49 -0600
Subject Re: Regular expression negative look-ahead
From Jason Friedman <jsf80238@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.4199.1372906589.3114.python-list@python.org> (permalink)

Huh, I did not realize that endswith accepts a tuple of suffixes.  I'll
remember that in the future.

This need is actually for http://schemaspy.sourceforge.net/, which allows
one to include only tables/views that match a pattern.

Either there is a bug in SchemaSpy's code, or Java's implementation of
regular expressions differs from Python's, or there is a flaw in my logic,
because the pattern I verified in Python produces different results when
used with SchemaSpy.  I suppose I'll open a bug there unless I can find the
aforementioned flaw.
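
One difference that may be worth ruling out, offered as an assumption
rather than a diagnosis of SchemaSpy: Java's Matcher.matches() and
String.matches() require the pattern to cover the entire input, whereas
Python's re.search() accepts a match anywhere.  A lookahead-only pattern
consumes no characters, so it can never cover a non-empty name.  The sketch
below imitates the two behaviours in Python with search() and fullmatch();
the variant with a trailing ".*" is hypothetical and not from the thread.

```python
import re

names = ["MY_TABLE", "MY_TABLE_CTL", "YOUR_TABLE", "YOUR_TABLE_RUN"]

# Pattern from the thread: a zero-width lookahead anchored at the start.
lookahead_only = re.compile(r"^(?!.*(CTL|DEL|RUN))")
# Hypothetical variant: the same lookahead followed by ".*" so that a
# successful match can also consume the whole name.
lookahead_plus_rest = re.compile(r"^(?!.*(CTL|DEL|RUN)).*")


def keep(pattern, method):
    """Return the names for which pattern.<method>(name) finds a match."""
    return [n for n in names if getattr(pattern, method)(n)]


search_hits = keep(lookahead_only, "search")
fullmatch_hits = keep(lookahead_only, "fullmatch")
fullmatch_rest_hits = keep(lookahead_plus_rest, "fullmatch")

print(search_hits)          # ['MY_TABLE', 'YOUR_TABLE']
print(fullmatch_hits)       # []
print(fullmatch_rest_hits)  # ['MY_TABLE', 'YOUR_TABLE']
```

If the tool does demand a whole-string match, the lookahead-only pattern
would select nothing there even though it behaves as intended with
search() in Python.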


On Mon, Jul 1, 2013 at 11:44 PM, Ian Kelly <ian.g.kelly@gmail.com> wrote:

> On Mon, Jul 1, 2013 at 8:27 PM, Jason Friedman <jsf80238@gmail.com> wrote:
> > Found this:
> > http://stackoverflow.com/questions/13871833/negative-lookahead-assertion-not-working-in-python
> >
> > This pattern seems to work:
> > pattern = re.compile(r"^(?!.*(CTL|DEL|RUN))")
> >
> > But I am not sure why.
> >
> >
> > On Mon, Jul 1, 2013 at 5:07 PM, Jason Friedman <jsf80238@gmail.com> wrote:
> >>
> >> I have table names in this form:
> >> MY_TABLE
> >> MY_TABLE_CTL
> >> MY_TABLE_DEL
> >> MY_TABLE_RUN
> >> YOUR_TABLE
> >> YOUR_TABLE_CTL
> >> YOUR_TABLE_DEL
> >> YOUR_TABLE_RUN
> >>
> >> I am trying to create a regular expression that will return true
> >> only for these tables:
> >> MY_TABLE
> >> YOUR_TABLE
> >>
> >> I tried these:
> >> pattern = re.compile(r"_(?!(CTL|DEL|RUN))")
> >> pattern = re.compile(r"\w+(?!(CTL|DEL|RUN))")
> >> pattern = re.compile(r"(?!(CTL|DEL|RUN)$)")
> >>
> >> But each of them matches every table name, with or without a suffix.
> >> I do not need to capture anything.
>
>
> For some reason I don't seem to have a copy of your initial post.
>
> The reason that regex works is that you're anchoring it at the start
> of the string and then telling it to match only if
> ".*(CTL|DEL|RUN)" /doesn't/ match.  For a name such as "MY_TABLE_CTL"
> that inner pattern does match starting from the beginning of the
> string, so the pattern as a whole does not match.
>
> The reason that the other three do not work is that the lookahead
> assertions are not properly anchored.  The first one can match the
> first underscore in "MY_TABLE_CTL" instead of the second, and then the
> next three characters are "TAB", not any of the verboten strings, so
> it matches.  The second one matches any run of word characters in
> "MY_TABLE_CTL" that isn't followed by one of the three strings.  So it
> will just match the entire string "MY_TABLE_CTL", and the rest of the
> string is then empty, so does not match any of those three strings, so
> it too gets accepted.  The third one simply matches an empty string
> that isn't followed by one of those three and then the end of the
> string, so it will just match at the very start, where the text that
> follows is the whole name rather than a bare "CTL", "DEL" or "RUN".
>
> Now, all that said, are you sure you actually need a regular
> expression for this?  It seems to me that you're overcomplicating
> things.  Since you don't need to capture anything, your need can be
> met more simply with:
>
> if not table_name.endswith(('_CTL', '_DEL', '_RUN')):
>     # Do whatever
>
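
The behaviour described in the quoted explanation can be checked with a
short script.  This is a minimal sketch, assuming the patterns are applied
with search() (the thread does not say which method was used); the
dictionary labels and helper names are made up for illustration.

```python
import re

TABLES = [
    "MY_TABLE", "MY_TABLE_CTL", "MY_TABLE_DEL", "MY_TABLE_RUN",
    "YOUR_TABLE", "YOUR_TABLE_CTL", "YOUR_TABLE_DEL", "YOUR_TABLE_RUN",
]
SUFFIXES = ("_CTL", "_DEL", "_RUN")

# The three original attempts plus the anchored pattern found later.
PATTERNS = {
    "underscore": r"_(?!(CTL|DEL|RUN))",
    "word": r"\w+(?!(CTL|DEL|RUN))",
    "unanchored": r"(?!(CTL|DEL|RUN)$)",
    "anchored": r"^(?!.*(CTL|DEL|RUN))",
}


def selected_by_regex(pattern):
    """Names for which the pattern matches somewhere (re.search semantics)."""
    regex = re.compile(pattern)
    return [t for t in TABLES if regex.search(t)]


def selected_by_endswith():
    """Names that do not end with any suffix; endswith takes a tuple."""
    return [t for t in TABLES if not t.endswith(SUFFIXES)]


for label, pattern in PATTERNS.items():
    print(label, selected_by_regex(pattern))
print("endswith", selected_by_endswith())
```

Only the "anchored" pattern and the endswith filter select just MY_TABLE
and YOUR_TABLE; the other three patterns select all eight names.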
