Re: Regular expression negative look-ahead

Date Wed, 3 Jul 2013 20:49:23 -0600
Subject Re: Regular expression negative look-ahead
From Jason Friedman <jsf80238@gmail.com>

Huh, I did not realize that endswith accepts a tuple of suffixes.  I'll
remember that in the future.

I actually need this for SchemaSpy (http://schemaspy.sourceforge.net/),
which lets you include only the tables/views that match a pattern.

Either there is a bug in SchemaSpy's code, or Java's implementation of
regular expressions differs from Python's, or there is a flaw in my
logic, because the pattern I verified using Python produces different
results when used with SchemaSpy.  I suppose I'll open a bug there unless
I can find the aforementioned flaw.
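
One candidate for the flaw, sketched here under an unconfirmed assumption:
if the consuming tool requires the pattern to match the whole name (as
Java's Matcher.matches() does) rather than searching for a match, a pattern
that is only a lookahead can never succeed on a non-empty name. Whether
SchemaSpy applies its patterns that way is not stated in this thread. The
sketch uses Python's re.fullmatch to imitate whole-string matching; WHOLE
is a hypothetical variant, not a pattern from the thread.

```python
import re

# Table names from the original post.
TABLES = [
    "MY_TABLE", "MY_TABLE_CTL", "MY_TABLE_DEL", "MY_TABLE_RUN",
    "YOUR_TABLE", "YOUR_TABLE_CTL", "YOUR_TABLE_DEL", "YOUR_TABLE_RUN",
]

# The pattern reported as working earlier in the thread.
PATTERN = re.compile(r"^(?!.*(CTL|DEL|RUN))")

# search() is satisfied by an empty match at position 0.
via_search = [t for t in TABLES if PATTERN.search(t)]

# fullmatch() needs the pattern to consume the entire string, but the
# lookahead consumes nothing, so no non-empty name can pass.
via_fullmatch = [t for t in TABLES if PATTERN.fullmatch(t)]

# Hypothetical variant: the trailing .* consumes the rest of the name.
WHOLE = re.compile(r"^(?!.*(CTL|DEL|RUN)).*")
via_whole = [t for t in TABLES if WHOLE.fullmatch(t)]

print(via_search)
print(via_fullmatch)
print(via_whole)
```

If the tool does use whole-string matching, the variant with the trailing
.* is the one to try there; if it searches instead, both forms should
behave the same.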


On Mon, Jul 1, 2013 at 11:44 PM, Ian Kelly <ian.g.kelly@gmail.com> wrote:

> On Mon, Jul 1, 2013 at 8:27 PM, Jason Friedman <jsf80238@gmail.com> wrote:
> > Found this:
> > http://stackoverflow.com/questions/13871833/negative-lookahead-assertion-not-working-in-python
> >
> > This pattern seems to work:
> > pattern = re.compile(r"^(?!.*(CTL|DEL|RUN))")
> >
> > But I am not sure why.
> >
> >
> > On Mon, Jul 1, 2013 at 5:07 PM, Jason Friedman <jsf80238@gmail.com> wrote:
> >>
> >> I have table names in this form:
> >> MY_TABLE
> >> MY_TABLE_CTL
> >> MY_TABLE_DEL
> >> MY_TABLE_RUN
> >> YOUR_TABLE
> >> YOUR_TABLE_CTL
> >> YOUR_TABLE_DEL
> >> YOUR_TABLE_RUN
> >>
> >> I am trying to create a regular expression that will return true for
> >> only these tables:
> >> MY_TABLE
> >> YOUR_TABLE
> >>
> >> I tried these:
> >> pattern = re.compile(r"_(?!(CTL|DEL|RUN))")
> >> pattern = re.compile(r"\w+(?!(CTL|DEL|RUN))")
> >> pattern = re.compile(r"(?!(CTL|DEL|RUN)$)")
> >>
> >> But, both match.
> >> I do not need to capture anything.
>
>
> For some reason I don't seem to have a copy of your initial post.
>
> That regex works because you're anchoring it at the start of the
> string and then telling it to match only if ".*(CTL|DEL|RUN)"
> /doesn't/ match.  For a name such as "MY_TABLE_CTL", that inner
> pattern does match starting from the beginning of the string, so the
> pattern as a whole does not match.
>
> The other three do not work because the lookahead assertions are not
> properly anchored.  The first one can match the first underscore in
> "MY_TABLE_CTL" instead of the second, and then the next three
> characters are "TAB", not any of the verboten strings, so it matches.
> The second one matches any substring of "MY_TABLE_CTL" that isn't
> followed by one of those three strings.  So it will just match the
> entire string "MY_TABLE_CTL", and the rest of the string is then
> empty, so it does not match any of them, and the name gets accepted
> too.  The third one simply matches an empty string that isn't
> followed by one of those three at the end of the string, so it will
> just match at the very start of the string, where the text that
> follows ("MY_TABLE_CTL") is not one of them.
>
> Now, all that said, are you sure you actually need a regular
> expression for this?  It seems to me that you're overcomplicating
> things.  Since you don't need to capture anything, your need can be
> met more simply with:
>
> if not table_name.endswith(('_CTL', '_DEL', '_RUN')):
>     # Do whatever
>
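
The explanation quoted above can be checked with a short sketch. It
assumes the three attempts were tried with re.search (the original post
does not say which function was used) and records where each one matches
in "MY_TABLE_CTL". It then applies the endswith test to the table list.
"RUN_HISTORY" is a made-up name, used only to show where the working regex
and endswith disagree.

```python
import re

# Table names from the original post.
TABLES = [
    "MY_TABLE", "MY_TABLE_CTL", "MY_TABLE_DEL", "MY_TABLE_RUN",
    "YOUR_TABLE", "YOUR_TABLE_CTL", "YOUR_TABLE_DEL", "YOUR_TABLE_RUN",
]
SUFFIXES = ("_CTL", "_DEL", "_RUN")  # endswith takes a tuple, not a list

# The three attempts from the original post, in order.
ATTEMPTS = [
    r"_(?!(CTL|DEL|RUN))",
    r"\w+(?!(CTL|DEL|RUN))",
    r"(?!(CTL|DEL|RUN)$)",
]

# Span (start, end) of the first match of each attempt in an unwanted name.
spans = {raw: re.search(raw, "MY_TABLE_CTL").span() for raw in ATTEMPTS}
for raw, span in spans.items():
    print(raw, span)

# The suggested non-regex filter.
wanted = [t for t in TABLES if not t.endswith(SUFFIXES)]
print(wanted)

# Hypothetical name: the working regex rejects CTL/DEL/RUN anywhere in the
# name, while endswith only looks at the suffix.
working = re.compile(r"^(?!.*(CTL|DEL|RUN))")
regex_accepts = working.search("RUN_HISTORY") is not None
endswith_accepts = not "RUN_HISTORY".endswith(SUFFIXES)
print(regex_accepts, endswith_accepts)
```

The spans show the first attempt stopping at the first underscore, the
second swallowing the whole name, and the third matching the empty string
at position 0. The last two lines show that the regex and the endswith
test are not interchangeable for names that contain CTL, DEL, or RUN
somewhere other than the end.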
