From: Ian Kelly
Date: Mon, 1 Jul 2013 23:44:31 -0600
Subject: Re: Regular expression negative look-ahead
To: Python
Newsgroups: comp.lang.python

On Mon, Jul 1, 2013 at 8:27 PM, Jason Friedman wrote:
> Found this:
> http://stackoverflow.com/questions/13871833/negative-lookahead-assertion-not-working-in-python.
>
> This pattern seems to work:
> pattern = re.compile(r"^(?!.*(CTL|DEL|RUN))")
>
> But I am not sure why.
>
>
> On Mon, Jul 1, 2013 at 5:07 PM, Jason Friedman wrote:
>>
>> I have table names in this form:
>> MY_TABLE
>> MY_TABLE_CTL
>> MY_TABLE_DEL
>> MY_TABLE_RUN
>> YOUR_TABLE
>> YOUR_TABLE_CTL
>> YOUR_TABLE_DEL
>> YOUR_TABLE_RUN
>>
>> I am trying to create a regular expression that will return true for only
>> these tables:
>> MY_TABLE
>> YOUR_TABLE
>>
>> I tried these:
>> pattern = re.compile(r"_(?!(CTL|DEL|RUN))")
>> pattern = re.compile(r"\w+(?!(CTL|DEL|RUN))")
>> pattern = re.compile(r"(?!(CTL|DEL|RUN)$)")
>>
>> But, both match.
>> I do not need to capture anything.

For some reason I don't seem to have a copy of your initial post.

The reason that regex works is because you're anchoring it at the start of
the string and then telling it to match only if ".*(CTL|DEL|RUN)" /doesn't/
match.
For a name such as "MY_TABLE_CTL", that inner pattern does match starting
from the beginning of the string, so the negative assertion fails and the
pattern as a whole does not match.

The other three do not work because their look-ahead assertions are not
properly anchored.

The first one can match the first underscore in "MY_TABLE_CTL" instead of
the second. The next three characters are then "TAB", not any of the
verboten strings, so it matches.

The second one matches any run of word characters that isn't followed by
"CTL", "DEL" or "RUN". So it will just match the entire string
"MY_TABLE_CTL". The rest of the string is then empty, which does not match
any of those three strings, so it too gets accepted.

The third one simply matches an empty string that isn't followed by one of
those three at the end of the string. So it will just match at the very
start of the string, where the next three characters ("MY_") satisfy the
negative assertion.

Now, all that said, are you sure you actually need a regular expression
for this? It seems to me that you're overcomplicating things. Since you
don't need to capture anything, your need can be met more simply with:

    if not table_name.endswith(('_CTL', '_DEL', '_RUN')):
        pass  # Do whatever
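
As a quick check of the explanation above, here is a minimal sketch that runs all four patterns over the table names from the original post. It assumes the patterns are applied with re.search (the post does not say which method was used). The dictionary labels and the accepted() helper are just illustrative names, not anything from the thread.

```python
import re

# Table names taken from the original post.
names = [
    "MY_TABLE", "MY_TABLE_CTL", "MY_TABLE_DEL", "MY_TABLE_RUN",
    "YOUR_TABLE", "YOUR_TABLE_CTL", "YOUR_TABLE_DEL", "YOUR_TABLE_RUN",
]

# The working pattern plus the three attempts; the labels are illustrative.
patterns = {
    "anchored": r"^(?!.*(CTL|DEL|RUN))",
    "first": r"_(?!(CTL|DEL|RUN))",
    "second": r"\w+(?!(CTL|DEL|RUN))",
    "third": r"(?!(CTL|DEL|RUN)$)",
}


def accepted(pattern, candidates):
    """Return the candidates in which re.search finds a match (assumed usage)."""
    regex = re.compile(pattern)
    return [name for name in candidates if regex.search(name)]


results = {label: accepted(pattern, names) for label, pattern in patterns.items()}

# The regex-free alternative: str.endswith accepts a tuple of suffixes.
by_suffix = [n for n in names if not n.endswith(("_CTL", "_DEL", "_RUN"))]

for label, hits in results.items():
    print(label, hits)
print("endswith", by_suffix)
```

Under that assumption, only the anchored pattern and the endswith() test single out MY_TABLE and YOUR_TABLE. The other three patterns accept every name, for the reasons given above.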