
Re: Regular Expression for the special character "|" pipe

From Roy Smith <roy@panix.com>
Newsgroups comp.lang.python
Subject Re: Regular Expression for the special character "|" pipe
Date 2014-05-27 08:35 -0400
Organization PANIX Public Access Internet and UNIX, NYC
Message-ID <roy-ED3A2B.08355327052014@news.panix.com>
References <9c8e58be-9619-44c7-8098-961a0134c422@googlegroups.com> <CAHzaPEPqruhi57cK5BQt0roRRkqKOfgAdM5Yoe-stM+Ta5FB9w@mail.gmail.com> <mailman.10369.1401190577.18130.python-list@python.org> <3cc77455-39ed-4403-a46c-5dd8e640a483@googlegroups.com> <mailman.10370.1401191774.18130.python-list@python.org>



In article <mailman.10370.1401191774.18130.python-list@python.org>,
 Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:

> On 27.05.2014 13:39, Aman Kashyap wrote:
> >> On 27.05.2014 14:09, Vlastimil Brom wrote:
> >>
> >>> you can just escape the pipe with a backslash like any other metacharacter:
> >>>
> >>> r"start=\|ID=ter54rt543d"
> >>>
> >>> be sure to use the raw string notation r"...", or you can double all
> >>> backslashes in the string.
> >>
> > Thanks for the response.
> >
> > I got the answer finally.
> >
> > This is the regular expression to be used:
> > \\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\\|
> >
> 
> or, and more readable:
> 
> r'\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\|'
> 
> This is what Vlastimil was talking about. It saves you from having to 
> escape the backslashes.

Sometimes, instead of using backslashes, I put the problem character 
into a character class by itself.  Which way is easier to read is a 
matter of personal opinion, but it certainly eliminates all the 
questions about "how many backslashes do I need?"

> r'[|]ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*[|]'
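
As a quick check that these spellings really are interchangeable, here is a minimal sketch.  The sample string and the variable names are made up for illustration; they are not from the original question.

```python
import re

# Three spellings of the same pattern from this thread.
escaped_plain = '\\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\\|'   # doubled backslashes
escaped_raw = r'\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\|'      # raw string
char_class = r'[|]ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*[|]'     # no backslashes at all

# The first two are literally the same string once Python has parsed them.
same_string = (escaped_plain == escaped_raw)

# Hypothetical input line, assumed for this example only.
sample = 'start=|ID=ter54rt543d|end'

# The backslash form and the character-class form find the same text.
matches = [re.search(p, sample).group(0) for p in (escaped_raw, char_class)]
print(same_string, matches)
```

Inside a character class the pipe has no special meaning, which is why no escaping is needed there.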

Another thing that can help make regexes easier to read is the VERBOSE 
flag.  It makes the regex engine ignore whitespace in the pattern, except 
inside a character class or after a backslash, and it lets you add 
comments starting with # (see 
https://docs.python.org/2/library/re.html#module-contents for details).  
So, you can write something like:

import re

pattern = re.compile(r'''[|]
                         ID=
                         [a-z]*
                         [0-9]*
                         [a-z]*
                         [0-9]*
                         [a-z]*
                         [|]''',
                     re.VERBOSE)
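
One thing to watch with VERBOSE is that a literal space in the pattern no longer matches a space unless you protect it.  The sketch below, with a made-up sample string and made-up names, compares the verbose and compact forms and shows the whitespace behaviour.

```python
import re

compact = re.compile(r'[|]ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*[|]')
verbose = re.compile(r'''
    [|]        # opening pipe, as a one-character class
    ID=
    [a-z]* [0-9]* [a-z]* [0-9]* [a-z]*
    [|]        # closing pipe
    ''', re.VERBOSE)

# Hypothetical input, assumed for this example only.
sample = '|ID=ter54rt543d|'
compact_hit = compact.match(sample).group(0)
verbose_hit = verbose.match(sample).group(0)
print(compact_hit == verbose_hit)

# In VERBOSE mode a bare space is dropped from the pattern ...
space_ignored = re.compile(r'a b', re.VERBOSE)
# ... unless it sits inside a character class (or follows a backslash).
space_kept = re.compile(r'a[ ]b', re.VERBOSE)

print(space_ignored.match('ab') is not None)
print(space_ignored.match('a b') is not None)
print(space_kept.match('a b') is not None)
```

This is one more argument for the [ ] habit: the same trick that avoids backslashes for the pipe also keeps a space significant in a verbose pattern.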

Or, alternatively, take advantage of the fact that Python concatenates 
adjacent string literals, and write it like this:

pattern = re.compile(r'[|]'
                     r'ID='
                     r'[a-z]*'
                     r'[0-9]*'
                     r'[a-z]*'
                     r'[0-9]*'
                     r'[a-z]*'
                     r'[|]')
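
With the concatenation style, the comments are ordinary Python comments and the regex engine only ever sees the joined string.  The catch is that every piece needs its own r prefix.  A minimal sketch follows; the \bID pieces at the end are a made-up illustration and not part of the pattern discussed in this thread.

```python
import re

pattern = re.compile(
    r'[|]'       # opening pipe
    r'ID='
    r'[a-z]*'
    r'[0-9]*'
    r'[a-z]*'
    r'[0-9]*'
    r'[a-z]*'
    r'[|]'       # closing pipe
)

# The compiler joins adjacent literals, so re.compile gets a single string.
joined = pattern.pattern
print(joined)

# Each literal is raw (or not) on its own.  Forgetting one r prefix turns
# \b into a backspace character instead of a backslash followed by 'b'.
raw_piece = r'\bID'
plain_piece = '\bID'
print(len(raw_piece), len(plain_piece))
```

Unlike VERBOSE, this style does not change how the regex engine treats whitespace, so a space inside one of the pieces still matches a space.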


