| Started by | Aman Kashyap <amankashyap1223@gmail.com> |
|---|---|
| First post | 2014-05-27 03:59 -0700 |
| Last post | 2014-05-27 14:06 +0100 |
| Articles | 8 — 6 participants |
Regular Expression for the special character "|" pipe Aman Kashyap <amankashyap1223@gmail.com> - 2014-05-27 03:59 -0700
Re: Regular Expression for the special character "|" pipe Vlastimil Brom <vlastimil.brom@gmail.com> - 2014-05-27 13:09 +0200
Re: Regular Expression for the special character "|" pipe Aman Kashyap <amankashyap1223@gmail.com> - 2014-05-27 04:20 -0700
Re: Regular Expression for the special character "|" pipe Daniel <5960761@gmail.com> - 2014-05-27 14:29 +0300
Re: Regular Expression for the special character "|" pipe Aman Kashyap <amankashyap1223@gmail.com> - 2014-05-27 04:39 -0700
Re: Regular Expression for the special character "|" pipe Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> - 2014-05-27 13:55 +0200
Re: Regular Expression for the special character "|" pipe Roy Smith <roy@panix.com> - 2014-05-27 08:35 -0400
Re: Regular Expression for the special character "|" pipe Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-05-27 14:06 +0100
| From | Aman Kashyap <amankashyap1223@gmail.com> |
|---|---|
| Date | 2014-05-27 03:59 -0700 |
| Subject | Regular Expression for the special character "|" pipe |
| Message-ID | <9c8e58be-9619-44c7-8098-961a0134c422@googlegroups.com> |
I would like to create a regular expression that can also match the "|" special character.

For example, given:

start=|ID=ter54rt543d|SID=ter54rt543d|end=|

I want to extract only |ID=ter54rt543d| from the string above, but I am unable to write a pattern that itself contains the "|" pipe.

By default Python treats "|" as an OR operator, but in my case I want to use it as part of the search string.
| From | Vlastimil Brom <vlastimil.brom@gmail.com> |
|---|---|
| Date | 2014-05-27 13:09 +0200 |
| Message-ID | <mailman.10368.1401188967.18130.python-list@python.org> |
| In reply to | #72107 |
2014-05-27 12:59 GMT+02:00 Aman Kashyap <amankashyap1223@gmail.com>:
> I would like to create a regular expression in which I can match the "|" special character too.
>
> e.g.
>
> start=|ID=ter54rt543d|SID=ter54rt543d|end=|
>
> I want to only |ID=ter54rt543d| from the above string but I am unable to write the pattern match containing "|" pipe too.
>
> By default Python treats "|" as an OR operator.
>
> But in my case I want to use it as a part of the search string.
> --

Hi,
you can just escape the pipe with a backslash, like any other metacharacter:

r"start=\|ID=ter54rt543d"

Be sure to use the raw string notation r"...", or else double all backslashes in the string.

hth,
vbr
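As a minimal sketch of the advice above: escaping the pipe makes it match literally. The variable names and the `[a-z0-9]+` value pattern are illustrative assumptions, not something taken from the thread. When the text to match is already held in a variable, `re.escape` can do the escaping.

```python
import re

text = "start=|ID=ter54rt543d|SID=ter54rt543d|end=|"

# An escaped pipe matches a literal "|" instead of acting as alternation.
# The [a-z0-9]+ value pattern is an assumption made for this example.
match = re.search(r"\|ID=[a-z0-9]+\|", text)
found = match.group(0) if match else None

# re.escape() backslash-escapes the metacharacters in a literal string.
literal = "|ID=ter54rt543d|"
found_literal = re.search(re.escape(literal), text).group(0)

print(found)          # |ID=ter54rt543d|
print(found_literal)  # |ID=ter54rt543d|
```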
| From | Aman Kashyap <amankashyap1223@gmail.com> |
|---|---|
| Date | 2014-05-27 04:20 -0700 |
| Message-ID | <e0a96d65-95a1-401a-9335-ab2e24397896@googlegroups.com> |
| In reply to | #72110 |
On Tuesday, 27 May 2014 16:39:19 UTC+5:30, Vlastimil Brom wrote:
> Hi,
> you can just escape the pipe with a backslash, like any other metacharacter:
>
> r"start=\|ID=ter54rt543d"
>
> be sure to use the raw string notation r"...", or you can double all
> backslashes in the string.
>
> hth,
> vbr

Thanks, vbr, for the quick response.

I have the string

|SOH=|ID=re65dgt5dd|DS=fjkjf|SDID=fhkhkf|ID=fkjfkf|EOM=|

and want to replace two substrings:

|ID=re65dgt5dd| with |ID=MAN|
|ID=fkjfkf| with |ID=MAN|

I am using the regular expression

ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*|$

The output is

|SOH=|ID=MAN|DS=fjkjf|SDID=MAN|ID=MAN|EOM=|ID=MAN

but the expected value is

|SOH=|ID=MAN|DS=fjkjf|SDID=fhkhkf|ID=MAN|EOM=|

Could you please help me with this?
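For illustration, here is a small sketch of why the unescaped pattern behaves this way. The replacement string "ID=MAN" is an assumption inferred from the output reported above, since the post does not show the substitution call. The bare "|" splits the pattern into two alternatives, and the missing leading pipe lets "ID=" match inside "SDID=".

```python
import re

s = "|SOH=|ID=re65dgt5dd|DS=fjkjf|SDID=fhkhkf|ID=fkjfkf|EOM=|"

# The unescaped "|" splits the pattern into two alternatives:
#   ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*    or    $
loose = r"ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*|$"

# Every non-overlapping match, including the empty match of "$" at the end.
matches = re.findall(loose, s)

# Assumed replacement string; this reproduces the unwanted output.
result = re.sub(loose, "ID=MAN", s)

print(matches)  # ['ID=re65dgt5dd', 'ID=fhkhkf', 'ID=fkjfkf', '']
print(result)   # |SOH=|ID=MAN|DS=fjkjf|SDID=MAN|ID=MAN|EOM=|ID=MAN
```

The second match comes from the tail of "SDID=fhkhkf", and the trailing "ID=MAN" comes from the empty match of "$".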
| From | Daniel <5960761@gmail.com> |
|---|---|
| Date | 2014-05-27 14:29 +0300 |
| Message-ID | <mailman.10369.1401190577.18130.python-list@python.org> |
| In reply to | #72107 |
What about skipping the re module and trying this:
'start=|ID=ter54rt543d|SID=ter54rt543d|end=|'.split('|')[1][3:]
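The same split-based idea can be stretched to the replacement task from the follow-up post. This is only a sketch: the helper name `replace_id_fields` and its `new_value` parameter are hypothetical, and it assumes that every field is delimited by "|" and that only fields named exactly "ID" should change.

```python
def replace_id_fields(record, new_value="MAN"):
    """Replace the value of every field named exactly "ID" in a pipe-delimited record."""
    fields = record.split("|")
    # Test for the full "ID=" prefix so fields such as "SDID=..." are left alone.
    updated = ["ID=" + new_value if f.startswith("ID=") else f for f in fields]
    return "|".join(updated)


s = "|SOH=|ID=re65dgt5dd|DS=fjkjf|SDID=fhkhkf|ID=fkjfkf|EOM=|"
result = replace_id_fields(s)
print(result)  # |SOH=|ID=MAN|DS=fjkjf|SDID=fhkhkf|ID=MAN|EOM=|
```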
| From | Aman Kashyap <amankashyap1223@gmail.com> |
|---|---|
| Date | 2014-05-27 04:39 -0700 |
| Message-ID | <3cc77455-39ed-4403-a46c-5dd8e640a483@googlegroups.com> |
| In reply to | #72112 |
On Tuesday, 27 May 2014 16:59:38 UTC+5:30, Daniel wrote:
> What about skipping the re and try this:
>
> 'start=|ID=ter54rt543d|SID=ter54rt543d|end=|'.split('|')[1][3:]
Thanks for the response.
I finally got the answer.
This is the regular expression to use: \\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\\|
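A quick sketch of that pattern applied to the string from the earlier post. The replacement string "|ID=MAN|" is an assumption, since the thread does not show the actual substitution call. The `lookahead` variant is a suggestion of this sketch and was not proposed in the thread.

```python
import re

s = "|SOH=|ID=re65dgt5dd|DS=fjkjf|SDID=fhkhkf|ID=fkjfkf|EOM=|"

# In an ordinary string literal "\\|" becomes the two characters \| in the
# regex, i.e. a literal pipe.
pattern = "\\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\\|"
result = re.sub(pattern, "|ID=MAN|", s)

# Caveat: each match consumes its closing pipe, so when two ID fields are
# adjacent the second one is skipped. A lookahead leaves the pipe in place.
lookahead = r"\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*(?=\|)"
skipped = re.sub(pattern, "|ID=MAN|", "|ID=a|ID=b|")
adjacent = re.sub(lookahead, "|ID=MAN", "|ID=a|ID=b|")

print(result)    # |SOH=|ID=MAN|DS=fjkjf|SDID=fhkhkf|ID=MAN|EOM=|
print(skipped)   # |ID=MAN|ID=b|
print(adjacent)  # |ID=MAN|ID=MAN|
```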
| From | Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> |
|---|---|
| Date | 2014-05-27 13:55 +0200 |
| Message-ID | <mailman.10370.1401191774.18130.python-list@python.org> |
| In reply to | #72113 |
On 27.05.2014 13:39, Aman Kashyap wrote:
>> On 27.05.2014 14:09, Vlastimil Brom wrote:
>>
>>> you can just escape the pipe with a backslash like any other metacharacter:
>>>
>>> r"start=\|ID=ter54rt543d"
>>>
>>> be sure to use the raw string notation r"...", or you can double all
>>> backslashes in the string.
>
> Thanks for the response.
>
> I got the answer finally.
>
> This is the regular expression to be used:\\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\\|

or, more readably:

r'\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\|'

This is what Vlastimil was talking about. It saves you from having to escape the backslashes.
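To make the equivalence concrete, a short sketch (the variable names and the test string are illustrative): the doubled-backslash literal and the raw literal are the same string, so the regex engine sees the same pattern either way.

```python
import re

doubled = "\\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\\|"
raw = r"\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\|"

# Both literals produce exactly the same characters.
same = doubled == raw

# Each therefore finds the same match in a made-up test string.
hit = re.search(raw, "x|ID=ab12cd|y").group(0)

print(same)  # True
print(hit)   # |ID=ab12cd|
```

Writing "\|" in an ordinary literal with neither doubling nor the r prefix is an unrecognized string escape, and recent Python versions warn about it, which is one more reason to prefer the raw form.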
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2014-05-27 08:35 -0400 |
| Message-ID | <roy-ED3A2B.08355327052014@news.panix.com> |
| In reply to | #72114 |
In article <mailman.10370.1401191774.18130.python-list@python.org>,
Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:
> or, and more readable:
>
> r'\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\|'
>
> This is what Vlastimil was talking about. It saves you from having to
> escape the backslashes.
Instead of using backslashes, I sometimes put the problem character into
a character class by itself. Which way is easier to read is a matter of
personal opinion, but it certainly eliminates all the questions about
"how many backslashes do I need?"

r'[|]ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*[|]'
Another thing that can help make regexes easier to read is the VERBOSE
flag. Basically, it ignores whitespace inside the regex, except within a
character class or when escaped (see
https://docs.python.org/2/library/re.html#module-contents for details).
So, you can write something like:
import re

pattern = re.compile(r'''[|]
                         ID=
                         [a-z]*
                         [0-9]*
                         [a-z]*
                         [0-9]*
                         [a-z]*
                         [|]''',
                     re.VERBOSE)
Or, alternatively, take advantage of the fact that Python concatenates
adjacent string literals, and write it like this:
pattern = re.compile(r'[|]'
                     r'ID='
                     r'[a-z]*'
                     r'[0-9]*'
                     r'[a-z]*'
                     r'[0-9]*'
                     r'[a-z]*'
                     r'[|]'
                     )
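As a rough check that these spellings are interchangeable, the sketch below compiles the backslash form, the character-class form, and a VERBOSE form, then compares their matches on the string from earlier in the thread. The inline comments in the verbose pattern and the variable names are additions made for this example.

```python
import re

sample = "|SOH=|ID=re65dgt5dd|DS=fjkjf|SDID=fhkhkf|ID=fkjfkf|EOM=|"

escaped = re.compile(r"\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\|")
char_class = re.compile(r"[|]ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*[|]")
verbose = re.compile(
    r"""
    [|] ID=          # opening pipe and field name
    [a-z]* [0-9]*    # letters, then digits
    [a-z]* [0-9]*    # letters, then digits again
    [a-z]*           # trailing letters
    [|]              # closing pipe
    """,
    re.VERBOSE,
)

# All three patterns should find the same two fields.
results = [p.findall(sample) for p in (escaped, char_class, verbose)]
print(results[0])  # ['|ID=re65dgt5dd|', '|ID=fkjfkf|']
```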
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2014-05-27 14:06 +0100 |
| Message-ID | <mailman.10371.1401195996.18130.python-list@python.org> |
| In reply to | #72113 |
On 27/05/2014 12:39, Aman Kashyap wrote:
>
> Thanks for the response.
>
> I got the answer finally.
>
> This is the regular expression to be used:\\|ID=[a-z]*[0-9]*[a-z]*[0-9]*[a-z]*\\|
>
I'm pleased to see that you have answers. In return would you please
use the mailing list
https://mail.python.org/mailman/listinfo/python-list or read and action
this https://wiki.python.org/moin/GoogleGroupsPython to prevent us
seeing double line spacing and single line paragraphs, thanks.
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence