Groups > comp.lang.python > #5524 > unrolled thread
| Started by | Tracubik <affdfsdfdsfsd@b.com> |
|---|---|
| First post | 2011-05-16 16:25 +0000 |
| Last post | 2011-05-16 18:11 +0100 |
| Articles | 4 — 4 participants |
regular expression i'm going crazy Tracubik <affdfsdfdsfsd@b.com> - 2011-05-16 16:25 +0000
Re: regular expression i'm going crazy Robert Kern <robert.kern@gmail.com> - 2011-05-16 11:51 -0500
Re: regular expression i'm going crazy Alexander Kapps <alex.kapps@web.de> - 2011-05-16 19:01 +0200
Re: regular expression i'm going crazy andy baxter <andy@earthsong.free-online.co.uk> - 2011-05-16 18:11 +0100
| From | Tracubik <affdfsdfdsfsd@b.com> |
|---|---|
| Date | 2011-05-16 16:25 +0000 |
| Subject | regular expression i'm going crazy |
| Message-ID | <4dd14fdb$0$18238$4fafbaef@reader2.news.tin.it> |
pls help me fixing this:

import re
s = "linka la baba"
re_s = re.compile(r'(link|l)a' , re.IGNORECASE)

print re_s.findall(s)

output:
['link', 'l']

why?
i want my re_s to find linka and la, he just find link and l and forget
about the ending a.

can anyone help me? trying the regular expression in redemo.py (program
provided with python to explore the use of regular expression) i get what
i want, so i guess re_s is ok, but it still fail...
why?
help!

Nico
| From | Robert Kern <robert.kern@gmail.com> |
|---|---|
| Date | 2011-05-16 11:51 -0500 |
| Message-ID | <mailman.1647.1305564724.9059.python-list@python.org> |
| In reply to | #5524 |
On 5/16/11 11:25 AM, Tracubik wrote:
> pls help me fixing this:
>
> import re
> s = "linka la baba"
> re_s = re.compile(r'(link|l)a' , re.IGNORECASE)
>
> print re_s.findall(s)
>
> output:
> ['link', 'l']
>
> why?
> i want my re_s to find linka and la, he just find link and l and forget
> about the ending a.
>
> can anyone help me? trying the regular expression in redemo.py (program
> provided with python to explore the use of regular expression) i get what
> i want, so i guess re_s is ok, but it still fail...
> why?

The parentheses () create a capturing group, which specifies that the
contents of the group should be extracted. See the "(...)" entry here:

http://docs.python.org/library/re#regular-expression-syntax

You can use the non-capturing version of parentheses if you want to just
isolate the | from affecting the rest of the regex:

"""
(?:...)
A non-capturing version of regular parentheses. Matches whatever regular
expression is inside the parentheses, but the substring matched by the group
cannot be retrieved after performing a match or referenced later in the
pattern.
"""

[~]
|1> import re
[~]
|2> s = "linka la baba"
[~]
|3> re_s = re.compile(r'(?:link|l)a' , re.IGNORECASE)
[~]
|4> print re_s.findall(s)
['linka', 'la']

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
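
For anyone reading this thread on a current interpreter, here is a minimal sketch of the capturing versus non-capturing difference described above. It assumes Python 3, so it uses print() as a function where the posts use the Python 2 print statement. The names capturing, non_capturing and the two result variables are made up for this illustration and are not from the original posts.

```python
import re

s = "linka la baba"

# Capturing group: findall() returns only the text matched inside (...).
capturing = re.compile(r'(link|l)a', re.IGNORECASE)

# Non-capturing group: (?:...) still limits the scope of "|",
# but findall() now returns the whole match.
non_capturing = re.compile(r'(?:link|l)a', re.IGNORECASE)

capturing_result = capturing.findall(s)
non_capturing_result = non_capturing.findall(s)

print(capturing_result)      # ['link', 'l']
print(non_capturing_result)  # ['linka', 'la']
```

Both patterns match the same two substrings of s; only what findall() hands back differs.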
| From | Alexander Kapps <alex.kapps@web.de> |
|---|---|
| Date | 2011-05-16 19:01 +0200 |
| Message-ID | <mailman.1648.1305565658.9059.python-list@python.org> |
| In reply to | #5524 |
On 16.05.2011 18:25, Tracubik wrote:
> pls help me fixing this:
>
> import re
> s = "linka la baba"
> re_s = re.compile(r'(link|l)a' , re.IGNORECASE)
>
> print re_s.findall(s)
>
> output:
> ['link', 'l']
>
> why?

As the docs say:

"If one or more groups are present in the pattern, return a list of groups;"

http://docs.python.org/library/re.html?highlight=findall#re.findall

> i want my re_s to find linka and la, he just find link and l and forget
> about the ending a.

Try with non-capturing parentheses:

re_s = re.compile(r'(?:link|l)a' , re.IGNORECASE)
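
If the capturing group has to stay in the pattern for some other reason, one possible alternative to findall() is to iterate over match objects and ask each one for group(0), which is always the whole match. This is only a sketch, assuming Python 3; the variable names whole_matches and groups_only are invented for the example.

```python
import re

s = "linka la baba"
pattern = re.compile(r'(link|l)a', re.IGNORECASE)

# finditer() yields match objects instead of strings, so the caller
# chooses which part of each match to keep.
whole_matches = [m.group(0) for m in pattern.finditer(s)]  # entire match
groups_only = [m.group(1) for m in pattern.finditer(s)]    # first group only

print(whole_matches)  # ['linka', 'la']
print(groups_only)    # ['link', 'l']
```

The second list is the same thing findall() returned in the original question, which shows that the trailing "a" was matched all along and simply not reported.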
| From | andy baxter <andy@earthsong.free-online.co.uk> |
|---|---|
| Date | 2011-05-16 18:11 +0100 |
| Message-ID | <mailman.1649.1305565877.9059.python-list@python.org> |
| In reply to | #5524 |
On 16/05/11 17:25, Tracubik wrote:
> pls help me fixing this:
>
> import re
> s = "linka la baba"
> re_s = re.compile(r'(link|l)a' , re.IGNORECASE)
>
> print re_s.findall(s)
>
> output:
> ['link', 'l']
>
> why?
> i want my re_s to find linka and la, he just find link and l and forget
> about the ending a.
The round brackets define a 'capturing group'. I.e. when you do findall
it returns those elements in the string that match what's inside the
brackets. If you want to get linka and la, you need something like this:
>>> re_s = re.compile(r'((link|l)a)' , re.IGNORECASE)
>>> print re_s.findall(s)
[('linka', 'link'), ('la', 'l')]
Then just look at the first element in each of the tuples in the list
(which corresponds to the outer set of brackets).
see:
http://www.regular-expressions.info/python.html
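
A rough sketch of that last step, pulling the first element out of each tuple, might look like the following. It assumes Python 3, and the names pairs and full_matches are hypothetical, chosen only for this example.

```python
import re

s = "linka la baba"
re_s = re.compile(r'((link|l)a)', re.IGNORECASE)

# With two groups in the pattern, findall() returns one tuple per match:
# (outer group, inner group).
pairs = re_s.findall(s)

# Keep only the outer group, which spans the whole match.
full_matches = [outer for outer, inner in pairs]

print(pairs)         # [('linka', 'link'), ('la', 'l')]
print(full_matches)  # ['linka', 'la']
```

This gives the same final list as the non-capturing version, at the cost of one extra step, and it keeps the inner group available if it is needed later.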