| From | Peter Otten <__peter__@web.de> |
|---|---|
| Subject | Re: unexpected regexp behaviour using 'A|B|C.....' |
| Date | 2011-07-28 12:57 +0200 |
| Organization | None |
| References | <6c92b791-58d2-4ea5-8997-48ef21ce69f8@z7g2000vbp.googlegroups.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.1566.1311850649.1164.python-list@python.org> |

AlienBaby wrote:

> When using re patterns of the form 'A|B|C|...' the docs seem to
> suggest that once any of A,B,C.. match, it is captured and no further
> patterns are tried. But I am seeing,
>
> st=' Id Name Prov Type CopyOf BsId
> Rd -Detailed_State- Adm Snp Usr VSize'
>
> p='Type *'
> re.search(p,st).group()
> 'Type '
>
> p='Type *|  *Type'
> re.search(p,st).group()
> ' Type'
>
>
> Shouldn’t the second search return the same as the first, if further
> patterns are not tried?
>
> The documentation appears to suggest the first match should be
> returned, or am I misunderstanding?

All alternatives are tried at a given starting position in the string before the algorithm advances to the next position. The second alternative "  *Type", at least one space followed by the character sequence "Type", matches right after "Prov" in your example. The search stops at that position, so the first alternative, "Type" and any following spaces, which could only match one character later, after "Prov ", is never reached.

Maybe you accidentally typed one extra " "? If you didn't, " +Type" would be clearer.
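
The behaviour described above can be reproduced with a short sketch. The sample line below assumes single spaces between the column names (the spacing of the original string is not known), and the two-space form of the second alternative follows the reading "at least one space followed by Type". The `a|ab` checks at the end are an added illustration, not part of the original exchange: the order of alternatives only decides the outcome when several of them match at the same starting position.

```python
import re

# Assumed sample line: single spaces between columns (original spacing unknown).
st = " Id Name Prov Type CopyOf BsId Rd -Detailed_State- Adm Snp Usr VSize"

# Single pattern: the earliest position where it matches is the "T" of "Type".
first = re.search(r"Type *", st)

# Alternation: every alternative is tried at each position, left to right.
# The space before "Type" comes one position earlier than the "T", and the
# second alternative matches there, so the search stops at that position.
both = re.search(r"Type *|  *Type", st)

# Same meaning as the two-space form, written with "+" for clarity.
plus = re.search(r"Type *| +Type", st)

print(repr(first.group()), first.start())
print(repr(both.group()), both.start())
print(repr(plus.group()), plus.start())

# Order of alternatives matters only at one and the same starting position:
# both alternatives match at the "a", and the leftmost one listed wins.
short_first = re.search(r"a|ab", "xab").group()
long_first = re.search(r"ab|a", "xab").group()
print(short_first, long_first)
```

If the leading spaces should be optional but "Type" should still be found wherever it starts, a single pattern such as ` *Type *` avoids the alternation altogether; whether that fits depends on what the match is used for.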
unexpected regexp behaviour using 'A|B|C.....' - AlienBaby <matt.j.warren@gmail.com> - 2011-07-28 02:56 -0700
Re: unexpected regexp behaviour using 'A|B|C.....' - Peter Otten <__peter__@web.de> - 2011-07-28 12:57 +0200