| Path | csiph.com!fu-berlin.de!uni-berlin.de!not-for-mail |
|---|---|
| From | Peter Otten <__peter__@web.de> |
| Newsgroups | comp.lang.python |
| Subject | Re: Help for a complex RE |
| Date | Sun, 08 May 2016 18:15:25 +0200 |
| Organization | None |
| Lines | 55 |
| Message-ID | <mailman.520.1462724202.32212.python-list@python.org> (permalink) |
| References | <2aa55bd8-2ea4-41f7-b188-d45dff7d3bb7@googlegroups.com> <ngnomu$n3i$1@ger.gmane.org> |
Sergio Spina wrote:
> In the following ipython session:
>
>> Python 3.5.1+ (default, Feb 24 2016, 11:28:57)
>> Type "copyright", "credits" or "license" for more information.
>>
>> IPython 2.3.0 -- An enhanced Interactive Python.
>>
>> In [1]: import re
>>
>> In [2]: patt = r""" # the match pattern is:
>> ...: .+ # one or more characters
>> ...: [ ] # followed by a space
>> ...: (?=[@#D]:) # that is followed by one of the
>> ...: # chars "@#D" and a colon ":"
>> ...: """
>>
>> In [3]: pattern = re.compile(patt, re.VERBOSE)
>>
>> In [4]: m = pattern.match("Jun@i Bun#i @:Janji")
>>
>> In [5]: m.group()
>> Out[5]: 'Jun@i Bun#i '
>>
>> In [6]: m = pattern.match("Jun@i Bun#i @:Janji D:Banji")
>>
>> In [7]: m.group()
>> Out[7]: 'Jun@i Bun#i @:Janji '
>>
>> In [8]: m = pattern.match("Jun@i Bun#i @:Janji D:Banji #:Junji")
>>
>> In [9]: m.group()
>> Out[9]: 'Jun@i Bun#i @:Janji D:Banji '
>
> Why does the regex engine stop the search at the last piece of the string?
> Why not at the first match of the group "@:"?
> What regex pattern would give the following result?
>
>> In [1]: m = pattern.match("Jun@i Bun#i @:Janji D:Banji #:Junji")
>>
>> In [2]: m.group()
>> Out[2]: 'Jun@i Bun#i '
Compare:
>>> re.compile("a+").match("aaaa").group()
'aaaa'
>>> re.compile("a+?").match("aaaa").group()
'a'
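By default, pattern matching is "greedy": the ".+" part of your regex
matches as many characters as possible while still allowing the rest of the
pattern to match, so the match ends at the last space that is followed by
"@:", "#:" or "D:". Adding a ? as in ".+?" switches to non-greedy matching,
which stops at the first such space.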
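Applied to the pattern from the question, this could look like the sketch below. The names `greedy`, `lazy` and `text` are only illustrative; the sole change to the original pattern is the extra ? after ".+".

```python
import re

# Original pattern from the question: ".+" is greedy.
greedy = re.compile(r"""
    .+            # one or more characters, as many as possible
    [ ]           # followed by a space
    (?=[@#D]:)    # that is followed by one of "@#D" and a colon
    """, re.VERBOSE)

# Same pattern with a non-greedy quantifier: ".+?" takes as few
# characters as possible, so the match ends at the first space
# that is followed by "@:", "#:" or "D:".
lazy = re.compile(r"""
    .+?           # one or more characters, as few as possible
    [ ]           # followed by a space
    (?=[@#D]:)    # that is followed by one of "@#D" and a colon
    """, re.VERBOSE)

text = "Jun@i Bun#i @:Janji D:Banji #:Junji"

print(repr(greedy.match(text).group()))  # 'Jun@i Bun#i @:Janji D:Banji '
print(repr(lazy.match(text).group()))    # 'Jun@i Bun#i '
```

The non-greedy version gives 'Jun@i Bun#i ' for all three input strings in the session above, because the first space followed by "@:" is at the same position in each of them.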
Help for a complex RE Sergio Spina <sergio.am.spina@gmail.com> - 2016-05-08 08:18 -0700
Re: Help for a complex RE Peter Otten <__peter__@web.de> - 2016-05-08 18:15 +0200
Re: Help for a complex RE Sergio Spina <sergio.am.spina@gmail.com> - 2016-05-08 09:32 -0700
Re: Help for a complex RE Terry Reedy <tjreedy@udel.edu> - 2016-05-08 13:17 -0400
Re: Help for a complex RE Peter Otten <__peter__@web.de> - 2016-05-08 20:19 +0200