Newsgroups: comp.lang.python
Date: Sun, 6 Jul 2014 09:24:18 -0700 (PDT)
Subject: Re: Question about metacharacter '*'
From: Rick Johnson <rantingrickjohnson@gmail.com>

On Sunday, July 6, 2014 10:50:13 AM UTC-5, Devin Jeanpierre wrote:
> In related news, the regexp I gave for numbers will match "1a".

Well of course it matched, because your pattern defines "one
or more consecutive digits". So it will match the "1" of
"1a" and the "11" of "11a" likewise.
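To make the point concrete, here is a minimal sketch (not from the
original thread) of why the unanchored pattern accepts "1a", and one
way to reject it. It assumes "a number" means the whole string must
be digits; the names `digits` and `whole` are made up for
illustration.

```python
import re

# The unanchored pattern from the quoted post: one or more digits.
digits = re.compile(r'[0-9][0-9]*')

# search() is satisfied by any run of digits, so "1a" still matches.
partial = digits.search('1a')
print(partial.group())            # 1

# Assumption: "is a number" means the entire string is digits.
# match() anchors at the start, and \Z anchors at the very end.
whole = re.compile(r'[0-9][0-9]*\Z')
print(whole.match('1a'))          # None
print(whole.match('11').group())  # 11
```

Whether anchoring is the right fix depends on what the pattern is
meant to extract; if the goal is only to pull out the leading
digits, the unanchored form is already doing its job.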

As an aside, I prefer to use a "character set" only when
nothing else will suffice. In this case r"[0-9][0-9]*"
can be expressed just as correctly (and less noisily, IMHO) as
r"\d\d*".
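One caveat worth a sketch: the two spellings are interchangeable for
plain byte-string patterns in Python 2, but under Python 3 (which
this example assumes) `\d` on a str pattern also matches non-ASCII
Unicode digits unless re.ASCII is passed. The variable names below
are invented for illustration.

```python
import re

# U+0663 is ARABIC-INDIC DIGIT THREE, a Unicode decimal digit.
arabic_three = '\u0663'

# The explicit set only covers ASCII 0-9.
set_hit = re.search(r'[0-9]', arabic_three)

# On Python 3 str patterns, \d covers any Unicode decimal digit.
shorthand_hit = re.search(r'\d', arabic_three)

# re.ASCII restricts \d back to [0-9].
ascii_hit = re.search(r'\d', arabic_three, re.ASCII)

print(set_hit is None)            # True
print(shorthand_hit is not None)  # True
print(ascii_hit is None)          # True
```

For input that is known to be ASCII, the difference never shows up.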

============================================================
 INTERACTIVE SESSION: Python 2.x
============================================================
# Note: Grouping used for explicitness.

#
# Using character sets:
>>> import re
>>> re.search(r'([0-9][0-9]*)', '1a').groups()
('1',)
>>> re.search(r'([0-9][0-9]*)', '11a').groups()
('11',)
>>> re.search(r'([0-9][0-9]*)', '111aaa222').groups()
('111',)

#
# Same result without character sets:
>>> re.search(r'(\d\d*)', '1a').groups()
('1',)
>>> re.search(r'(\d\d*)', '11a').groups()
('11',)
>>> re.search(r'(\d\d*)', '111aaa222').groups()
('111',)
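
Since the thread is about '*', a short sketch (not part of the
session above) showing that "one thing followed by zero or more of
the same thing" can also be spelled with '+'. The list names are
made up for illustration.

```python
import re

samples = ['1a', '11a', '111aaa222']

# "a digit, then zero or more digits" ...
star_form = [re.search(r'\d\d*', s).group() for s in samples]

# ... finds the same text as "one or more digits".
plus_form = [re.search(r'\d+', s).group() for s in samples]

print(star_form)               # ['1', '11', '111']
print(plus_form == star_form)  # True
```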