Date: Fri, 03 Jun 2011 23:38:50 +0100
From: MRAB <python@mrabarnett.plus.com>
To: python-list@python.org
Subject: Re: how to avoid leading white spaces
References: <BANLkTikjY3U9Y24s-GOEyi8CNqCFLXuG6g@mail.gmail.com>	<4de8eef1$0$29996$c3e8da3$5496439d@news.astraweb.com>	<1237a287-10b0-4a2d-ba35-97b5238deda1@n11g2000yqf.googlegroups.com>	<94svm4Fe7eU1@mid.individual.net>	<isbkl301v7v@news2.newsguy.com> <4DE95C0C.6050900@stoneleaf.us>
In-Reply-To: <4DE95C0C.6050900@stoneleaf.us>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Precedence: list
Reply-To: python-list@python.org
Newsgroups: comp.lang.python
Message-ID: <mailman.2442.1307140744.9059.python-list@python.org>

On 03/06/2011 23:11, Ethan Furman wrote:
> Chris Torek wrote:
>>> On 2011-06-03, rurpy@yahoo.com <rurpy@yahoo.com> wrote:
>> [prefers]
>>>> re.split ('[ ,]', source)
>>
>> This is probably not what you want in dealing with
>> human-created text:
>>
>> >>> re.split('[ ,]', 'foo bar, spam,maps')
>> ['foo', '', 'bar', '', 'spam', 'maps']
>
> I think you've got a typo in there... this is what I get:
>
> --> re.split('[ ,]', 'foo bar, spam,maps')
> ['foo', 'bar', '', 'spam', 'maps']
>
> I would add a * to get rid of that empty element, myself:
> --> re.split('[ ,]*', 'foo bar, spam,maps')
> ['foo', 'bar', 'spam', 'maps']
>
It's better to use + instead of *, because * can also match the empty
string and you don't want a zero-width separator. The fact that the *
version works should be treated as an idiosyncrasy of the current re
module, which can't split on a zero-width match.
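
To make the point concrete, here is a minimal sketch of the + version.
The helper name split_fields is made up for illustration and is not from
the thread. As far as I know, Python 3.7 and later do split on empty
matches, so there the * pattern would no longer give the tidy result
quoted above, while the + pattern behaves the same either way.

```python
import re

def split_fields(text):
    """Split on runs of spaces and/or commas, dropping empty fields.

    Hypothetical helper for illustration only.
    """
    # '+' requires at least one separator character, so the pattern
    # can never match the empty string.
    parts = re.split(r'[ ,]+', text)
    # Leading or trailing separators still leave '' at the ends of the
    # list, so filter those out.
    return [p for p in parts if p]

fields = split_fields('foo bar, spam,maps')
print(fields)
```

The final filter also takes care of leading white space in the input,
which would otherwise show up as an empty first element.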