Date: Thu, 22 Nov 2012 04:00:38 +0000
From: MRAB <python@mrabarnett.plus.com>
To: python-list@python.org
Subject: Re: Inconsistent behaviour of str.find/str.index when providing optional parameters
References: <9ecd357d-aaaa-4f4d-a987-a478e92b2052@googlegroups.com> <50ACD7FB.3060906@mrabarnett.plus.com> <k8k6sn$h47$1@ger.gmane.org>
In-Reply-To: <k8k6sn$h47$1@ger.gmane.org>
Reply-To: python-list@python.org
Newsgroups: comp.lang.python
Message-ID: <mailman.190.1353556838.29569.python-list@python.org>

On 2012-11-22 03:41, Terry Reedy wrote:
> On 11/21/2012 8:32 AM, MRAB wrote:
>> On 2012-11-21 12:43, Giacomo Alzetta wrote:
>>> I just came across this:
>
>   >>> 'spam'.find('')
> 0
>   >>> 'spam'.find('', 1)
> 1
>   >>> 'spam'.find('', 4)
> 4
>
>>>>>> 'spam'.find('', 5)
>>> -1
>>>
>>>
>>> Now, reading find's documentation:
>>>
>>>>>> print(str.find.__doc__)
>>> S.find(sub [,start [,end]]) -> int
>>>
>>> Return the lowest index in S where substring sub is found,
>>> such that sub is contained within S[start:end].  Optional
>>> arguments start and end are interpreted as in slice notation.
>
> This seems not to be true, as 'spam'[4:] == 'spam'[5:] == ''
>
It can't return 5 because 5 isn't an index in 'spam'.

It can't return 4 because 4 is below the start index.
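
To make that concrete, here is a small sketch. naive_find is a hypothetical helper (not anything in the stdlib) that follows the slice-based reading of the docstring, S[start:].find(sub) + start. It should show where that reading and the real str.find part ways:

```python
def naive_find(s, sub, start=0):
    """Hypothetical helper: the slice-based reading of the docstring.

    Only meant for non-negative start values.
    """
    pos = s[start:].find(sub)
    return -1 if pos == -1 else pos + start

s = 'spam'

# The real method: every position 0..len(s) matches the empty string,
# but a start beyond len(s) fails.
real = [s.find('', start) for start in range(len(s) + 2)]

# The slice-based reading: s[5:] is clamped to '', so the "match" lands on 5,
# which is not a position in 'spam'.
naive = [naive_find(s, '', start) for start in range(len(s) + 2)]

print(real)   # [0, 1, 2, 3, 4, -1]
print(naive)  # [0, 1, 2, 3, 4, 5]
```

The only disagreement is at start=5, where the slice is silently clamped but the search position is not.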

>>> Return -1 on failure.
>>>
>>> Now, the empty string is a substring of every string so how can find
>>> fail?
>>> find, from the doc, should generally be equivalent to
>>> S[start:end].find(substring) + start, except if the substring is not
>>> found but since the empty string is a substring of the empty string it
>>> should never fail.
>>>
>> [snip]
>> I think that returning -1 is correct (as far as returning -1 instead of
>> raising an exception like .index could be considered correct!) because
>> otherwise it would be returning a non-existent index. For the string
>> "spam", the range is 0..4.
>
> I tend to agree, but perhaps the doc should be changed. In edge cases
> like this, there sometimes is no 'right' answer. I suspect that the
> current behavior is intentional. You might find a discussion on the tracker.
>

It's a special case, but the Zen has something to say about that! :-)

(The empty string is also the only substring which can start at len(S).)
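
A quick way to check that claim for 'spam', as a rough sketch: build every substring (plus the empty string) and see which of them str.find reports as starting at len(S). The names candidates and starts_at_end are just for illustration.

```python
s = 'spam'

# The empty string plus every non-empty substring of s.
candidates = [''] + [s[i:j]
                     for i in range(len(s))
                     for j in range(i + 1, len(s) + 1)]

# Keep only those that find() locates exactly at position len(s).
starts_at_end = [sub for sub in candidates if s.find(sub, len(s)) == len(s)]

print(starts_at_end)  # ['']

# .index agrees with .find on the failing case, but raises instead of
# returning -1.
try:
    s.index('', len(s) + 1)
    index_raised = False
except ValueError:
    index_raised = True

print(index_raised)  # True
```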