Re: Unicode and Python - how often do you index strings?

Started by: Chris Angelico <rosuav@gmail.com>
First post: 2014-06-04 20:44 +1000
Last post: 2014-06-04 20:44 +1000
Articles: 1 (1 participant)


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


#72623 — Re: Unicode and Python - how often do you index strings?

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-06-04 20:44 +1000
Subject: Re: Unicode and Python - how often do you index strings?
Message-ID: <mailman.10696.1401878658.18130.python-list@python.org>

On Wed, Jun 4, 2014 at 8:10 PM, Peter Otten <__peter__@web.de> wrote:
> The indices used for slicing typically don't come out of nowhere. A simple
> example would be
>
> def strip_prefix(text, prefix):
>     if text.startswith(prefix):
>         text = text[len(prefix):]
>     return text
>
> If both prefix and text use UTF-8 internally the byte offset is already
> known. The question is then how we can preserve that information.

Almost completely useless. First off, it solves only the problem of
operating on the string at exactly the point where you just got an
index; and secondly, you don't always get that index from a string
method. Suppose, for instance, that you iterate over a string thus:

for i, ch in enumerate(string):
    if ch == '{': start = i
    elif ch == '}': return string[start:i+1]  # +1 so the slice includes the '}'

Okay, so this could be done by searching, but for something more
complicated, I can imagine it being better to enumerate. (But "I can
imagine" is much weaker than "Here's code that we use in production",
which is why I asked the question.)
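
To make the comparison concrete, here is a rough sketch of both approaches for this simple case. The function names braced_by_enumerate and braced_by_search are made up for illustration, and returning None when no span is found is an assumption of the sketch, not something the snippet above specifies.

```python
def braced_by_enumerate(string):
    """Scan character by character, remembering the most recent '{'."""
    start = None
    for i, ch in enumerate(string):
        if ch == '{':
            start = i
        elif ch == '}' and start is not None:
            return string[start:i + 1]  # +1 makes the slice include '}'
    return None  # assumption: no complete {...} span means None


def braced_by_search(string):
    """Same result using str.find / str.rfind instead of enumerate."""
    first_open = string.find('{')
    if first_open < 0:
        return None
    # First '}' that has at least one '{' somewhere before it.
    end = string.find('}', first_open + 1)
    if end < 0:
        return None
    # Most recent '{' before that '}'.
    start = string.rfind('{', 0, end)
    return string[start:end + 1]
```

Both versions still slice with plain integer indices (start, end + 1), so either way the string type has to support indexing by character position.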

Incidentally, the above code highlights the first problem too. With
direct indexing, you can ask for inclusive or exclusive slicing by
adding or subtracting one from the index. If you do that with a
byte-position-retaining special integer, you lose the byte position.
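
A minimal sketch of that last point, assuming the "special integer" is done as an int subclass. ByteIndex and find_with_offset are hypothetical names invented for this illustration; nothing like them exists in the standard library.

```python
class ByteIndex(int):
    """Hypothetical: a character index that also remembers a UTF-8 byte offset."""

    def __new__(cls, char_index, byte_offset):
        self = super().__new__(cls, char_index)
        self.byte_offset = byte_offset
        return self


def find_with_offset(text, sub):
    """Hypothetical helper: like str.find, but returns a ByteIndex on success."""
    i = text.find(sub)
    if i < 0:
        return i
    # Byte offset = length of the UTF-8 encoding of everything before index i.
    return ByteIndex(i, len(text[:i].encode('utf-8')))


text = 'na\u00efve {x}'            # the i-with-diaeresis takes two bytes in UTF-8
start = find_with_offset(text, '{')
print(start, start.byte_offset)    # 6 7

after = start + 1                  # ordinary int arithmetic
print(type(after).__name__, hasattr(after, 'byte_offset'))  # int False
```

Adding one goes through int.__add__, which returns a plain int, so the remembered byte offset is gone. Keeping it would mean overriding the arithmetic, and then the new byte offset depends on how wide the next character is, which the index alone doesn't know.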

ChrisA
