Path: csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!feeder.erje.net!eu.feeder.erje.net!ecngs!feeder2.ecngs.de!novso.com!newsfeed.xs4all.nl!newsfeed1.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
MIME-Version: 1.0
In-Reply-To: <CAPTjJmoJ1FPzP8JyAufNx1B6jZgrCzfEmbU96K_JukVa_v5TiQ@mail.gmail.com>
References: <mailman.2549.1384371222.18130.python-list@python.org> <beqs6jF6ojmU1@mid.individual.net> <1f0ffad0-f9b1-4154-b048-510d8e38846e@googlegroups.com> <betrckFpdk9U1@mid.individual.net> <mailman.2823.1384757801.18130.python-list@python.org> <41f332dd-1c31-4699-9176-7e8589f9c8ae@googlegroups.com> <CAPTjJmoJ1FPzP8JyAufNx1B6jZgrCzfEmbU96K_JukVa_v5TiQ@mail.gmail.com>
Date: Mon, 18 Nov 2013 05:29:09 -0700
Subject: Re: Oh look, another language (ceylon)
From: Ian Kelly <ian.g.kelly@gmail.com>
To: Python <python-list@python.org>
Content-Type: multipart/alternative; boundary=047d7ba972c2df743f04eb72b10d
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.2838.1384777759.18130.python-list@python.org>
Lines: 48
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:59857

--047d7ba972c2df743f04eb72b10d
Content-Type: text/plain; charset=ISO-8859-1

On Nov 18, 2013 3:06 AM, "Chris Angelico" <rosuav@gmail.com> wrote:
>
> I'm trying to figure this out. Reading the docs hasn't answered this.
> If each character in a string is a 32-bit Unicode character, and (as
> can be seen in the examples) string indexing and slicing are
> supported, then does string indexing mean counting from the beginning
> to see if there were any surrogate pairs?

The string reference says:

"""Since a String has an underlying UTF-16 encoding, certain operations are
expensive, requiring iteration of the characters of the string. In
particular, size requires iteration of the whole string, and get(), span(),
and segment() require iteration from the beginning of the string to the
given index."""

The get and span operations appear to be equivalent to indexing and slicing,
so yes: indexing a string means counting from the beginning of the string.
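
A rough sketch in Python of why a UTF-16-backed string would need that
scan. This is only an illustration under the assumption that the string is
stored as a flat array of 16-bit code units; utf16_units() and get() are
hypothetical helper names, not Ceylon's actual implementation.

```python
def utf16_units(s):
    """Encode a str as a list of 16-bit UTF-16 code units (hypothetical helper)."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def get(units, index):
    """Return the character at code point position `index`.

    A character occupies one or two code units, so the offset of the
    Nth character is unknown until every earlier unit has been checked.
    """
    pos = 0  # characters seen so far
    i = 0    # current offset into the code unit array
    while i < len(units):
        unit = units[i]
        is_pair = (
            0xD800 <= unit <= 0xDBFF            # high surrogate
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF  # low surrogate
        )
        if is_pair:
            code_point = 0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            width = 2
        else:
            code_point = unit
            width = 1
        if pos == index:
            return chr(code_point)
        pos += 1
        i += width
    raise IndexError(index)


units = utf16_units("a\U0001F600b")  # 3 characters stored in 4 code units
third = get(units, 2)                # has to step over the surrogate pair first
```

The cost of get() grows with the index, and computing the size means
walking the whole array, which matches what the reference describes.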

--047d7ba972c2df743f04eb72b10d--