Re: Unicode and Python - how often do you index strings?

Date Wed, 4 Jun 2014 12:16:29 +1000
Subject Re: Unicode and Python - how often do you index strings?
From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.10665.1401848197.18130.python-list@python.org>

On Wed, Jun 4, 2014 at 11:11 AM, Tim Chase
<python.list@tim.thechases.com> wrote:
> I then take row 2 and use it to make a mapping of header-name to a
> slice-object for slicing the subsequent strings:
>
>       slice(i.start(), i.end())
>
>     print("EmpID = %s" % row[header_map["EMPID"]].strip())
>     print("Name = %s" % row[header_map["NAME"]].strip())
>
> which I presume uses string indexing under the hood.

Yes, that's definitely going to be indexing. If strings were represented
internally in UTF-8, each of those slices would need to scan from the
beginning of the string, counting and discarding characters until it
finds the place to start, then counting and retaining characters until
it finds the place to stop. That's a definite example of what I'm
looking for, thanks!
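
To make the cost concrete, here is a rough sketch of what slicing by
character position might look like if the string were held as UTF-8
bytes. The helper utf8_slice, the sample header and row, and the regex
used to build header_map are all made up for illustration; they are not
taken from Tim's code or from any real implementation.

```python
import re

def utf8_slice(data: bytes, start: int, stop: int) -> bytes:
    """Slice UTF-8 bytes by code point positions using a linear scan."""
    count = 0              # code points seen so far
    byte_start = None      # byte offset where code point `start` begins
    byte_stop = len(data)  # byte offset where code point `stop` begins
    for pos, byte in enumerate(data):
        # Bytes of the form 0b10xxxxxx are continuation bytes; any other
        # byte begins a new code point.
        if byte & 0xC0 != 0x80:
            if count == start:
                byte_start = pos
            if count == stop:
                byte_stop = pos
                break
            count += 1
    if byte_start is None:
        return b""
    return data[byte_start:byte_stop]

# Hypothetical fixed-width data, loosely modelled on the quoted example.
header = "EMPID NAME      "
row = "0042  Zoë Müller"

# Map each header name to a slice covering its column (assumed layout).
header_map = {
    m.group().strip(): slice(m.start(), m.end())
    for m in re.finditer(r"\S+\s*", header)
}

# With an indexable str, the slice is a direct lookup.
name = row[header_map["NAME"]].strip()

# With UTF-8 bytes, the same slice has to walk from the beginning.
col = header_map["NAME"]
name_from_bytes = utf8_slice(row.encode("utf-8"), col.start, col.stop)
print(name, name_from_bytes.decode("utf-8").strip())
```

The scan has to start from byte 0 every time because the non-ASCII
characters take two bytes each, so the byte offset of a column cannot be
computed from its character position alone.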

ChrisA
