Date: Wed, 4 Jun 2014 12:16:29 +1000
Subject: Re: Unicode and Python - how often do you index strings?
From: Chris Angelico
Newsgroups: comp.lang.python

On Wed, Jun 4, 2014 at 11:11 AM, Tim Chase wrote:
> I then take row 2 and use it to make a mapping of header-name to a
> slice-object for slicing the subsequent strings:
>
>   slice(i.start(), i.end())
>
>   print("EmpID = %s" % row[header_map["EMPID"]].strip())
>   print("Name = %s" % row[header_map["NAME"]].strip())
>
> which I presume uses string indexing under the hood.

Yes, it's definitely going to be indexing. If strings were represented
internally in UTF-8, each of those calls would need to scan from the
beginning of the string, counting and discarding characters until it
finds the place to start, then counting and retaining characters until
it finds the place to stop. Definite example of what I'm looking for,
thanks!

ChrisA
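For anyone who wants to see the cost being described, here is a rough sketch, not anyone's actual code. The header row, the sample record, the regex used to find the columns, and the helper name utf8_slice are all made up for illustration; only the slice(i.start(), i.end()) idea comes from the quoted post. The helper imitates what slicing by character position would have to do if the string were held as UTF-8 bytes, and it assumes non-negative start and stop.

```python
import re


def utf8_slice(data: bytes, start: int, stop: int) -> str:
    """Slice UTF-8 bytes by code point position using a linear scan.

    Hypothetical helper: it counts and discards characters up to
    `start`, then counts and retains characters up to `stop`.
    Assumes 0 <= start and 0 <= stop.
    """
    byte_start = byte_stop = None
    count = 0  # code points seen so far
    for pos, byte in enumerate(data):
        # Bytes of the form 0b10xxxxxx are continuation bytes;
        # any other byte begins a new code point.
        if byte & 0xC0 != 0x80:
            if count == start:
                byte_start = pos
            if count == stop:
                byte_stop = pos
                break
            count += 1
    if byte_start is None:  # start is at or past the end
        byte_start = len(data)
    if byte_stop is None:   # stop is at or past the end
        byte_stop = len(data)
    return data[byte_start:byte_stop].decode("utf-8")


# Made-up fixed-width data: a 7-column EMPID field, a 10-column NAME field.
header = "EMPID".ljust(7) + "NAME".ljust(10)
row = "1001".ljust(7) + "Zoë Müller".ljust(10)

# Map each header name to a slice object, as in the quoted post.
# The pattern (a word plus its trailing padding) is an assumption.
header_map = {
    m.group().strip(): slice(m.start(), m.end())
    for m in re.finditer(r"\S+\s*", header)
}

# Ordinary str slicing by code point position.
emp_id = row[header_map["EMPID"]].strip()
name = row[header_map["NAME"]].strip()

# The same field, recovered by scanning a UTF-8 encoding of the row.
name_slice = header_map["NAME"]
name_scanned = utf8_slice(
    row.encode("utf-8"), name_slice.start, name_slice.stop
).strip()

print("EmpID = %s" % emp_id)
print("Name = %s" % name)
```

Every call to utf8_slice walks the bytes from position 0, so pulling several columns out of each row repeats that scan for every column, whereas slicing a str goes straight to the requested positions.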