Path: csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!news.mixmin.net!feeds.phibee-telecom.net!newsfeed.xs4all.nl!newsfeed6.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail
MIME-Version: 1.0
In-Reply-To: <7xtxvzehhb.fsf@ruckus.brouhaha.com>
References: <f801e06f-f7b2-4aca-b352-66856a939746@googlegroups.com> <308df2af-abe7-4043-b199-0a39f440e0ab@googlegroups.com> <502f8a2a$0$29978$c3e8da3$5496439d@news.astraweb.com> <7xehn4vyya.fsf@ruckus.brouhaha.com> <mailman.3477.1345337181.4697.python-list@python.org> <7xfw7j3a1x.fsf@ruckus.brouhaha.com> <mailman.3479.1345342743.4697.python-list@python.org> <7xtxvzehhb.fsf@ruckus.brouhaha.com>
Date: Sun, 19 Aug 2012 13:01:46 +1000
Subject: Re: How do I display unicode value stored in a string variable using ord()
From: Chris Angelico <rosuav@gmail.com>
To: python-list@python.org
Content-Type: text/plain; charset=ISO-8859-1
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.3481.1345345309.4697.python-list@python.org>
Lines: 24
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:27342

On Sun, Aug 19, 2012 at 12:35 PM, Paul Rubin <no.email@nospam.invalid> wrote:
> Chris Angelico <rosuav@gmail.com> writes:
>>>>> "asdfqwer"[4:]
>> 'qwer'
>>
>> That's a not uncommon operation when parsing strings or manipulating
>> data. You'd need to completely rework your algorithms to maintain a
>> position somewhere.
>
> Scanning 4 characters (or a few dozen, say) to peel off a token in
> parsing a UTF-8 string is no big deal.  It gets more expensive if you
> want to index far more deeply into the string.  I'm asking how often
> that is done in real code.  Obviously one can concoct hypothetical
> examples that would suffer.

Sure, four characters isn't a big deal to step through. But it still
makes indexing and slicing operations O(N) instead of O(1), plus you'd
have to zark the whole string up to where you want to work. It'd be
workable, but you'd have to redo your algorithms significantly; I
don't have a Python example of parsing a huge string, but I've done it
in other languages, and when I can depend on indexing being a cheap
operation, I'll happily do exactly that.

ChrisA