From: Chris Angelico
To: python-list@python.org
Newsgroups: comp.lang.python
Subject: Re: How do I display unicode value stored in a string variable using ord()
Date: Sun, 19 Aug 2012 13:01:46 +1000

On Sun, Aug 19, 2012 at 12:35 PM, Paul Rubin wrote:
> Chris Angelico writes:
>> >>> "asdfqwer"[4:]
>> 'qwer'
>>
>> That's a not uncommon operation when parsing strings or manipulating
>> data. You'd need to completely rework your algorithms to maintain a
>> position somewhere.
>
> Scanning 4 characters (or a few dozen, say) to peel off a token in
> parsing a UTF-8 string is no big deal. It gets more expensive if you
> want to index far more deeply into the string. I'm asking how often
> that is done in real code. Obviously one can concoct hypothetical
> examples that would suffer.

Sure, four characters isn't a big deal to step through.
But it still makes indexing and slicing operations O(N) instead of O(1),
plus you'd have to scan the whole string up to where you want to work.
It'd be workable, but you'd have to redo your algorithms significantly.
I don't have a Python example of parsing a huge string, but I've done
it in other languages, and when I can depend on indexing being a cheap
operation, I'll happily do exactly that.

ChrisA
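
To make the O(N) point concrete, here is a minimal sketch of what "index by character" costs when all you hold is UTF-8 bytes. The helper names (utf8_offset, utf8_slice_from) are made up for illustration and are not from the thread or from any library; the only assumption is the standard UTF-8 rule that continuation bytes have the form 10xxxxxx.

```python
def utf8_offset(data: bytes, index: int) -> int:
    """Return the byte offset of code point number `index` in UTF-8 `data`.

    Hypothetical helper: it has to walk every byte before the target,
    which is what makes indexing O(N) on a variable-width encoding.
    """
    if index < 0:
        raise IndexError("negative indexes are not handled in this sketch")
    seen = 0
    for offset, byte in enumerate(data):
        # Continuation bytes look like 0b10xxxxxx; any other byte starts a code point.
        if byte & 0xC0 != 0x80:
            if seen == index:
                return offset
            seen += 1
    if seen == index:
        return len(data)  # one past the end, so slicing from the very end works
    raise IndexError("code point index out of range")


def utf8_slice_from(data: bytes, start: int) -> str:
    """Roughly text[start:], computed from the UTF-8 bytes alone."""
    return data[utf8_offset(data, start):].decode("utf-8")


encoded = "asdfqwér".encode("utf-8")   # 8 code points, 9 bytes
tail = utf8_slice_from(encoded, 4)     # same result as "asdfqwér"[4:]
```

Unlike a real str slice, this sketch raises IndexError for an out-of-range start instead of returning an empty string. Finding offset 4 is cheap, as Paul says, but the loop cost grows with the index, so a parser that repeatedly indexes deep into a large buffer would have to carry byte positions around instead.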