Date: Wed, 4 Jun 2014 10:39:54 +1000
Subject: Unicode and Python - how often do you index strings?
From: Chris Angelico
To: "python-list@python.org"
Newsgroups: comp.lang.python

A current discussion regarding Python's Unicode support centres (or centers, depending on how close you are to the cent[er]{2} of the universe) around one critical question: Is string indexing common?

Python strings can be indexed with integers to produce characters (strings of length 1). They can also be iterated over from beginning to end. Lots of operations can be built on either one of those two primitives; the question is, how much can NOT be implemented efficiently over iteration, and MUST use indexing? Theories are great, but solid use-cases are better - ideally, examples from actual production code (actual code optional).

I know the collective experience of python-list can't fail to bring up a few solid examples here :)

Thanks in advance, all!!

ChrisA
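
PS. To make the two primitives concrete, here's a minimal sketch. The function names and the sample string are made up purely for illustration; this is not production code, just one operation that gets by on iteration alone and one that reaches for an integer index.

```python
def find_by_iteration(text, target):
    """Return the position of the first `target` character, using only iteration.

    Hypothetical helper: it never indexes `text`, it just walks it once.
    """
    for position, char in enumerate(text):
        if char == target:
            return position
    return -1


def middle_char_by_index(text):
    """Return the character at the midpoint, using an integer index.

    Hypothetical helper: assumes `text` is non-empty.
    """
    return text[len(text) // 2]


sample = "centre"
first_char = sample[0]   # indexing produces a string of length 1
all_chars = list(sample) # iterating from beginning to end
```

The first function would work on anything you can loop over; the second is the sort of thing I'm asking about, where the position is computed rather than discovered by walking the string.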