From: Nobody
Subject: Re: how to avoid leading white spaces
Date: Fri, 03 Jun 2011 14:18:40 +0100
Newsgroups: comp.lang.python

> Python might be penalized by its use of Unicode here, since a
> Boyer-Moore table for a full 16-bit Unicode string would need
> 65536 entries (one per possible ord() value). However, if the
> string being sought is all single-byte values, a 256-element
> table suffices; re.compile(), at least, could scan the pattern
> and choose an appropriate underlying search algorithm.

The table can be truncated or compressed at the cost of having to map
codepoints to table indices. Or use a hash table instead of an array.
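
To illustrate the hash-table option, here is a minimal sketch using the Horspool simplification of Boyer-Moore, with a dict as the shift table. The function names (build_shift_table, horspool_find) are made up for this example, and this is not how re or str.find are actually implemented; it only shows that the table needs one entry per distinct character in the pattern, regardless of how wide the codepoints are.

```python
def build_shift_table(pattern):
    """Map each pattern character (except the last) to its shift distance."""
    m = len(pattern)
    # Only characters that occur in the pattern get an entry, so the table
    # size depends on the pattern, not on the size of the character set.
    # Later occurrences overwrite earlier ones, giving the smallest shift.
    return {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}


def horspool_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    shift = build_shift_table(pattern)
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Look up the character aligned with the pattern's last position;
        # a character missing from the table skips the whole pattern length.
        pos += shift.get(text[pos + m - 1], m)
    return -1
```

For a pattern such as "wörld" the table holds four entries, where an array indexed by ord() would need at least 247 slots to reach 'ö'. The price is a hash lookup per shift instead of a plain array index.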