From: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Newsgroups: comp.lang.python
Subject: Re: how to avoid leading white spaces
Date: Sat, 04 Jun 2011 13:41:33 +1200
Lines: 14
Message-ID: <94tgqfF4tiU1@mid.individual.net>
References: <BANLkTikjY3U9Y24s-GOEyi8CNqCFLXuG6g@mail.gmail.com> <9e861b0e-e768-401b-b5ca-190f20830a08@s9g2000yqm.googlegroups.com> <94ph22FrhvU5@mid.individual.net> <roy-E2FA6F.21571602062011@news.panix.com> <is9ikg083h@news1.newsguy.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <is9ikg083h@news1.newsguy.com>

Chris Torek wrote:
> Python might be penalized by its use of Unicode here, since a
> Boyer-Moore table for a full 16-bit Unicode string would need
> 65536 entries

But is there any need for the Boyer-Moore algorithm to
operate on characters?

Seems to me you could just as well chop the UTF-16 up
into bytes and apply Boyer-Moore to them, and it would
work about as well.
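
A rough sketch of the byte-wise idea, in case it helps. It uses the
simpler Horspool variant of Boyer-Moore rather than the full algorithm,
and the names horspool_find_all and find_utf16 are made up for
illustration. The skip table has 256 entries, one per byte value,
instead of 65536. One wrinkle a byte-level search has to handle: a match
can begin in the middle of a 16-bit code unit, so hits at odd byte
offsets are discarded.

```python
def horspool_find_all(haystack: bytes, needle: bytes):
    """Yield every byte offset at which needle occurs in haystack."""
    m = len(needle)
    n = len(haystack)
    if m == 0 or m > n:
        return
    # Bad-character table: 256 entries, one per possible byte value.
    shift = [m] * 256
    for i in range(m - 1):
        shift[needle[i]] = m - 1 - i
    pos = 0
    while pos <= n - m:
        if haystack[pos:pos + m] == needle:
            yield pos
        # Skip ahead based on the last byte of the current window.
        pos += shift[haystack[pos + m - 1]]


def find_utf16(text: str, pattern: str) -> int:
    """Return the index of pattern in text, in UTF-16 code units, or -1.

    Hypothetical helper: both strings are encoded as UTF-16-LE and
    searched as plain bytes.
    """
    if not pattern:
        return 0
    hay = text.encode("utf-16-le")
    pat = pattern.encode("utf-16-le")
    for offset in horspool_find_all(hay, pat):
        # A hit at an odd offset straddles two code units: a false match.
        if offset % 2 == 0:
            return offset // 2
    return -1
```

For text within the Basic Multilingual Plane the result should agree
with str.find; for characters outside it the index counts UTF-16 code
units, not code points.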

-- 
Greg