Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #6907
| From | Roy Smith <roy@panix.com> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: how to avoid leading white spaces |
| Date | 2011-06-02 23:44 -0400 |
| Organization | PANIX Public Access Internet and UNIX, NYC |
| Message-ID | <roy-751FAC.23443902062011@news.panix.com> (permalink) |
| References | <BANLkTikjY3U9Y24s-GOEyi8CNqCFLXuG6g@mail.gmail.com> <9e861b0e-e768-401b-b5ca-190f20830a08@s9g2000yqm.googlegroups.com> <94ph22FrhvU5@mid.individual.net> <roy-E2FA6F.21571602062011@news.panix.com> <is9ikg083h@news1.newsguy.com> |
In article <is9ikg083h@news1.newsguy.com>, Chris Torek <nospam@torek.net> wrote: > Python might be penalized by its use of Unicode here, since a > Boyer-Moore table for a full 16-bit Unicode string would need > 65536 entries (one per possible ord() value). I'm not sure what you mean by "full 16-bit Unicode string"? Isn't unicode inherently 32 bit? Or at least 20-something bit? Things like UTF-16 are just one way to encode it. In any case, while I could imagine building a 2^16 entry jump table, clearly it's infeasible (with today's hardware) to build a 2^32 entry table. But, there's nothing that really requires you to build a table at all. If I understand the algorithm right, all that's really required is that you can map a character to a shift value. For an 8 bit character set, an indexed jump table makes sense. For a larger character set, I would imagine you would do some heuristic pre-processing to see if your search string consisted only of characters in one unicode plane and use that fact to build a table which only indexes that plane. Or, maybe use a hash table instead of a regular indexed table. Not as fast, but only slower by a small constant factor, which is not a horrendous price to pay in a fully i18n world :-)
Back to comp.lang.python | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
Re: how to avoid leading white spaces Chris Rebert <clp2@rebertia.com> - 2011-06-01 10:11 -0700
Re: how to avoid leading white spaces "rurpy@yahoo.com" <rurpy@yahoo.com> - 2011-06-01 12:39 -0700
Re: how to avoid leading white spaces Karim <karim.liateni@free.fr> - 2011-06-01 22:34 +0200
Re: how to avoid leading white spaces Neil Cerutti <neilc@norwich.edu> - 2011-06-02 13:21 +0000
Re: how to avoid leading white spaces Roy Smith <roy@panix.com> - 2011-06-02 21:57 -0400
Re: how to avoid leading white spaces MRAB <python@mrabarnett.plus.com> - 2011-06-03 03:41 +0100
Re: how to avoid leading white spaces Chris Torek <nospam@torek.net> - 2011-06-03 02:58 +0000
Re: how to avoid leading white spaces Roy Smith <roy@panix.com> - 2011-06-02 23:44 -0400
Re: how to avoid leading white spaces Chris Angelico <rosuav@gmail.com> - 2011-06-03 13:52 +1000
Re: how to avoid leading white spaces Chris Angelico <rosuav@gmail.com> - 2011-06-03 13:54 +1000
Re: how to avoid leading white spaces Chris Torek <nospam@torek.net> - 2011-06-03 04:30 +0000
Re: how to avoid leading white spaces Nobody <nobody@nowhere.com> - 2011-06-03 14:11 +0100
Re: how to avoid leading white spaces Nobody <nobody@nowhere.com> - 2011-06-03 14:18 +0100
Re: how to avoid leading white spaces Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2011-06-04 13:41 +1200
Re: how to avoid leading white spaces Nobody <nobody@nowhere.com> - 2011-06-04 20:44 +0100
Re: how to avoid leading white spaces Ian <hobson42@gmail.com> - 2011-06-06 22:04 +0100
Re: how to avoid leading white spaces Chris Torek <nospam@torek.net> - 2011-06-09 02:32 +0000
Re: how to avoid leading white spaces Thorsten Kampe <thorsten@thorstenkampe.de> - 2011-06-03 10:32 +0200
Re: how to avoid leading white spaces "rurpy@yahoo.com" <rurpy@yahoo.com> - 2011-06-03 05:51 -0700
Re: how to avoid leading white spaces Neil Cerutti <neilc@norwich.edu> - 2011-06-03 13:17 +0000
Re: how to avoid leading white spaces "rurpy@yahoo.com" <rurpy@yahoo.com> - 2011-06-03 08:14 -0700
Re: how to avoid leading white spaces Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-03 14:25 +0000
Re: how to avoid leading white spaces "D'Arcy J.M. Cain" <darcy@druid.net> - 2011-06-03 10:58 -0400
Re: how to avoid leading white spaces "rurpy@yahoo.com" <rurpy@yahoo.com> - 2011-06-03 12:29 -0700
Re: how to avoid leading white spaces Neil Cerutti <neilc@norwich.edu> - 2011-06-03 20:49 +0000
Re: how to avoid leading white spaces Chris Torek <nospam@torek.net> - 2011-06-03 21:45 +0000
Re: how to avoid leading white spaces Ethan Furman <ethan@stoneleaf.us> - 2011-06-03 15:11 -0700
Re: how to avoid leading white spaces MRAB <python@mrabarnett.plus.com> - 2011-06-03 23:38 +0100
Re: how to avoid leading white spaces "rurpy@yahoo.com" <rurpy@yahoo.com> - 2011-06-05 22:47 -0700
Re: how to avoid leading white spaces "rurpy@yahoo.com" <rurpy@yahoo.com> - 2011-06-05 22:44 -0700
Re: how to avoid leading white spaces Neil Cerutti <neilc@norwich.edu> - 2011-06-06 16:08 +0000
Re: how to avoid leading white spaces Ian Kelly <ian.g.kelly@gmail.com> - 2011-06-06 10:29 -0600
Re: how to avoid leading white spaces Neil Cerutti <neilc@norwich.edu> - 2011-06-06 17:17 +0000
Re: how to avoid leading white spaces Ian Kelly <ian.g.kelly@gmail.com> - 2011-06-06 11:40 -0600
Re: how to avoid leading white spaces Neil Cerutti <neilc@norwich.edu> - 2011-06-06 17:56 +0000
Re: how to avoid leading white spaces Ethan Furman <ethan@stoneleaf.us> - 2011-06-06 10:48 -0700
Re: how to avoid leading white spaces Ian Kelly <ian.g.kelly@gmail.com> - 2011-06-06 11:42 -0600
Re: how to avoid leading white spaces Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-04 02:05 +0000
Re: how to avoid leading white spaces MRAB <python@mrabarnett.plus.com> - 2011-06-04 03:24 +0100
Re: how to avoid leading white spaces Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-04 04:59 +0000
Re: how to avoid leading white spaces Roy Smith <roy@panix.com> - 2011-06-03 22:30 -0400
Re: how to avoid leading white spaces Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-04 05:14 +0000
Re: how to avoid leading white spaces Roy Smith <roy@panix.com> - 2011-06-04 09:39 -0400
Re: how to avoid leading white spaces Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-05 00:44 +0000
Re: how to avoid leading white spaces rusi <rustompmody@gmail.com> - 2011-06-04 09:36 -0700
Re: how to avoid leading white spaces Nobody <nobody@nowhere.com> - 2011-06-04 21:02 +0100
Re: how to avoid leading white spaces Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-05 01:01 +0000
Re: how to avoid leading white spaces Chris Angelico <rosuav@gmail.com> - 2011-06-04 16:04 +1000
Re: how to avoid leading white spaces "rurpy@yahoo.com" <rurpy@yahoo.com> - 2011-06-05 23:03 -0700
Re: how to avoid leading white spaces Chris Torek <nospam@torek.net> - 2011-06-06 07:11 +0000
Re: how to avoid leading white spaces "Octavian Rasnita" <orasnita@gmail.com> - 2011-06-06 11:51 +0300
Re: how to avoid leading white spaces Chris Angelico <rosuav@gmail.com> - 2011-06-06 19:01 +1000
Re: how to avoid leading white spaces rusi <rustompmody@gmail.com> - 2011-06-06 07:33 -0700
Re: how to avoid leading white spaces "rurpy@yahoo.com" <rurpy@yahoo.com> - 2011-06-07 11:37 -0700
Re: how to avoid leading white spaces Roy Smith <roy@panix.com> - 2011-06-07 20:30 -0400
Re: how to avoid leading white spaces "rurpy@yahoo.com" <rurpy@yahoo.com> - 2011-06-08 07:38 -0700
Re: how to avoid leading white spaces rusi <rustompmody@gmail.com> - 2011-06-08 09:14 -0700
Re: how to avoid leading white spaces rusi <rustompmody@gmail.com> - 2011-06-08 01:27 -0700
Re: how to avoid leading white spaces Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-06 15:29 +0000
Re: how to avoid leading white spaces Ian Kelly <ian.g.kelly@gmail.com> - 2011-06-06 10:06 -0600
Re: how to avoid leading white spaces "rurpy@yahoo.com" <rurpy@yahoo.com> - 2011-06-07 09:00 -0700
Re: how to avoid leading white spaces Duncan Booth <duncan.booth@invalid.invalid> - 2011-06-08 09:01 +0000
Re: how to avoid leading white spaces "rurpy@yahoo.com" <rurpy@yahoo.com> - 2011-06-08 07:39 -0700
Re: how to avoid leading white spaces rusi <rustompmody@gmail.com> - 2011-06-05 04:17 -0700
csiph-web