| From | Roy Smith <roy@panix.com> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: String performance regression from python 3.2 to 3.3 |
| Date | 2013-03-16 09:29 -0400 |
| Organization | PANIX Public Access Internet and UNIX, NYC |
| Message-ID | <roy-095055.09290116032013@news.panix.com> |
| References | (5 earlier) <mailman.3278.1363220353.2939.python-list@python.org> <2202673.rtQqbKup0V@PointedEars.de> <ki0qf2$s69$1@ger.gmane.org> <mailman.3352.1363406999.2939.python-list@python.org> <51440235$0$29965$c3e8da3$5496439d@news.astraweb.com> |
In article <51440235$0$29965$c3e8da3$5496439d@news.astraweb.com>,
Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:

> UTF-32 is a *fixed width* storage mechanism where every code point takes
> exactly four bytes. Since the entire Unicode range will fit in four
> bytes, that ensures that every code point is covered, and there is no
> need to walk the string every time you perform an indexing operation. But
> it means that if you're one of the 99.9% of users who mostly use
> characters in the BMP, your strings take twice as much space as
> necessary. If you only use Latin1 or ASCII, your strings take four times
> as much space as necessary.

I suspect that eventually, UTF-32 will win out. I'm not sure when
"eventually" is, but maybe sometime in the next 10-20 years.

When I was starting out, the computer industry had a variety of character
encodings designed to take up less than 8 bits per character. Sixbit,
Rad-50, BCD, and so on. Each of these added complexity and took away
character set richness, but saved a few bits. At the time, memory was so
expensive and so precious, it was worth it.

Over the years, memory became cheaper, address spaces grew from 16 to 32
to 64 bits, and the pressure to use richer character sets kept increasing.

So, now we're at the point where people are (mostly) using Unicode, but
are still arguing about which encoding to use because the "best"
complexity/space tradeoff isn't obvious.

At some point in the future, memory will be so cheap, and so ubiquitous,
that people will be wondering why us neanderthals bothered worrying about
trying to save 16 bits per character.

Of course, by then, we'll be migrating to Mongocode and arguing about
UTF-64 :-)
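
For anyone who wants to see the space tradeoff from the quoted paragraph in actual numbers, here is a rough sketch. It is not from the original posts: the helper name `encoded_sizes` and the sample strings are made up for illustration, and it only measures encoded byte counts, not how CPython stores strings internally. The `-le` codec variants are used so that no byte order mark is added to the counts.

```python
def encoded_sizes(text):
    """Return the number of bytes `text` occupies in several encodings.

    Hypothetical helper for illustration only. The value for "latin-1"
    is None when the text cannot be represented in Latin-1 at all.
    """
    sizes = {
        # Fixed width: always 4 bytes per code point.
        "utf-32": len(text.encode("utf-32-le")),
        # 2 bytes per BMP code point, 4 bytes (a surrogate pair) otherwise.
        "utf-16": len(text.encode("utf-16-le")),
        # 1 to 4 bytes per code point.
        "utf-8": len(text.encode("utf-8")),
    }
    try:
        # 1 byte per code point, but only covers U+0000 to U+00FF.
        sizes["latin-1"] = len(text.encode("latin-1"))
    except UnicodeEncodeError:
        sizes["latin-1"] = None
    return sizes


samples = {
    "ASCII": "hello world",          # 11 code points
    "BMP (Cyrillic)": "привет мир",  # 10 code points
    "astral": "\U0001F600" * 3,      # 3 code points outside the BMP
}

for label, text in samples.items():
    print(label, len(text), encoded_sizes(text))
```

With these samples, the ASCII string needs 44 bytes in UTF-32 against 11 in Latin-1 (four times as much), and the Cyrillic string needs 40 bytes in UTF-32 against 20 in UTF-16 (twice as much). Only the astral sample comes out the same size (12 bytes) in UTF-32, UTF-16 and UTF-8.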