Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #51203
| References | (8 earlier) <CAPTjJmqBs0PfQ5ahfURtSyCimPnFU-1HpFndO8h6T+Vbkj10Bg@mail.gmail.com> <51EFEC17.90303@gmail.com> <ksp493$m5g$1@ger.gmane.org> <mailman.5059.1374689751.3114.python-list@python.org> <51f0d0a0$0$29971$c3e8da3$5496439d@news.astraweb.com> |
|---|---|
| Date | 2013-07-25 17:58 +1000 |
| Subject | Re: RE Module Performance |
| From | Chris Angelico <rosuav@gmail.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.5084.1374739093.3114.python-list@python.org> (permalink) |
On Thu, Jul 25, 2013 at 5:15 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> On Thu, 25 Jul 2013 04:15:42 +1000, Chris Angelico wrote:
>
>> If nobody had ever thought of doing a multi-format string
>> representation, I could well imagine the Python core devs debating
>> whether the cost of UTF-32 strings is worth the correctness and
>> consistency improvements... and most likely concluding that narrow
>> builds get abolished. And if any other language (eg ECMAScript) decides
>> to move from UTF-16 to UTF-32, I would wholeheartedly support the move,
>> even if it broke code to do so.
>
> Unfortunately, so long as most language designers are European-centric,
> there is going to be a lot of push-back against any attempt to fix (say)
> Javascript, or Java just for the sake of "a bunch of dead languages" in
> the SMPs. Thank goodness for emoji. Wait til the young kids start
> complaining that their emoticons and emoji are broken in Javascript, and
> eventually it will get fixed. It may take a decade, for the young kids to
> grow up and take over Javascript from the old-codgers, but it will happen.
I don't know that that'll happen like that. Emoticons aren't broken in
Javascript - you can use them just fine. You only start seeing
problems when you index into that string. People will start to wonder
why, for instance, a "500 character maximum" field deducts two from
the limit when an emoticon goes in. Example:
Type here:<br><textarea id=content oninput="showlimit(this)"></textarea>
<br>You have <span id=limit1>500</span> characters left (self.value.length).
<br>You have <span id=limit2>500</span> characters left (self.textLength).
<script>
function showlimit(self)
{
document.getElementById("limit1").innerHTML=500-self.value.length;
document.getElementById("limit2").innerHTML=500-self.textLength;
}
</script>
I've included an attribute documented here[1] as the "codepoint length
of the control's value", but in Chrome on Windows, it still counts
UTF-16 code units. However, I very much doubt that this will result in
language changes. People will just live with it. Chinese and Japanese
users will complain, perhaps, and the developers will write it off as
whinging, and just say "That's what the internet does". Maybe, if
you're really lucky, they'll acknowledge that "that's what JavaScript
does", but even then I doubt it'd result in language changes.
>> To my mind, exposing UTF-16 surrogates
>> to the application is a bug to be fixed, not a feature to be maintained.
>
> This, times a thousand.
>
> It is *possible* to have non-buggy string routines using UTF-16, but the
> implementation is a lot more complex than most language developers can be
> bothered with. I'm not aware of any language that uses UTF-16 internally
> that doesn't give wrong results for surrogate pairs.
The problem isn't the underlying representation, the problem is what
gets exposed to the application. Once you've decided to expose
codepoints to the app (abstracting over your UTF-16 underlying
representation), the change to using UTF-32, or mimicking PEP 393, or
some other structure, is purely internal and an optimization. So I
doubt any language will use UTF-16 internally and UTF-32 to the app.
It'd be needlessly complex.
ChrisA
[1] https://developer.mozilla.org/en-US/docs/Web/API/HTMLTextAreaElement
Back to comp.lang.python | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-11 19:44 -0400
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-12 02:23 -0700
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-12 19:27 +1000
Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-12 10:39 +0100
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-12 19:40 +1000
Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-12 06:45 -0400
Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-12 16:59 +0100
Re: RE Module Performance Peter Otten <__peter__@web.de> - 2013-07-12 18:15 +0200
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-13 02:21 +1000
Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-12 13:58 -0400
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-13 05:37 +0000
Re: RE Module Performance 88888 Dihedral <dihedral88888@gmail.com> - 2013-07-14 11:17 -0700
Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-15 06:06 -0400
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-15 12:36 +0000
Dihedral Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-15 08:52 -0400
Re: Dihedral Joel Goldstick <joel.goldstick@gmail.com> - 2013-07-15 09:03 -0400
Re: Dihedral Wayne Werner <wayne@waynewerner.com> - 2013-07-15 17:43 -0500
Re: Dihedral Fábio Santos <fabiosantosart@gmail.com> - 2013-07-15 23:54 +0100
Re: Dihedral Chris Angelico <rosuav@gmail.com> - 2013-07-16 08:59 +1000
Re: Dihedral Tim Delaney <timothy.c.delaney@gmail.com> - 2013-07-16 16:06 +1000
Re: Dihedral Stefan Behnel <stefan_ml@behnel.de> - 2013-07-24 20:08 +0200
Re: Dihedral Chris Angelico <rosuav@gmail.com> - 2013-07-25 04:23 +1000
Re: Dihedral Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-07-24 20:15 -0400
Re: RE Module Performance Tim Delaney <timothy.c.delaney@gmail.com> - 2013-07-13 08:16 +1000
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-12 17:13 -0600
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-24 06:40 -0700
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-24 23:48 +1000
Re: RE Module Performance David Hutto <dwightdhutto@gmail.com> - 2013-07-24 10:17 -0400
Re: RE Module Performance David Hutto <dwightdhutto@gmail.com> - 2013-07-24 10:19 -0400
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 00:34 +1000
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 07:02 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 17:39 +1000
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-24 08:47 -0600
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-25 02:27 -0700
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 20:14 +1000
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-25 12:07 -0700
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-26 05:18 +1000
RE: RE Module Performance "Prasad, Ramit" <ramit.prasad@jpmorgan.com> - 2013-07-25 19:30 +0000
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-25 21:06 -0600
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-24 09:00 -0600
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 05:56 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 00:56 +1000
Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-24 13:52 -0400
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 04:15 +1000
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 07:15 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 17:58 +1000
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 09:22 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 20:07 +1000
Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-24 18:09 -0400
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 08:19 +1000
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-24 16:59 -0600
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 09:24 +1000
Re: RE Module Performance Serhiy Storchaka <storchaka@gmail.com> - 2013-07-25 08:49 +0300
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 15:58 +1000
Re: RE Module Performance Jeremy Sanders <jeremy@jeremysanders.net> - 2013-07-25 14:36 +0100
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 15:26 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-26 01:36 +1000
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 17:18 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-26 03:27 +1000
Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-25 15:45 -0500
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-26 02:48 +0000
Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-25 21:20 -0600
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-26 06:36 -0700
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-26 08:46 -0700
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-27 06:28 +0000
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-27 03:37 +0000
Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-26 22:12 -0600
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-27 05:04 +0000
Re: RE Module Performance Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-07-27 12:13 -0400
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-26 06:19 -0700
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-25 21:09 -0600
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-26 06:21 -0700
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-26 20:05 -0600
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-27 11:21 -0700
Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-27 21:53 -0600
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-28 11:13 -0700
Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-28 20:04 +0100
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-28 12:30 -0700
Re: RE Module Performance Lele Gaifax <lele@metapensiero.it> - 2013-07-28 22:45 +0200
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-28 22:01 +0200
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-30 07:01 -0700
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-30 16:38 +0200
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-30 15:45 +0100
Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-30 17:13 +0100
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-30 18:39 +0200
Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-30 18:14 +0100
Re: RE Module Performance Neil Hodgson <nhodgson@iinet.net.au> - 2013-07-31 13:09 +1000
Re: RE Module Performance Tim Delaney <timothy.c.delaney@gmail.com> - 2013-07-31 03:27 +1000
Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-30 18:40 +0100
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-30 20:19 +0200
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-30 12:09 -0700
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-30 21:04 +0100
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-30 21:54 -0600
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-31 05:45 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-31 08:17 +0100
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-31 13:15 -0700
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-31 21:41 +0100
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-31 10:11 +0200
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-31 01:32 -0700
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-31 10:59 +0200
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-31 08:44 -0600
Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-30 17:05 -0400
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-30 21:30 -0600
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-31 09:23 +0200
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-31 08:27 -0600
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-28 10:45 +0200
FSR and unicode compliance - was Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-28 09:52 -0600
Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-28 12:23 -0700
Re: FSR and unicode compliance - was Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-28 20:44 +0100
Re: FSR and unicode compliance - was Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-28 21:55 +0200
Re: FSR and unicode compliance - was Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-28 20:52 +0000
Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 04:43 -0700
Re: FSR and unicode compliance - was Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-29 12:57 +0100
Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 05:56 -0700
Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 07:20 -0700
Re: FSR and unicode compliance - was Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-29 15:49 +0100
Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 09:31 -0700
Re: FSR and unicode compliance - was Re: RE Module Performance Heiko Wundram <modelnine@modelnine.org> - 2013-07-29 14:06 +0200
Re: FSR and unicode compliance - was Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-29 08:43 -0400
Re: FSR and unicode compliance - was Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-28 18:03 +0100
Re: FSR and unicode compliance - was Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-28 13:36 -0400
Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 06:36 -0700
Re: FSR and unicode compliance - was Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-28 19:03 +0100
Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-28 19:19 +0100
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-28 19:29 +0100
Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-28 15:06 -0400
Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-28 23:14 +0100
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-28 20:51 +0200
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-29 00:07 +0100
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-26 22:38 +0200
Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-25 09:44 -0400
Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-25 15:53 -0500
Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-13 00:16 +0100
Re: RE Module Performance Tim Delaney <timothy.c.delaney@gmail.com> - 2013-07-14 05:34 +1000
Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-16 06:30 -0400
Re: RE Module Performance 88888 Dihedral <dihedral88888@gmail.com> - 2013-07-18 13:17 -0700
csiph-web