| Newsgroups | perl.unicode |
|---|---|
| Subject | Re: Encode UTF-8 optimizations |
| References | <201608121731.32716@pali> <20160822130518.GA9176@pali> <62e8d4d6-b474-b037-5d77-5f67d3e20371@khwilliamson.com> <201608222247.46077@pali> |
| Message-ID | <4e654c06-2b51-31a8-c2ab-f98c4dcf421d@khwilliamson.com> (permalink) |
| Date | 2016-08-22 15:19 -0600 |
| From | public@khwilliamson.com (Karl Williamson) |
On 08/22/2016 02:47 PM, pali@cpan.org wrote:
>> > And I think you misunderstand when is_utf8_char_slow() is called. It is
>> > called only when the next byte in the input indicates that the only
>> > legal UTF-8 that might follow would be for a code point that is at least
>> > U+200000, almost twice as high as the highest legal Unicode code point.
>> > It is a Perl extension to handle such code points, unlike other
>> > languages. But the Perl core is not optimized for them, nor will it be.
>> > My point is that is_utf8_char_slow() will only be called in very
>> > specialized cases, and we need not make those cases have as good a
>> > performance as normal ones.
> In strict mode, there is absolutely no need to call is_utf8_char_slow(). As in strict
> mode such sequence must be always invalid (it is above last valid Unicode character)
> This is what I tried to tell.
>
> And currently is_strict_utf8_string_loc() first calls isUTF8_CHAR() (which could call
> is_utf8_char_slow()) and after that is check for UTF8_IS_SUPER().

I only have time to respond to this portion just now.

The code could be tweaked to call UTF8_IS_SUPER first, but I'm asserting that an optimizing compiler will see that any call to is_utf8_char_slow() is pointless, and will optimize it out.
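
The two orderings discussed above can be illustrated with a small, self-contained C sketch. It is not the Perl core code: every function and variable name below is a hypothetical stand-in (for example, `sketch_char_slow()` only mimics the role of is_utf8_char_slow(), and `sketch_is_super()` the role of UTF8_IS_SUPER), the slow path handles only 5- and 6-byte forms and ignores overlongs, and the "strict" check models only above-Unicode and surrogate rejection, not noncharacters. The call counter is there purely to make the difference observable. Because it is a side effect, it also prevents the compiler from eliminating the call, so the sketch shows only the effect of the explicit reordering, not of the optimization Karl describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

static int slow_calls = 0;   /* counts entries into the slow path */

/* Hypothetical stand-in for the slow path: Perl-extended 5- and 6-byte forms only. */
static size_t sketch_char_slow(const unsigned char *s, const unsigned char *e)
{
    size_t len, i;
    slow_calls++;
    if (s[0] < 0xFC) len = 5;
    else if (s[0] < 0xFE) len = 6;
    else return 0;                      /* longer forms are left out of this sketch */
    if ((size_t)(e - s) < len) return 0;
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80) return 0;
    return len;
}

/* Length of the (lax, Perl-extended) character at s, or 0 if malformed. */
static size_t sketch_char_len(const unsigned char *s, const unsigned char *e)
{
    size_t len, i;
    if (s >= e) return 0;
    if (s[0] < 0x80) return 1;
    if (s[0] >= 0xF8) return sketch_char_slow(s, e);   /* U+200000 and above */
    if (s[0] >= 0xF0) len = 4;
    else if (s[0] >= 0xE0) len = 3;
    else if (s[0] >= 0xC2) len = 2;
    else return 0;                      /* stray continuation or overlong C0/C1 */
    if ((size_t)(e - s) < len) return 0;
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80) return 0;
    if (s[0] == 0xE0 && s[1] < 0xA0) return 0;          /* overlong 3-byte form */
    if (s[0] == 0xF0 && s[1] < 0x90) return 0;          /* overlong 4-byte form */
    return len;
}

/* True if the sequence at s can only encode a code point above U+10FFFF. */
static int sketch_is_super(const unsigned char *s, const unsigned char *e)
{
    return s[0] >= 0xF5 || (s[0] == 0xF4 && e - s > 1 && s[1] >= 0x90);
}

/* True if the sequence at s starts a surrogate (U+D800..U+DFFF). */
static int sketch_is_surrogate(const unsigned char *s, const unsigned char *e)
{
    return s[0] == 0xED && e - s > 1 && s[1] >= 0xA0;
}

/* Ordering described in the thread: well-formedness first, range check second. */
static int strict_check_wellformed_first(const unsigned char *s, const unsigned char *e)
{
    while (s < e) {
        size_t len = sketch_char_len(s, e);
        if (len == 0 || sketch_is_super(s, e) || sketch_is_surrogate(s, e)) return 0;
        s += len;
    }
    return 1;
}

/* Proposed tweak: reject above-Unicode sequences before looking any further. */
static int strict_check_super_first(const unsigned char *s, const unsigned char *e)
{
    while (s < e) {
        size_t len;
        if (sketch_is_super(s, e)) return 0;
        len = sketch_char_len(s, e);
        if (len == 0 || sketch_is_surrogate(s, e)) return 0;
        s += len;
    }
    return 1;
}

/* Runs both orderings on "a" followed by the 5-byte form of U+200000. */
static void sketch_demo(void)
{
    static const unsigned char in[] = { 'a', 0xF8, 0x88, 0x80, 0x80, 0x80 };
    const unsigned char *end = in + sizeof in;

    slow_calls = 0;
    assert(!strict_check_wellformed_first(in, end));
    printf("well-formed check first: %d slow call(s)\n", slow_calls);

    slow_calls = 0;
    assert(!strict_check_super_first(in, end));
    printf("above-Unicode check first: %d slow call(s)\n", slow_calls);
}
```

Calling `sketch_demo()` from a `main()` runs both variants on the same input. Both reject it; the first enters the slow path once before rejecting, while the second never enters it.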
Re: Encode UTF-8 optimizations pali@cpan.org - 2016-08-12 17:31 +0200
Re: Encode UTF-8 optimizations public@khwilliamson.com (Karl Williamson) - 2016-08-18 23:06 -0600
Re: Encode UTF-8 optimizations pali@cpan.org - 2016-08-19 10:42 +0200
Re: Encode UTF-8 optimizations public@khwilliamson.com (Karl Williamson) - 2016-08-20 19:10 -0600
Re: Encode UTF-8 optimizations pagaltzis@gmx.de (Aristotle Pagaltzis) - 2016-08-21 04:33 +0200
Re: Encode UTF-8 optimizations public@khwilliamson.com (Karl Williamson) - 2016-08-20 20:55 -0600
Re: Encode UTF-8 optimizations pali@cpan.org - 2016-08-21 10:34 +0200
Re: Encode UTF-8 optimizations public@khwilliamson.com (Karl Williamson) - 2016-08-21 08:49 -0600
Re: Encode UTF-8 optimizations pali@cpan.org - 2016-08-22 15:05 +0200
Re: Encode UTF-8 optimizations public@khwilliamson.com (Karl Williamson) - 2016-08-22 13:43 -0600
Re: Encode UTF-8 optimizations pali@cpan.org - 2016-08-22 22:47 +0200
Re: Encode UTF-8 optimizations public@khwilliamson.com (Karl Williamson) - 2016-08-22 15:19 -0600
Re: Encode UTF-8 optimizations public@khwilliamson.com (Karl Williamson) - 2016-08-22 15:38 -0600
Re: Encode UTF-8 optimizations pali@cpan.org - 2016-08-22 23:45 +0200
Re: Encode UTF-8 optimizations pali@cpan.org - 2016-08-22 23:39 +0200
Re: Encode UTF-8 optimizations public@khwilliamson.com (Karl Williamson) - 2016-08-24 22:49 -0600
Re: Encode UTF-8 optimizations pali@cpan.org - 2016-08-25 09:48 +0200
Re: Encode UTF-8 optimizations public@khwilliamson.com (Karl Williamson) - 2016-08-29 09:00 -0600
Re: Encode UTF-8 optimizations pali@cpan.org - 2016-08-31 23:43 +0200
Re: Encode UTF-8 optimizations public@khwilliamson.com (Karl Williamson) - 2016-08-31 21:27 -0600
Re: Encode UTF-8 optimizations pali@cpan.org - 2016-09-01 09:30 +0200
Re: Encode UTF-8 optimizations pali@cpan.org - 2016-09-25 12:06 +0200
Re: Encode UTF-8 optimizations public@khwilliamson.com (Karl Williamson) - 2016-09-25 10:49 -0600
Re: Encode UTF-8 optimizations pali@cpan.org - 2016-10-27 10:25 +0200
Re: Encode UTF-8 optimizations pali@cpan.org - 2016-11-01 10:53 +0100