| Newsgroups | perl.unicode |
|---|---|
| Subject | Re: UTF-8 encoding & decoding |
| References | <20160505143719.GA6420@pali> |
| Message-ID | <572CB711.2060307@khwilliamson.com> |
| Date | 2016-05-06 09:24 -0600 |
| From | public@khwilliamson.com (Karl Williamson) |
On 05/05/2016 08:37 AM, Pali Rohár wrote:
> Hi!
>
> I thought that I understood UTF-8 encoding/decoding done in perl until I
> looked into the source code of the Encode package... (exactly sub encode_utf8)
>
> Before... I only read description of Encode package (not source code):
> https://metacpan.org/pod/Encode#UTF-8-vs.-utf8-vs.-UTF8
>
> I tried to find some more information (ideally those which answer my
> question) but without success. Can you help me? My questions are:
>
> 1. What is the difference between those two calls?
>
> utf8::encode($str);
>
> and
>
> $str = Encode::encode('utf8', $str);
>
> 2. What is the difference between those?
>
> utf8::decode($str);
> $str = Encode::decode_utf8($str);
Each pair of functions is supposed to do essentially the same thing. I
have not studied them to know what subtle differences there may be.
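
As a rough illustration of "essentially the same thing", here is a minimal sketch that runs both pairs on one simple, well-formed input. The sample string and variable names are my own, chosen for illustration. It only shows that the results agree for this input and that the calling conventions differ (in place versus returning a new string); it does not probe malformed input or other edge cases where subtle differences might appear.

```perl
use strict;
use warnings;
use Encode ();

# A character string with one non-ASCII character (U+00E9).
my $chars = "caf\x{e9}";

# utf8::encode converts its argument in place.
my $in_place = $chars;
utf8::encode($in_place);

# Encode::encode returns a new octet string and, with no CHECK
# argument, leaves its input alone.
my $returned = Encode::encode('utf8', $chars);

print $in_place eq $returned ? "same octets\n" : "different octets\n";
print length($in_place), "\n";    # 5 octets: c a f 0xC3 0xA9

# utf8::decode also works in place and returns a success flag;
# Encode::decode_utf8 returns the decoded character string.
my $octets  = $returned;
my $ok      = utf8::decode($octets);
my $decoded = Encode::decode_utf8($returned);

print $octets eq $decoded ? "same characters\n" : "different characters\n";
```

For this input both comparisons report agreement.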
>
> 3. Where is the implementation of the utf8::encode/decode functions? It is not
> in utf8.pm, nor in utf8_heavy.pl, and also not in unicore/Heavy.pl. And
> what are those functions doing?
The implementation is in universal.c. But these are just wrappers for
sv_utf8_encode and sv_utf8_decode, which are implemented in sv.c. Their
documentation is in perlapi. It should match the documentation of
utf8::decode and utf8::encode, which is in utf8.pm. (I myself have a
hard time mapping the names chosen for these operations to what they
actually do.)
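
Since the names are easy to mix up, here is a small sketch of the direction each function works in, as I understand it from the documentation: utf8::encode turns characters into UTF-8 octets, and utf8::decode turns UTF-8 octets back into characters, both in place. The sample strings are made up for illustration.

```perl
use strict;
use warnings;

my $s = "\x{263A}";                 # one character, U+263A
my $len_chars = length($s);         # 1

utf8::encode($s);                   # characters -> UTF-8 octets, in place
my $len_octets = length($s);        # 3 (0xE2 0x98 0xBA)

my $ok = utf8::decode($s);          # octets -> characters, in place
my $len_back = length($s);          # 1 again

# An octet string that is not valid UTF-8 (a lone 0xE9) is left
# unchanged, and utf8::decode returns false.
my $not_utf8 = "caf\x{e9}";
my $ok_bad   = utf8::decode($not_utf8);

print "$len_chars $len_octets $len_back\n";
print $ok     ? "decoded\n"     : "not decoded\n";
print $ok_bad ? "decoded\n"     : "not decoded\n";
```

The length going from 1 to 3 and back to 1 is the quickest way I know to check which direction a call went.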
>