Re: Simple(?) Unicode questions

From Keith Thompson <Keith.S.Thompson+u@gmail.com>
Newsgroups comp.lang.c
Subject Re: Simple(?) Unicode questions
Date 2023-12-09 13:46 -0800
Organization None to speak of
Message-ID <877clnxd5o.fsf@nosuchdomain.example.com> (permalink)
References <ul13hl$24kg5$1@dont-email.me> <=H=fRiU4BbThlUWDM@bongo-ra.co>


Spiros Bousbouras <spibou@gmail.com> writes:
[...]
> If I want to use directly unicode codepoints I will encode them as
> unsigned long  which is guaranteed to be wide enough to cover the whole
> range of codepoints values ; in contrast , it is conforming for  wchar_t  
> to cover no greater range than  char.
[...]

The C standard requires wchar_t to be: "an integer type whose range of
values can represent distinct codes for all members of the largest
extended character set specified among the supported locales".

Yes, it's conforming for wchar_t to cover a range no wider than char,
but only if the implementation has no extended character sets wider than
char.

On Linux-based systems, wchar_t is typically 32 bits, more than enough
to cover all Unicode codepoints.  On Windows, however, wchar_t is
generally only 16 bits, which (I think) is non-conforming, since 16 bits
cannot give a distinct code to every Unicode codepoint.
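
A quick way to see which case a given implementation falls into is to
look at WCHAR_MAX.  This is only a sketch; the helper names wchar_bits
and wchar_holds_all_unicode are made up for illustration, and the
result says nothing about which locales the implementation supports.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Width of wchar_t in bits (including padding bits, if any). */
static unsigned wchar_bits(void)
{
    return (unsigned)(sizeof(wchar_t) * CHAR_BIT);
}

/* Returns 1 if every Unicode codepoint (0 .. 0x10FFFF) fits in a
   single wchar_t value, 0 otherwise. */
static int wchar_holds_all_unicode(void)
{
    return (uintmax_t)WCHAR_MAX >= 0x10FFFFu;
}
```

With a 32-bit wchar_t the second function returns 1; with a 16-bit
wchar_t it returns 0.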

(Microsoft started to support Unicode when the Unicode standard specified
only up to 2**16 codepoints, so UCS-2 was sufficient.  When Unicode
expanded beyond the Basic Multilingual Plane, Microsoft handled it by
supporting UTF-16, a variable-length encoding composed of 16-bit code
units.  Inertia made it too difficult to expand wchar_t from 16 to 32
bits.)
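
To make "variable-length" concrete, here is a minimal sketch of how
UTF-16 encodes a single codepoint: anything in the Basic Multilingual
Plane takes one 16-bit unit, and anything above it is split into a
surrogate pair.  The function name utf16_encode and its return
convention are my own choices for illustration, not any library's API.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode codepoint as UTF-16.
   Writes 1 or 2 16-bit units to out[] and returns how many were
   written.  Returns 0 if cp is not encodable (a surrogate value,
   or greater than 0x10FFFF). */
static size_t utf16_encode(uint_least32_t cp, uint_least16_t out[2])
{
    if (cp > 0x10FFFFu || (cp >= 0xD800u && cp <= 0xDFFFu))
        return 0;

    if (cp < 0x10000u) {
        /* Basic Multilingual Plane: one unit, value unchanged. */
        out[0] = (uint_least16_t)cp;
        return 1;
    }

    /* Supplementary planes: subtract 0x10000, leaving 20 bits, and
       split them into two 10-bit halves. */
    cp -= 0x10000u;
    out[0] = (uint_least16_t)(0xD800u + (cp >> 10));    /* high surrogate */
    out[1] = (uint_least16_t)(0xDC00u + (cp & 0x3FFu)); /* low surrogate */
    return 2;
}
```

For example, U+20AC comes out as the single unit 0x20AC, while U+1F600
comes out as the pair 0xD83D 0xDE00, which is why one 16-bit wchar_t
cannot hold it.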

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */
