| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Newsgroups | comp.lang.c |
| Subject | Re: u8"" c11 c23 |
| Date | Mon, 15 Dec 2025 11:13:21 -0800 |
| Organization | A noiseless patient Spider |
| Message-ID | <86h5trtv72.fsf@linuxsc.com> (permalink) |
| References | <10d5vck$3kufd$1@dont-email.me> <875xc9p674.fsf@example.invalid> |
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
> Thiago Adams <thiago.adams@gmail.com> writes:
>
>> Speaking of signed vs. unsigned,
>>
>> u8"a" in C11 had the type char [N]. Normally char is signed
>
> I would have said "commonly" rather than "normally". Not an
> important point.
>
>> In C23 it is char8_t [N], where char8_t is unsigned.
>>
>> When converting code from C11 to C23 we get an error here:
>> const char* s = u8"";
>>
>>
>> I generally cast "char*" to "unsigned char*" when handling
>> something with UTF-8. I do not use u8""; I just use " " with
>> UTF-8 encoded source code and I just assume const char* is UTF-8.
>
> That raises another issue.
>
> The <uchar.h> header was introduced in C11. In C11 and C17,
> that header defines char16_t and char32_t. C23 introduces char8_t.
>
> There doesn't seem to be any way, other than checking the value of
> __STDC_VERSION__, to determine whether char8_t is defined or not.
> There are no *_MIN or *_MAX macros for these types, either in
> <uchar.h> or in <limits.h>. A test program I just wrote would have
> been a little simpler if I could have used `#ifdef CHAR8_MAX`.
>
> Here's the test program:
>
>     #include <stdio.h>
>     #include <uchar.h>
>
>     #define TYPEOF(x) \
>         (_Generic(x, \
>             char: "char", \
>             signed char: "signed char", \
>             unsigned char: "unsigned char", \
>             short: "short", \
>             unsigned short: "unsigned short", \
>             int: "int", \
>             unsigned int: "unsigned int", \
>             long: "long", \
>             unsigned long: "unsigned long", \
>             long long: "long long", \
>             unsigned long long: "unsigned long long"))
>
>     int main(void) {
>         printf("__STDC_VERSION__ = %ldL\n", __STDC_VERSION__);
>         printf("u8\"a\"[0] is of type %s\n",
>                TYPEOF(u8"a"[0]));
>     #if __STDC_VERSION__ >= 202311L
>         printf("char8_t is %s\n", TYPEOF((char8_t)0));
>     #endif
>         printf("char16_t is %s\n", TYPEOF((char16_t)0));
>         printf("char32_t is %s\n", TYPEOF((char32_t)0));
>     }
>
> Its output with `gcc -std=c17`:
>
>     __STDC_VERSION__ = 201710L
>     u8"a"[0] is of type char
>     char16_t is unsigned short
>     char32_t is unsigned int
>
> Its output with `gcc -std=c23`:
>
>     __STDC_VERSION__ = 202311L
>     u8"a"[0] is of type unsigned char
>     char8_t is unsigned char
>     char16_t is unsigned short
>     char32_t is unsigned int

Since C23 defines char8_t to be the same type as unsigned char,
it seems better to just define it when it isn't there:

    #include <limits.h>

    #if CHAR_BIT == 8 && __STDC_VERSION__ < 202311
      typedef unsigned char char8_t;
    #endif