From: Alexis
Newsgroups: comp.lang.c
Subject: Locales [was: Re: Rationale for aligning data on even bytes in a Unix shell file?]
Date: Tue, 29 Apr 2025 20:50:14 +1000

Bonita Montero writes:

> UTF-8 has a locale, the chars between 128 and 255 have the locale Latin 1.

Since, as far as I can tell, no-one else has yet noted this:

The 'locale' includes the part _before_ the '.'. In the output you
shared, the locale is set to "C", the C/POSIX locale, using the UTF-8
'codeset' (i.e. encoding). In my own case, my locale is
en_AU[.UTF-8], the locale for the English language in Australia:

    $ locale
    LANG=en_AU.UTF-8
    LC_CTYPE="en_AU.UTF-8"
    LC_NUMERIC="en_AU.UTF-8"
    LC_TIME=en_AU.UTF-8
    ...

Informally, when people refer to the 'locale', they're usually talking
about the part before the '.', not the encoding, e.g. "pl_PL" or
"es_EH". The glibc manual has a section that goes into the details:

> Most locale names follow XPG syntax and consist of up to four parts:
>
>     language[_territory[.codeset]][@modifier]

-- https://www.gnu.org/software/libc/manual/html_node/Locale-Names.html


Alexis.
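
To make the quoted syntax concrete, here is a minimal sketch in C that splits a locale name into its four parts. The struct locale_parts and the function split_locale_name are hypothetical names made up for this illustration; they are not part of glibc or POSIX. The sketch is also more lenient than the quoted grammar: it accepts a codeset without a territory, as in "C.UTF-8".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type (not a libc type): the four parts of an
   XPG-style locale name. A field is an empty string when absent. */
struct locale_parts {
    char language[32];
    char territory[32];
    char codeset[32];
    char modifier[32];
};

/* Copy at most dstsize-1 bytes of src[0..len) into dst, then terminate. */
static void copy_span(char *dst, size_t dstsize, const char *src, size_t len)
{
    if (len >= dstsize)
        len = dstsize - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Split name following language[_territory[.codeset]][@modifier].
   Works right to left: modifier, then codeset, then territory. */
void split_locale_name(const char *name, struct locale_parts *out)
{
    size_t end = strlen(name);
    const char *at, *dot, *us;

    memset(out, 0, sizeof *out);

    /* Everything after '@' is the modifier. */
    at = strchr(name, '@');
    if (at != NULL) {
        copy_span(out->modifier, sizeof out->modifier, at + 1, strlen(at + 1));
        end = (size_t)(at - name);
    }

    /* Everything between '.' and the modifier is the codeset. */
    dot = memchr(name, '.', end);
    if (dot != NULL) {
        size_t pos = (size_t)(dot - name);
        copy_span(out->codeset, sizeof out->codeset, dot + 1, end - pos - 1);
        end = pos;
    }

    /* Everything between '_' and the codeset is the territory. */
    us = memchr(name, '_', end);
    if (us != NULL) {
        size_t pos = (size_t)(us - name);
        copy_span(out->territory, sizeof out->territory, us + 1, end - pos - 1);
        end = pos;
    }

    /* Whatever is left at the front is the language. */
    copy_span(out->language, sizeof out->language, name, end);
}
```

Calling split_locale_name("en_AU.UTF-8", &p) leaves "en" in p.language, "AU" in p.territory and "UTF-8" in p.codeset, which matches the informal usage above: the "locale" people usually mean is the language and territory, and the codeset is a separate part.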