From: Keith Thompson
Newsgroups: comp.lang.c
Subject: Re: The integral type 'byte' (was Re: Suggested method for returning a string from a C program?)
Date: Mon, 24 Mar 2025 15:57:55 -0700
Message-ID: <874izi82a4.fsf@nosuchdomain.example.com>

Janis Papanagnou writes:
> On 21.03.2025 00:10, Keith Thompson wrote:
>> bart writes:
>>> [...]
>>> Look at this one for example:
>>>
>>>   typedef uint8_t byte;  // from arduino.h
>>>
>>> I can think of only one reason this exists, which is that 'byte' is
>>> a far nicer denotation.
>>
>> I agree in this case.  "byte" documents what the type is intended for.
>
> I disagree on both above expressed opinions in more than one way.
>
> Byte is a bad term to denote a quantity or an intention. Formerly
> a "Byte" was used to carry characters; its size could be anything
> from 5 to 9 bit. There was a reason why in international standards
> documents there's the 'octet' introduced to unambiguously hint to
> an 8-bit quantity.
> Neither is it good, as we see in practice, to
> assume a 'byte' (whatever it actually is) to be able to carry a
> character, not even 'char' or 'unsigned char' seem to be able to
> accomplish that given the "wide character" types in the context of
> Unicode (16 bit, 32 bit) characters and (variable-length) UTF-8
> encodings.

"Byte" is a defined term in C.  The definition is "addressable unit
of data storage large enough to hold any member of the basic character
set of the execution environment", but other parts of the standard
make it clear that sizeof yields a count of bytes, that there are
CHAR_BIT bits in a byte, that types char, unsigned char, and signed
char are all one byte in size, and that CHAR_BIT is at least 8 but
can be bigger.

I'm aware of the history, but if I defined a "byte" type in C that's
what I would mean.

It is IMHO unfortunate that "bytes" and "characters" are conflated
in C.  This was done before multi-byte or wide characters were a
thing, but we're stuck with it.

The definition above:

>>>   typedef uint8_t byte;  // from arduino.h

is IMHO not ideal.  Various language rules taken together imply that
uint8_t *either* is exactly one byte *or* does not exist (if
CHAR_BIT>8), but unsigned char is directly specified to be exactly
one byte.

But it's system-specific, so I wouldn't worry about it or advocate
changing it.

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */