From: Tim Rentsch
Newsgroups: comp.lang.c
Subject: Re: C vs Haskell for XML parsing
Date: Wed, 30 Aug 2023 21:09:06 -0700
Organization: A noiseless patient Spider
Message-ID: <86il8vua3h.fsf@linuxsc.com>

Keith Thompson writes:

> Tim Rentsch writes:
>
>> Keith Thompson writes:
>>
>>> David Brown writes:
>>> [...]
>>>
>>>> Being able to accept $ in identifiers is a convenient extension.
>>>
>>> Quibble: $ in identifiers is not an extension as specified in section 4
>>> of the standard. Starting in C99, the set of characters accepted in
>>> identifiers is implementation-defined. (I'm not sure what difference
>>> that makes.)
>>
>> In Annex J, J.5.2 gives adding $ to the set of characters that
>> can appear in identifiers as an example of a common extension.
>
> C90 says the same thing in its Annex G. Looks like they didn't update
> it when C99 updated the syntax for an identifier.

Certainly it's true that this passage wasn't changed.
That doesn't mean it wasn't reviewed; it may have been
deliberately left in. There have been lots of opportunities
to change it, and it hasn't been changed yet.

> (Though I suppose an
> implementation could accept $ in identifiers either as an extension or
> as an "other implementation-defined character".)

Yes, no question about it. Furthermore implementors might make
one choice or the other, depending on how they think of the
character(s) in question. For myself, I expect I would naturally
think of adding characters from another alphabet in the
implementation-defined sense, but think of adding $ in the
extension sense. Having both options has some value, in terms
of conveying mindset.