Re: Add @ to basic character set?

From Keith Thompson <Keith.S.Thompson+u@gmail.com>
Newsgroups comp.std.c
Subject Re: Add @ to basic character set?
Date 2020-12-07 13:10 -0800
Organization None to speak of
Message-ID <87r1o1e2zg.fsf@nosuchdomain.example.com>
References (7 earlier) <D9dzH.185712$ql4.125078@fx39.iad> <87czzmfqb3.fsf@nosuchdomain.example.com> <4apzH.236649$xe4.230701@fx41.iad> <874kkxfk35.fsf@nosuchdomain.example.com> <rBwzH.225$y74.202@fx36.iad>


Richard Damon <Richard@Damon-Family.org> writes:
> On 12/7/20 3:16 PM, Keith Thompson wrote:
>> Richard Damon <Richard@Damon-Family.org> writes:
>>> On 12/6/20 6:49 PM, Keith Thompson wrote:
>>>> Richard Damon <Richard@Damon-Family.org> writes:
>>>>> On 12/6/20 5:07 PM, Keith Thompson wrote:
>>>>>> Richard Damon <Richard@Damon-Family.org> writes:
>>>>>> [...]
>>>>>>> The issue with making them part of the basic character set is that it
>>>>>>> makes any system that can't do this, because it uses a strange character
>>>>>>> set, non-conforming. Since systems ARE allowed to add any characters
>>>>>>> they want to the source or execution character set, those that currently
>>>>>>> support them can do so. Forcing them to be included prevents some systems
>>>>>>> from having a conforming implementation, and the committee
>>>>>>> has traditionally avoided gratuitously making systems non-conforming.
>>>>>>
>>>>>> (Context: The ASCII characters '@', '$', and '`'.)
>>>>>>
>>>>>> I'd be interested in seeing an implementation for which this would
>>>>>> be relevant.  Such an implementation (a) would be unable to (easily)
>>>>>> represent those three characters in source code and/or during
>>>>>> execution *and* (b) would otherwise conform to the hypothetical
>>>>>> edition of the C standard that would add them to the basic character
>>>>>> set if it were not for this change.
>>>>>
>>>>> As was mentioned, all that you need is to want to support ISO/IEC 646
>>>>> for a national character set that doesn't define code point 64 as @.
>>>>>
>>>>> This includes Canadian, French, German, Irish, and a number of others.
>>>>>
>>>>> See https://en.wikipedia.org/wiki/ISO/IEC_646 for a chart of these.
>>>>
>>>> What C implementations support those character sets (and are likely to
>>>> attempt to conform to a future C standard that adds '@' to the basic
>>>> character set)?
>>>
>>> gcc (and many others) with the right choice of file encoding options.
>>> The key point here is that this change would be telling a number of
>>> national bodies that their whole national character set (and thus in
>>> some respects their language) will no longer be supported.
>> 
>> OK.  Can you explain precisely how to invoke gcc with the right choice
>> of file encoding options?  I've found this option in the gcc manual:
>> 
>> '-finput-charset=CHARSET'
>>      Set the input character set, used for translation from the
>>      character set of the input file to the source character set used by
>>      GCC.  If the locale does not specify, or GCC cannot get this
>>      information from the locale, the default is UTF-8.  This can be
>>      overridden by either the locale or this command-line option.
>>      Currently the command-line option takes precedence if there's a
>>      conflict.  CHARSET can be any encoding supported by the system's
>>      'iconv' library routine.
>> 
>> but I had never used it.
>> 
>> I just used "iconv -l" to get what I presume is a list of valid CHARSET
>> values (there are over 1000 of them), which led me to this:
>> 
>>     gcc -std=c11 -pedantic-errors -finput-charset=ISO646-FR -c c.c
>> 
>> With this source file:
>> 
>>     #include <stdio.h>
>>     int main(void) {
>>         puts("$@`");
>>     }
>> 
>> it produced a cascade of errors, starting with:
>> 
>>     In file included from <command-line>:31:
>>     /usr/include/stdc-predef.h:18:1: error: stray ‘\302’ in program
>>        18 | #ifndef _STDC_PREDEF_H
>>           | ^
>> 
>> It looks like something translated the # character to \302 (0xc2).
>> I have no idea why.  (And it didn't complain about "$@`".)
>> 
>> If there's a way to invoke gcc telling it to use a character set that
>> doesn't include those characters, that would be a good refutation
>> to my point.  If doing so is actually useful in some contexts,
>> it would be an even better refutation.  So far I'm not convinced,
>> but I'm prepared to be.
>> 
>> My impression is that the old 7-bit national character sets are
>> no longer relevant, and that dropping support for them in the
>> C standard (more precisely, updating the C standard in a manner
>> that's inconsistent with those character sets) would be very nearly
>> harmless.  I'm looking for evidence that that's not the case.
>> 
>> [...]
>> 
>
> One problem is that file is NOT compatible with ISO646-FR as the '#'
> character in it would not be a HashTag (or Pound Sign), but would be the
> character £ which is illegal in C. It is one of the encodings that NEEDS
> the trigraphs or digraphs in the files to use C.

The first file it complains about, /usr/include/stdc-predef.h,
is part of the implementation (specifically part of glibc).
Either the implementation doesn't support ISO646-FR, or there's
some configuration I would need to perform to make it support it.
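
A note on the '\302' in the diagnostic quoted above: if the '#' byte
(0x23) is read as ISO646-FR it is presumably taken to be '£' (U+00A3),
and the UTF-8 encoding of U+00A3 begins with 0xC2, which is 302 in
octal.  The sketch below only illustrates that arithmetic.  The
function name utf8_encode_small is invented for this example, and
whether gcc's conversion actually follows this path is an assumption,
not something verified here.

```c
#include <stddef.h>

/* Encode a code point below U+0800 as UTF-8.
   Writes one or two bytes to out and returns how many were written.
   (Hypothetical helper; larger code points are out of scope.) */
size_t utf8_encode_small(unsigned int cp, unsigned char out[2])
{
    if (cp < 0x80) {
        /* 7-bit values are encoded as themselves. */
        out[0] = (unsigned char)cp;
        return 1;
    }
    /* Two-byte form: 110xxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xC0 | (cp >> 6));
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));
    return 2;
}
```

Calling utf8_encode_small(0xA3, buf) stores 0xC2 and 0xA3 in buf, and
0xC2 printed as an octal escape is \302, which would match the first
stray byte reported.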

I'd still be interested in seeing an existing implementation that
does support ISO646-FR or something similar, and that would become
non-conforming if '@' were made part of the basic character set.
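
To make concrete which bytes are at stake, here is a minimal sketch
that flags the code positions ISO/IEC 646 leaves open to national
variation, going by the chart on the Wikipedia page cited earlier in
the thread.  The list of twelve positions is taken from that chart as
I read it, and both function names are invented for this example.
Code positions are written in hex so the check does not depend on the
compiler's own character set.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if the 7-bit code position is one that ISO/IEC 646 national
   variants may assign differently (assumption: the usual list of
   twelve, which in ASCII are # $ @ [ \ ] ^ ` { | } ~). */
bool iso646_national_position(unsigned char code)
{
    switch (code) {
    case 0x23: case 0x24: case 0x40:
    case 0x5B: case 0x5C: case 0x5D: case 0x5E: case 0x60:
    case 0x7B: case 0x7C: case 0x7D: case 0x7E:
        return true;
    default:
        return false;
    }
}

/* Count the bytes in a buffer that fall on those positions. */
size_t count_national_positions(const unsigned char *bytes, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (iso646_national_position(bytes[i]))
            count++;
    }
    return count;
}
```

Of those twelve positions, nine have trigraph spellings in C; the
remaining three (0x24, 0x40, 0x60) are the '$', '@', and '`' under
discussion.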

I recognize that the burden of proof is on any proposal to make a
change to the standard, but so far I've seen no evidence that such a
change would actually break anything (at least anything that isn't
already broken).

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
