Path: csiph.com!news.swapon.de!eternal-september.org!feeder.eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: Keith Thompson <Keith.S.Thompson+u@gmail.com>
Newsgroups: comp.lang.c
Subject: Re: longer 'char literals' meaning in c
Date: Tue, 05 May 2020 20:06:04 -0700
Organization: None to speak of
Lines: 119
Message-ID: <878si5225f.fsf@nosuchdomain.example.com>
References: <a844dd76-5343-4845-a9d2-1d05d042baa0@googlegroups.com> <874kt061yl.fsf@nosuchdomain.example.com> <oNIqG.282039$QJp.69102@fx08.am4> <87tv104ij2.fsf@nosuchdomain.example.com> <4lKqG.254784$mk2.321@fx21.am4> <87lfmc4dwh.fsf@nosuchdomain.example.com> <r8qd3b$hmi$1@z-news.wcss.wroc.pl> <r8qntq$p8s$1@dont-email.me> <r8s6lq$k31$1@z-news.wcss.wroc.pl> <87h7wu15dv.fsf@nosuchdomain.example.com> <r8spca$q9$1@z-news.wcss.wroc.pl> <87d07i0y4r.fsf@nosuchdomain.example.com> <r8t7m9$8o1$1@z-news.wcss.wroc.pl>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="dcfcb51887ac7802667ee20ddc196f88"; logging-data="370"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18SVWTslQr+f1gV3g6g4hW9"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
Cancel-Lock: sha1:VCihVu3bi4my9xvrgNeNhSWpTew= sha1:+atyBOKR0E6b32qKcg9ohMtzzN0=
Xref: csiph.com comp.lang.c:152070

antispam@math.uni.wroc.pl writes:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
>> antispam@math.uni.wroc.pl writes:
>> > Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
>> >> antispam@math.uni.wroc.pl writes:
>> [...]
>> >> >                                        In particular,
>> >> > it seems that compiler is allowed to reject at compile
>> >> > time literals which otherwise would represent legal
>> >> > runtime values.
>> >> 
>> >> I don't believe that's correct, but I'm not entirely sure what you
>> >> mean.  Can you provide an example (even a hypothetical one)?
>> >
>> > Two hypothetical possibilities:
>> >
>> > - "broken" scanner, that could not produce all legal values.
>> >   I am not aware of any major compiler with such problem, but
>> >   some toy compilers had such problem.
>> >   Not likely on normal machines, where scanner must handle
>> >   minimal range of long long, but possible for machine
>> >   with 128-bit integer type, but 64-bit literals.
>> > - simple code generator for 32-bit machine with 16-bit immediates.
>> >   Such code generator may simply embed constant literals as
>> >   immediates and reject any integer literal that does not fit
>> >   in 16 bits.  Assuming no constant folding compiler still would
>> >   be able to produce values of 32-bit constant expressions.
>> >   Point is that turning literal into expression is a burden
>> >   on code generator and implementer may be tempted to say
>> >   that bigger literals are not implemented.  In the past
>> >   compiler had various crazy limits for no better reason.
>> 
>> Any such compiler would simply be non-conforming and buggy.
>> A conforming compiler is not "allowed" to reject constants that are
>> within the range of long long.  Equivalently, a compiler that does
>> so is not conforming.
>
> Well, so you say.  If compiler says "Out of memory" does it make
> it nonconforming?  Consider the following:
>
> 1) for simpler implementation compiler allocates too much memory
> 2) compiler uses undersized buffer/stack
> 3) compiler uses intermediate representation or code generation
>    method that is unable to represent some sensible programming
>    constructs

Sure, a compiler can always run out of memory and fail for that reason.
That's covered by N1570 1p2:

    This International Standard does not specify
    ...
    - the size or complexity of a program and its data that will exceed the
      capacity of any specific data-processing system or the capacity of a
      particular processor;
    ...

Any such capacity limits should of course be reasonable, but the
standard doesn't say so (since "reasonable" is nearly impossible to
define).

Consistently running out of memory while processing a constant like
2147483647 is not reasonable.  A compiler that does so might be
conforming as long as it can translate and execute the "one program"
described in 5.2.4.1.  It would have very poor QoI (Quality of
Implementation).  (BTW, there's no requirement for the "one program" to
be strictly conforming.)

It's no more or less reasonable than being unable to handle the string
literal "hello, world" or the identifier this_is_a_long_identifier
because it's too long.

Any competent compiler developer will be able to write code that can
correctly handle decimal integer constants up to 9223372036854775807
without falling over.
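
As a minimal sketch of what that guarantee looks like in practice (the
function names small_constant and large_constant are hypothetical,
chosen only for this illustration), the following translation unit
relies on nothing beyond the minimum ranges the standard requires of
long and long long, so any conforming C11 compiler should accept it:

```c
#include <assert.h>
#include <limits.h>

/* Both constants lie within the minimum ranges the standard guarantees:
   LONG_MAX is at least 2147483647 and LLONG_MAX is at least
   9223372036854775807.  These checks therefore hold everywhere. */
static_assert(2147483647 <= LONG_MAX, "fits in long");
static_assert(9223372036854775807 <= LLONG_MAX, "fits in long long");

/* Hypothetical helpers: each simply returns a decimal integer constant. */
long small_constant(void)
{
    return 2147483647;
}

long long large_constant(void)
{
    return 9223372036854775807;
}
```

A compiler that rejected either constant outright, rather than giving
it the first of int, long, or long long that can represent it, would
be failing for a reason other than a plausible capacity limit.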

[...]

>> > #define INT_MAX __int_max_val__
>> >
>> > where __int_max_val__ is really a variable, but compiler and
>> > preprocessor magic means that during translation it is treated
>> > as a constant.
>> 
>> Then it's a constant, isn't it?
>
> Yes.  The point is that this alone does not mean that literal
> 2147483647 can be handled, (even though if handled it would give
> the same value as constructs above).

No, that alone doesn't mean that the constant 2147483647 can be handled.
Other clauses of the standard ensure that (barring absurd capacity
limits).

[...]

>> Since there are no negative integer constants, defining INT_MIN is
>> *slightly* tricky.  With 16-bit int, this:
>>     #define INT_MIN (-32768)
>> is non-conforming.  Solutions are well known, for example:
>>     #define INT_MIN (-32767-1)
>> which meets the requirements of the standard.  What's the problem?
>
> Point is that INT_MIN naturally may be a special case.  If allowed
> to impose limit implementer may say that value of INT_MIN is outside
> implementation limits (for the specific construct that is under
> discussion).

I'm not sure what your point is here.  A conforming implementation
must define INT_MIN as a macro that expands to a constant expression
of type int suitable for use in a #if preprocessing directive.
It can do that any way it likes.  The vagaries of integer constants
mean that it's not *quite* as straightforward as defining INT_MAX,
but it's a solved problem.  No actual implementation would fail to
define INT_MIN and INT_MAX correctly.
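
Here is a rough sketch of why the subtraction spelling is the usual
answer.  It assumes a C11 compiler, and the checks on the 32-bit
spellings are guarded so they apply only where int is 32 bits.  IS_INT
and int_min_spelling_works_in_if are hypothetical names invented for
this illustration:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical macro: 1 if the expression has type int, 0 otherwise. */
#define IS_INT(x) _Generic((x), int: 1, default: 0)

/* Whatever spelling the implementation uses, INT_MIN must have type int. */
static_assert(IS_INT(INT_MIN), "INT_MIN has type int");

#if INT_MAX == 2147483647 /* assumption: 32-bit int */
/* 2147483648 does not fit in int, so the constant gets a wider type
   before the unary minus is applied. */
static_assert(!IS_INT(-2147483648), "naive spelling is not an int");
/* 2147483647 fits in int, and int minus int is still int. */
static_assert(IS_INT(-2147483647 - 1), "subtraction spelling is an int");
static_assert(-2147483647 - 1 == INT_MIN, "and it has the right value");
#endif

/* The subtraction spelling is also usable in a #if directive. */
#if (-32767 - 1) < 0
int int_min_spelling_works_in_if(void)
{
    return 1;
}
#endif
```

The 16-bit case in the quoted text works the same way: 32768 does not
fit in a 16-bit int, so (-32768) would not have type int, while
(-32767-1) does.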

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */