


The difference between strtol() and strtoul() ?

Started by: gazelle@shell.xmission.com (Kenny McCormack)
First post: 2024-06-20 14:06 +0000
Last post: 2024-06-23 17:39 +0100



Contents

  The difference between strtol() and strtoul() ? gazelle@shell.xmission.com (Kenny McCormack) - 2024-06-20 14:06 +0000
    Re: The difference between strtol() and strtoul() ? scott@slp53.sl.home (Scott Lurndal) - 2024-06-20 14:46 +0000
      Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-20 14:37 -0700
    Re: The difference between strtol() and strtoul() ? Lew Pitcher <lew.pitcher@digitalfreehold.ca> - 2024-06-20 14:48 +0000
    Re: The difference between strtol() and strtoul() ? Lew Pitcher <lew.pitcher@digitalfreehold.ca> - 2024-06-20 15:26 +0000
    Re: The difference between strtol() and strtoul() ? Kaz Kylheku <643-408-1753@kylheku.com> - 2024-06-20 22:55 +0000
      Re: The difference between strtol() and strtoul() ? gazelle@shell.xmission.com (Kenny McCormack) - 2024-06-20 23:35 +0000
    Re: The difference between strtol() and strtoul() ? gazelle@shell.xmission.com (Kenny McCormack) - 2024-06-21 13:58 +0000
      Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-21 18:28 +0300
        Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-21 18:53 +0300
          Re: The difference between strtol() and strtoul() ? scott@slp53.sl.home (Scott Lurndal) - 2024-06-21 16:14 +0000
            Re: The difference between strtol() and strtoul() ? scott@slp53.sl.home (Scott Lurndal) - 2024-06-21 16:54 +0000
              Re: The difference between strtol() and strtoul() ? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-22 06:44 +0000
                Re: The difference between strtol() and strtoul() ? scott@slp53.sl.home (Scott Lurndal) - 2024-06-22 15:16 +0000
                  Re: The difference between strtol() and strtoul() ? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-22 23:21 +0000
                  Re: The difference between strtol() and strtoul() ? James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-06-22 20:10 -0400
          Re: The difference between strtol() and strtoul() ? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-21 18:15 +0100
            Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-23 12:19 +0300
              Re: The difference between strtol() and strtoul() ? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-23 12:38 +0100
                Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-23 15:32 +0300
                  Re: The difference between strtol() and strtoul() ? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-23 16:30 +0100
                    Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-23 18:47 +0300
                    Re: The difference between strtol() and strtoul() ? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-23 10:58 -0700
                      Re: The difference between strtol() and strtoul() ? scott@slp53.sl.home (Scott Lurndal) - 2024-06-23 21:19 +0000
                        Re: The difference between strtol() and strtoul() ? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-23 22:28 -0700
                      Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-23 16:01 -0700
                        Re: The difference between strtol() and strtoul() ? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-24 00:49 +0100
                          Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-23 17:49 -0700
                          Re: The difference between strtol() and strtoul() ? Kaz Kylheku <643-408-1753@kylheku.com> - 2024-06-24 02:29 +0000
                            Re: The difference between strtol() and strtoul() ? Kaz Kylheku <643-408-1753@kylheku.com> - 2024-06-24 02:31 +0000
                              Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-23 20:12 -0700
                                Re: The difference between strtol() and strtoul() ? Kaz Kylheku <643-408-1753@kylheku.com> - 2024-06-24 06:05 +0000
                            Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-23 20:11 -0700
                              Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-24 13:19 +0300
                        Re: The difference between strtol() and strtoul() ? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-23 22:30 -0700
                    Re: The difference between strtol() and strtoul() ? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-24 00:48 +0000
          Re: The difference between strtol() and strtoul() ? James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-06-21 14:38 -0400
            Re: The difference between strtol() and strtoul() ? gazelle@shell.xmission.com (Kenny McCormack) - 2024-06-21 18:43 +0000
              Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-23 11:47 +0300
            Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-22 21:18 +0300
        Re: The difference between strtol() and strtoul() ? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-21 18:02 +0100
          Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-21 10:38 -0700
            Re: The difference between strtol() and strtoul() ? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-22 06:43 +0000
            Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-22 21:04 +0300
              Re: The difference between strtol() and strtoul() ? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-22 23:22 +0000
              Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-22 16:43 -0700
      Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-21 19:00 +0300
        Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-21 10:50 -0700
      Re: The difference between strtol() and strtoul() ? Kaz Kylheku <643-408-1753@kylheku.com> - 2024-06-22 22:07 +0000
    Re: The difference between strtol() and strtoul() ? Richard Kettlewell <invalid@invalid.invalid> - 2024-06-23 17:39 +0100



#386392

From: Ben Bacarisse <ben@bsb.me.uk>
Date: 2024-06-23 16:30 +0100
Message-ID: <87jzifpth6.fsf@bsb.me.uk>
In reply to: #386385

Michael S <already5chosen@yahoo.com> writes:

> On Sun, 23 Jun 2024 12:38:51 +0100
> Ben Bacarisse <ben@bsb.me.uk> wrote:
>
>> Michael S <already5chosen@yahoo.com> writes:
>> 
>> > On Fri, 21 Jun 2024 18:15:07 +0100
>> > Ben Bacarisse <ben@bsb.me.uk> wrote:
>> >  
>> >> Michael S <already5chosen@yahoo.com> writes:
>> >>   
>> >> > On Fri, 21 Jun 2024 18:28:39 +0300
>> >> > Michael S <already5chosen@yahoo.com> wrote:
>> >> >    
>> >> >> On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
>> >> >> gazelle@shell.xmission.com (Kenny McCormack) wrote:    
>> >> >> > 
>> >> >> > Yeah, now I get it.  You really only need strtoimax() and
>> >> >> > strtoumax().     
>> >> >> 
>> >> >> Which are, unfortunately, not part of the C standard.
>> >> >>     
>> >> >> > A result of any smaller type can be obtained by calling one of
>> >> >> > these functions and storing the result in an object of the
>> >> >> > smaller type.   
>> >> >> 
>> >> >> Or check for range and handle out of range values as
>> >> >> appropriate by situation.    
>> >> >
>> >> > BTW, I don't know what The Standard says about out-of-range
>> >> > inputs, but at least
>> >> > https://en.cppreference.com/w/c/string/byte/strtol does not say
>> >> > anything certain, especially about what is stored in *str_end.
>> >> 
>> >> It says what value should be returned.  That's something certain!
>> >>  
>> >
>> > In case of strtol, yes. 
>> > In case of strtoul it also says what value should be returned, but
>> > plain reading of cppreference.com text (at least *my* plain reading)
>> > does not match observed behaviour. The text on cppreference.com
>> > resembles Standard text, but does not match it.  
>> 
>> Ah.  What's the discrepancy you see?
>
> IMHO, the Standard text allows for more interpretations (and
> misinterpretations) than the cppreference.com text.

I was hoping for an example.  As I've used these functions for decades,
I find it hard to see where the alternative interpretations might lie.

>> > Also, at least to me, the Standard text itself appears very far from
>> > clear and way too open to interpretation.
>> > My own interpretation would be that for any negative input strtoul()
>> > should return ULONG_MAX and set errno to ERANGE. None of the actual
>> > implementations that I tested behaves in this manner.
>> 
>> I don't get that from the text.  There is, after all, no "negative
>> input".  There is a "subject sequence" which, if it starts with a
>> minus sign, causes the "value resulting from the conversion is
>> negated (in the return type)" which seems clear enough.
>
> I find it less than clear.
> The most non-clear part is that for strtouxx() as long as "subject
> sequence" is in range,

I think it helps to be precise here: the subject sequence has to be of
the right form, not in the right range.

> it is first converted and then negated.  However
> when  "subject sequence" is out of range it is converted, then clipped
> and then *not* negated.

If the conversion (before negation) is out of range the result will be
ULONG_MAX and errno will be set to ERANGE.  Calling this "clipping" is
possibly confusing.  For what it's worth, I'm just describing what
happens.  I am not saying it is crystal clear.

I think there /is/ something problematic with the wording about the
negation.  It happens "in the return type" but how can
9223372036854775808 be negated in the type long long int?  OK, the
negated value can be /represented/ in the type long long int but that's
not quite the same thing.  On the other hand, for the unsigned return
types, the negation "in the return type" is what produces ULONG_MAX for
"-1" when the negated value, -1, can't be /represented/ in the return
type.  It's a case where, over the years, I've just got used to what's
happening.
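
The two strtoul() cases described above can be probed with a few lines of
C.  This is only a sketch, assuming a conforming hosted implementation;
the helper name probe_strtoul is made up for illustration.  It records
whether the call left errno set to ERANGE, so a ULONG_MAX that comes from
negating 1 "in the return type" can be told apart from a ULONG_MAX that
signals an out-of-range conversion.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>   /* ULONG_MAX, for callers comparing the result */
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper (not from the standard or the thread): converts s
   with strtoul() in base 10 and reports through *range_error whether
   the call set errno to ERANGE. */
static unsigned long probe_strtoul(const char *s, bool *range_error)
{
    assert(s != NULL && range_error != NULL);
    errno = 0;                          /* strtoul() never clears errno */
    unsigned long v = strtoul(s, NULL, 10);
    *range_error = (errno == ERANGE);
    return v;
}
```

With this, "-1" should give ULONG_MAX with no range error, while a digit
string too long for unsigned long should give ULONG_MAX with the range
error flagged, whether or not it has a leading minus sign.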

-- 
Ben.



#386393

From: Michael S <already5chosen@yahoo.com>
Date: 2024-06-23 18:47 +0300
Message-ID: <20240623184710.00003c36@yahoo.com>
In reply to: #386392

On Sun, 23 Jun 2024 16:30:13 +0100
Ben Bacarisse <ben@bsb.me.uk> wrote:

> Michael S <already5chosen@yahoo.com> writes:
> 
> 
> As I've used these functions for
> decades, I find it hard to see where the alternative interpretations
> might lie.
>

I have also used them for decades, but until last Thursday I never paid
attention to what happens when they are fed out-of-range inputs.
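
For anyone who wants to look at the out-of-range case directly, here is a
small sketch.  It assumes a conforming hosted implementation, and the
helper name strtol_rest is invented for this post.  It captures the three
things that matter: the returned value, the errno left behind, and where
the end pointer lands.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>   /* LONG_MAX, LONG_MIN, for callers checking the value */
#include <stdlib.h>

/* Hypothetical helper: runs strtol() in base 10, stores the result and
   the errno value it left behind, and returns a pointer to the first
   character that was not consumed. */
static const char *strtol_rest(const char *s, long *value, int *err)
{
    assert(s != NULL && value != NULL && err != NULL);
    char *end;
    errno = 0;                    /* strtol() never clears errno itself */
    *value = strtol(s, &end, 10);
    *err = errno;
    return end;
}
```

As I read the standard, the subject sequence is determined by its form
alone, so for an input like a long run of 9s followed by "xyz" the value
should be LONG_MAX, errno should be ERANGE, and the returned pointer
should still point at the 'x'.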



#386396

From: Tim Rentsch <tr.17687@z991.linuxsc.com>
Date: 2024-06-23 10:58 -0700
Message-ID: <864j9jh77d.fsf@linuxsc.com>
In reply to: #386392

Ben Bacarisse <ben@bsb.me.uk> writes:

[range questions for strtol(), etc]

> I think there /is/ something problematic with the wording about the
> negation.  It happens "in the return type" but how can
> 9223372036854775808 be negated in the type long long int?  OK, the
> negated value can be /represented/ in the type long long int but that's
> not quite the same thing.  On the other hand, for the unsigned return
> types, the negation "in the return type" is what produces ULONG_MAX for
> "-1" when the negated value, -1, can't be /represented/ in the return
> type.  It's a case where, over the years, I've just got used to what's
> happening.

I understand what these functions do, but their specification in the
C standard is a little off.  To my way of thinking the impact is
minimal, but the specified behavior is either unequivocally wrong or
there are some cases that give rise to undefined behavior.



#386402

From: scott@slp53.sl.home (Scott Lurndal)
Date: 2024-06-23 21:19 +0000
Message-ID: <Xj0eO.6172$ZwRb.2109@fx38.iad>
In reply to: #386396

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>Ben Bacarisse <ben@bsb.me.uk> writes:
>
>[range questions for strtol(), etc]
>
>> I think there /is/ something problematic with the wording about the
>> negation.  It happens "in the return type" but how can
>> 9223372036854775808 be negated in the type long long int?  OK, the
>> negated value can be /represented/ in the type long long int but that's
>> not quite the same thing.  On the other hand, for the unsigned return
>> types, the negation "in the return type" is what produces ULONG_MAX for
>> "-1" when the negated value, -1, can't be /represented/ in the return
>> type.  It's a case where, over the years, I've just got used to what's
>> happening.
>
>I understand what these functions do, but their specification in the
>C standard is a little off.  To my way of thinking the impact is
>minimal, but the specified behavior is either unequivocally wrong or
>there are some cases that give rise to undefined behavior.

I think you're both overthinking it.



#386436

From: Tim Rentsch <tr.17687@z991.linuxsc.com>
Date: 2024-06-23 22:28 -0700
Message-ID: <86r0cmgb96.fsf@linuxsc.com>
In reply to: #386402

scott@slp53.sl.home (Scott Lurndal) writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Ben Bacarisse <ben@bsb.me.uk> writes:
>>
>> [range questions for strtol(), etc]
>>
>>> I think there /is/ something problematic with the wording about the
>>> negation.  It happens "in the return type" but how can
>>> 9223372036854775808 be negated in the type long long int?  OK, the
>>> negated value can be /represented/ in the type long long int but that's
>>> not quite the same thing.  On the other hand, for the unsigned return
>>> types, the negation "in the return type" is what produces ULONG_MAX for
>>> "-1" when the negated value, -1, can't be /represented/ in the return
>>> type.  It's a case where, over the years, I've just got used to what's
>>> happening.
>>
>> I understand what these functions do, but their specification in the
>> C standard is a little off.  To my way of thinking the impact is
>> minimal, but the specified behavior is either unequivocally wrong or
>> there are some cases that give rise to undefined behavior.
>
> I think you're both overthinking it.

You aren't saying anything.  Do you have something to
say that actually has positive information content?



#386411

From: Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date: 2024-06-23 16:01 -0700
Message-ID: <87r0cntga9.fsf@nosuchdomain.example.com>
In reply to: #386396

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> Ben Bacarisse <ben@bsb.me.uk> writes:
> [range questions for strtol(), etc]
>
>> I think there /is/ something problematic with the wording about the
>> negation.  It happens "in the return type" but how can
>> 9223372036854775808 be negated in the type long long int?  OK, the
>> negated value can be /represented/ in the type long long int but that's
>> not quite the same thing.  On the other hand, for the unsigned return
>> types, the negation "in the return type" is what produces ULONG_MAX for
>> "-1" when the negated value, -1, can't be /represented/ in the return
>> type.  It's a case where, over the years, I've just got used to what's
>> happening.
>
> I understand what these functions do, but their specification in the
> C standard is a little off.  To my way of thinking the impact is
> minimal, but the specified behavior is either unequivocally wrong or
> there are some cases that give rise to undefined behavior.

Can you give an example where the specified behavior causes undefined
behavior?

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */



#386417

From: Ben Bacarisse <ben@bsb.me.uk>
Date: 2024-06-24 00:49 +0100
Message-ID: <87o77rnrt2.fsf@bsb.me.uk>
In reply to: #386411

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>> Ben Bacarisse <ben@bsb.me.uk> writes:
>> [range questions for strtol(), etc]
>>
>>> I think there /is/ something problematic with the wording about the
>>> negation.  It happens "in the return type" but how can
>>> 9223372036854775808 be negated in the type long long int?  OK, the
>>> negated value can be /represented/ in the type long long int but that's
>>> not quite the same thing.  On the other hand, for the unsigned return
>>> types, the negation "in the return type" is what produces ULONG_MAX for
>>> "-1" when the negated value, -1, can't be /represented/ in the return
>>> type.  It's a case where, over the years, I've just got used to what's
>>> happening.
>>
>> I understand what these functions do, but their specification in the
>> C standard is a little off.  To my way of thinking the impact is
>> minimal, but the specified behavior is either unequivocally wrong or
>> there are some cases that give rise to undefined behavior.
>
> Can you give an example where the specified behavior causes undefined
> behavior?

I don't want to pre-empt Tim's answer, but the wording that bothers me
is

  "If the subject sequence begins with a minus sign, the value resulting
  from the conversion is negated (in the return type)."

For strtoll("-9223372036854775808", 0, 0) the value resulting from the
conversion is 9223372036854775808 which can not even be represented in
the return type, so how can it be negated "in the return type"?

-- 
Ben.



#386426

From: Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date: 2024-06-23 17:49 -0700
Message-ID: <87ikxztbb6.fsf@nosuchdomain.example.com>
In reply to: #386417

Ben Bacarisse <ben@bsb.me.uk> writes:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>> Ben Bacarisse <ben@bsb.me.uk> writes:
>>> [range questions for strtol(), etc]
>>>
>>>> I think there /is/ something problematic with the wording about the
>>>> negation.  It happens "in the return type" but how can
>>>> 9223372036854775808 be negated in the type long long int?  OK, the
>>>> negated value can be /represented/ in the type long long int but that's
>>>> not quite the same thing.  On the other hand, for the unsigned return
>>>> types, the negation "in the return type" is what produces ULONG_MAX for
>>>> "-1" when the negated value, -1, can't be /represented/ in the return
>>>> type.  It's a case where, over the years, I've just got used to what's
>>>> happening.
>>>
>>> I understand what these functions do, but their specification in the
>>> C standard is a little off.  To my way of thinking the impact is
>>> minimal, but the specified behavior is either unequivocally wrong or
>>> there are some cases that give rise to undefined behavior.
>>
>> Can you give an example where the specified behavior causes undefined
>> behavior?
>
> I don't want to pre-empt Tim's answer, but the wording that bothers me
> is
>
>   "If the subject sequence begins with a minus sign, the value resulting
>   from the conversion is negated (in the return type)."
>
> For strtoll("-9223372036854775808", 0, 0) the value resulting from the
> conversion is 9223372036854775808 which can not even be represented in
> the return type, so how can it be negated "in the return type"?

Understanding the significance of your example requires recognizing
that number, which I didn't immediately.

I'll assume in the following that long long and intmax_t are 64 bits,
2's-complement, no padding bits.

9223372036854775808 is 2**63, and is mathematically equal to
LLONG_MAX+1.

-9223372036854775808 is mathematically equal to LLONG_MIN,
but the behavior of the strtoll() call is specified in
terms of computing 9223372036854775808 (outside the range of
long long) and then negating it.  It's obvious (I think) that
strtoll("-9223372036854775808", 0, 0) *should* return LLONG_MIN and
not set errno to ERANGE (which is what every implementation I've
tried does), but the way the standard describes it involves a semantically
impossible operation.

-9223372036854775808 is the mathematical value of LLONG_MIN, but
it's not a valid C expression (so <limits.h> typically has to use
some workaround like (-LLONG_MAX-1)) -- but we expect strtoll to
handle it in the obvious way.

Beyond this example, the wording is also problematic
for out-of-range values with a leading '-' sign, such as
strtoll("-9999999999999999999", 0, 0).  The result should be
LLONG_MIN with errno==ERANGE, but again the standard says "the value
resulting from the conversion is negated (in the return type)",
which is not actually possible.  The same applies to strtoull().

It's not surprising that implementers have inferred the intent
even if the standard doesn't precisely state it.  Still, I'd like
to see the wording made more precise.
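
Here is a small sketch of the boundary cases discussed above, under the
same assumption of a 64-bit two's-complement long long.  The enum and the
helper name classify_strtoll are made up for illustration; the function
only reports what strtoll() itself says about the input.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

enum range_status { IN_RANGE, BELOW_MIN, ABOVE_MAX };

/* Hypothetical helper: converts s with strtoll() in base 10, stores the
   result in *value, and classifies the outcome from errno and the
   returned limit value. */
static enum range_status classify_strtoll(const char *s, long long *value)
{
    assert(s != NULL && value != NULL);
    errno = 0;                       /* strtoll() never clears errno */
    *value = strtoll(s, NULL, 10);
    if (errno != ERANGE)
        return IN_RANGE;
    /* On a range error the result is LLONG_MIN or LLONG_MAX,
       according to the sign of the correct value. */
    return (*value == LLONG_MIN) ? BELOW_MIN : ABOVE_MAX;
}
```

On the implementations I would expect to see, "-9223372036854775808"
comes back as IN_RANGE with the value LLONG_MIN, while
"-9223372036854775809" and "-9999999999999999999" come back as BELOW_MIN.
Returning LLONG_MIN alone is therefore ambiguous; errno has to be checked.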

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */



#386429

From: Kaz Kylheku <643-408-1753@kylheku.com>
Date: 2024-06-24 02:29 +0000
Message-ID: <20240623191452.334@kylheku.com>
In reply to: #386417

On 2024-06-23, Ben Bacarisse <ben@bsb.me.uk> wrote:
> I don't want to pre-empt Tim's answer, but the wording that bothers me
> is
>
>   "If the subject sequence begins with a minus sign, the value
>   resulting from the conversion is negated (in the return type)."
>
> For strtoll("-9223372036854775808", 0, 0) the value resulting from the
> conversion is 9223372036854775808 which can not even be represented in
> the return type, so how can it be negated "in the return type"?

We have to trust that the specification wants the functions to perform
error checking, rather than precipitate into undefined behavior or
implementation-defined results.

If the negation, which is a positive value, cannot be represented in the
type, that implies it is out of range. The required behavior for a
positive out-of-range value is to return LLONG_MAX and set errno to
ERANGE.

The "in the return type" wording sounds like it may be written that way
to cover the unsigned case, strtoull.

I see in the N3220 draft that the signed and unsigned functions are
lumped together and the wording is now:

"If the subject sequence begins with a minus sign, the resulting value
is the negative of the converted value; for functions whose return type
is an unsigned integer type this action is performed in the return
type."


-- 
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca



#386430

From: Kaz Kylheku <643-408-1753@kylheku.com>
Date: 2024-06-24 02:31 +0000
Message-ID: <20240623192959.369@kylheku.com>
In reply to: #386429

On 2024-06-24, Kaz Kylheku <643-408-1753@kylheku.com> wrote:
> If the negation, which is a positive value, cannot be represented in the
> type, that implies it is out of range. The required behavior for a
> positive out-of-range value is to return LLONG_MAX and set errno to
> ERANGE.

Errr, what am I saying! The negation, which is a negative value,
cannot be represented in the type, so the required behavior is to
return LLONG_MIN and set errno to negative.

-- 
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca



#386435

From: Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date: 2024-06-23 20:12 -0700
Message-ID: <87a5jbt4o7.fsf@nosuchdomain.example.com>
In reply to: #386430

Kaz Kylheku <643-408-1753@kylheku.com> writes:
> On 2024-06-24, Kaz Kylheku <643-408-1753@kylheku.com> wrote:
>> If the negation, which is a positive value, cannot be represented in the
>> type, that implies it is out of range. The required behavior for a
>> positive out-of-range value is to return LLONG_MAX and set errno to
>> ERANGE.
>
> Errr, what am I saying! The negation, which is a negative value,
> cannot be represented in the type, so the required behavior is to
> return LLONG_MIN and set errno to negative.

You mean "and set errno to ERANGE".

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */



#386438

From: Kaz Kylheku <643-408-1753@kylheku.com>
Date: 2024-06-24 06:05 +0000
Message-ID: <20240623230509.919@kylheku.com>
In reply to: #386435

On 2024-06-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
> Kaz Kylheku <643-408-1753@kylheku.com> writes:
>> On 2024-06-24, Kaz Kylheku <643-408-1753@kylheku.com> wrote:
>>> If the negation, which is a positive value, cannot be represented in the
>>> type, that implies it is out of range. The required behavior for a
>>> positive out-of-range value is to return LLONG_MAX and set errno to
>>> ERANGE.
>>
>> Errr, what am I saying! The negation, which is a negative value,
>> cannot be represented in the type, so the required behavior is to
>> return LLONG_MIN and set errno to negative.
>
> You mean "and set errno to ERANGE".

Once you screw up and start correcting yourself, there is no end
to the long tail of erors.

-- 
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca



#386434

From: Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date: 2024-06-23 20:11 -0700
Message-ID: <87ed8nt4qa.fsf@nosuchdomain.example.com>
In reply to: #386429

Kaz Kylheku <643-408-1753@kylheku.com> writes:
> On 2024-06-23, Ben Bacarisse <ben@bsb.me.uk> wrote:
>> I don't want to pre-empt Tim's answer, but the wording that bothers me
>> is
>>
>>   "If the subject sequence begins with a minus sign, the value
>>   resulting from the conversion is negated (in the return type)."
>>
>> For strtoll("-9223372036854775808", 0, 0) the value resulting from the
>> conversion is 9223372036854775808 which can not even be represented in
>> the return type, so how can it be negated "in the return type"?
>
> We have to trust that the specification wants the functions to perform
> error checking, rather than precipitate into undefined behavior or
> implementation-defined results.
>
> If the negation, which is a positive value, cannot be represented in the
> type, that implies it is out of range. The required behavior for a
> positive out-of-range value is to return LLONG_MAX and set errno to
> ERANGE.
>
> The "in the return type" wording sounds like it may be written that way
> to cover the unsigned case, strtoull.
>
> I see in the N3220 draft that the signed and unsigned functions are
> lumped together and the wording is now:
>
> "If the subject sequence begins with a minus sign, the resulting value
> is the negative of the converted value; for functions whose return type
> is an unsigned integer type this action is performed in the return
> type."

I should have checked the C23 draft before.  I see that the wording has
been improved.

(Note that N3220 is actually an early draft of C26.  The latest public
pre-C23 draft is N3096.  The content should be very close; I don't
believe N3220 includes any post-C23 proposed changes.)

It's fairly clear that the "value" referred to in the quoted text is a
mathematical value, which might be outside the representable range of
any C type.  The paragraph describing the returned value confirms this:
"If the correct value is outside the range of representable values ...".

So for strtoll("-9223372036854775808", NULL, 10) the "converted value"
of 9223372036854775808 exceeds LLONG_MAX, but that's ok.  That value is
negated (mathematically) yielding -9223372036854775808, which is equal
to LLONG_MIN.

There's still some ambiguity for strtoull("-9999999999999999999", NULL,
10) (that's well outside the range of a 64-bit integer).  For that to
work as expected, we have to assume that the determination that "the
correct value is outside the range of representable values" happens
*before* the negation "is performed in the return type".  It's not clear
that this problem is worth fixing (doing so would likely make that
section longer and perhaps more confusing).
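
To make the assumed ordering concrete, here is a deliberately simplified
sketch of a base-10 conversion that checks the range of the magnitude
first and only then negates in the return type.  Everything about it is
an assumption for illustration: the name sketch_strtoull10 is invented,
it skips no whitespace, takes no base or end pointer, and it is not how
any particular library implements strtoull().

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for strtoull(s, NULL, 10). */
static unsigned long long sketch_strtoull10(const char *s)
{
    assert(s != NULL);

    bool negative = false;
    if (*s == '+' || *s == '-')
        negative = (*s++ == '-');

    unsigned long long magnitude = 0;
    bool overflow = false;
    for (; *s >= '0' && *s <= '9'; s++) {
        unsigned digit = (unsigned)(*s - '0');
        if (magnitude > (ULLONG_MAX - digit) / 10)
            overflow = true;        /* keep scanning, remember the overflow */
        else
            magnitude = magnitude * 10 + digit;
    }

    /* Step 1: the range check looks at the magnitude only, so the sign
       plays no part on this path. */
    if (overflow) {
        errno = ERANGE;
        return ULLONG_MAX;
    }

    /* Step 2: negation "in the return type", which for an unsigned type
       is arithmetic modulo ULLONG_MAX + 1 and cannot overflow. */
    return negative ? -magnitude : magnitude;
}
```

With this ordering, "-1" yields ULLONG_MAX with errno untouched, while a
minus sign followed by a digit string too long for unsigned long long
yields ULLONG_MAX with errno set to ERANGE.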

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */



#386448

From: Michael S <already5chosen@yahoo.com>
Date: 2024-06-24 13:19 +0300
Message-ID: <20240624131941.000057ee@yahoo.com>
In reply to: #386434

On Sun, 23 Jun 2024 20:11:09 -0700
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

> 
> There's still some ambiguity for strtoull("-9999999999999999999",
> NULL, 10) (that's well outside the range of a 64-bit integer).  For
> that to work as expected, we have to assume that the determination
> that "the correct value is outside the range of representable values"
> happens *before* the negation "is performed in the return type".
> It's not clear that this problem is worth fixing (doing so would
> likely make that section longer and perhaps more confusing).
> 

There is nothing wrong with longer sections.
Personally I would prefer for each strtoxxx() function to have
its own description fully independent of all others. It would make
each of them easier to follow.
DRY is a good principle for programming, not necessarily for writing
Standards.



#386437

From: Tim Rentsch <tr.17687@z991.linuxsc.com>
Date: 2024-06-23 22:30 -0700
Message-ID: <86msnagb5w.fsf@linuxsc.com>
In reply to: #386411

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Ben Bacarisse <ben@bsb.me.uk> writes:
>> [range questions for strtol(), etc]
>>
>>> I think there /is/ something problematic with the wording about the
>>> negation.  It happens "in the return type" but how can
>>> 9223372036854775808 be negated in the type long long int?  OK, the
>>> negated value can be /represented/ in the type long long int but that's
>>> not quite the same thing.  On the other hand, for the unsigned return
>>> types, the negation "in the return type" is what produces ULONG_MAX for
>>> "-1" when the negated value, -1, can't be /represented/ in the return
>>> type.  It's a case where, over the years, I've just got used to what's
>>> happening.
>>
>> I understand what these functions do, but their specification in the
>> C standard is a little off.  To my way of thinking the impact is
>> minimal, but the specified behavior is either unequivocally wrong or
>> there are some cases that give rise to undefined behavior.
>
> Can you give an example where the specified behavior causes undefined
> behavior?

Ben gave a good answer.  (My thanks to Ben for both the
content and the style of his answer.)



#386425

From: Lawrence D'Oliveiro <ldo@nz.invalid>
Date: 2024-06-24 00:48 +0000
Message-ID: <v5afob$j1nj$5@dont-email.me>
In reply to: #386392

On Sun, 23 Jun 2024 16:30:13 +0100, Ben Bacarisse wrote:

> I think there /is/ something problematic with the wording about the
> negation.  It happens "in the return type" but how can
> 9223372036854775808 be negated in the type long long int?  OK, the
> negated value can be /represented/ in the type long long int but that's
> not quite the same thing.  On the other hand, for the unsigned return
> types, the negation "in the return type" is what produces ULONG_MAX for
> "-1" when the negated value, -1, can't be /represented/ in the return
> type.  It's a case where, over the years, I've just got used to what's
> happening.

In the C23 spec, section 7.24.1.7, “The strtol, strtoll, strtoul, and 
strtoull functions”, paragraph 5 begins:

    If the subject sequence has the expected form and the value of
    base is zero, the sequence of characters starting with the first
    digit is interpreted as an integer constant according to the rules
    of 6.4.4.2.

Note this is excluding any sign. So if the non-negated value cannot be 
represented in the desired type, then there is no valid value to apply 
negation to, so according to paragraph 8, zero is returned.
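
For anyone who wants to see what their own library does with the most
negative value, here is a minimal sketch. The helper names convert_ll
and report_ll are made up for illustration. The code only reports what
the implementation in use returns; it does not settle what the wording
of the standard requires.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: converts s with strtoll() in base 10 and records
   whether the implementation reported a range error through errno. */
static long long convert_ll(const char *s, int *range_error)
{
    errno = 0;                        /* strtoll() never resets errno itself */
    long long v = strtoll(s, NULL, 10);
    *range_error = (errno == ERANGE);
    return v;
}

/* Hypothetical helper: prints the input, the result, and an ERANGE marker. */
static void report_ll(const char *s)
{
    int range_error;
    long long v = convert_ll(s, &range_error);
    printf("%s -> %lld%s\n", s, v, range_error ? " (ERANGE)" : "");
}
```

Passing the decimal text of LLONG_MIN to report_ll() shows whether the
library treats that input as in range. With glibc, for example, it
converts to LLONG_MIN without a range error, while a string with many
more digits yields LLONG_MIN together with ERANGE.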



#386328

From: James Kuyper <jameskuyper@alumni.caltech.edu>
Date: 2024-06-21 14:38 -0400
Message-ID: <v54hc0$39bpi$1@dont-email.me>
In reply to: #386319
On 6/21/24 11:53, Michael S wrote:
> On Fri, 21 Jun 2024 18:28:39 +0300
> Michael S <already5chosen@yahoo.com> wrote:
> 
>> On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
>> gazelle@shell.xmission.com (Kenny McCormack) wrote:
>>>
>>> Yeah, now I get it.  You really only need strtoimax() and
>>> strtoumax(). 
>>
>> Which are, unfortunately, not part of the C standard.

They have been part of the C standard since C99.

> BTW, I don't know what The Standard says about out-of-range inputs, but
> at least https://en.cppreference.com/w/c/string/byte/strtol does not
> say anything certain, especially about what is stored in *str_end.

"The strtoimax and strtoumax functions are equivalent to the strtol,
strtoll, strtoul, and strtoull functions, except that the initial
portion of the string is converted to intmax_t and uintmax_t
representation, respectively." (7.8.2.3p2)

You need to go to the descriptions of those other functions to get the
detailed specifications.

"If the correct value is outside the range of representable values,
LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is
returned (according to the return type and sign of the value, if any),
and the value of the macro ERANGE is stored in errno."

As I understand it, that means that if the input string represents a
value outside of the range of representable values, then strtoimax()
should return INTMAX_MIN or INTMAX_MAX, depending upon the sign, and
strtoumax() should return UINTMAX_MAX. Both of them should store the
value of ERANGE in errno, to distinguish these results from what you
would get if the string happened to represent those values.


The C standard uses endptr rather than str_end in its description of
these functions.

"... First, they decompose the input string into three parts: an
initial, possibly empty, sequence of white-space characters, a subject
sequence resembling an integer represented in some radix determined by
the value of base, and a final string of one or more unrecognized
characters, including the terminating null character of the input
string. ..." (7.24.1.7p2).

That defines what the "final string" is.

"If the subject sequence has the expected form, ... A pointer to the
final string is stored in the object pointed to by endptr, provided that
endptr is not a null pointer." (7.24.1.7p5).

"If the subject sequence is empty or does not have the expected form ...
the value of nptr is stored in the object pointed to by endptr, provided
that endptr is not a null pointer." (7.24.1.7p7)

That seems very precise and unambiguous to me, aside from what "the
expected form" is, which is described elsewhere.
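
The rules quoted above can be put together in a short sketch. The
status codes and the function name parse_intmax are hypothetical and
not part of any standard API; only strtoimax(), errno and ERANGE come
from the standard. The pointer stored through rest is the "final
string", and it equals the input pointer when no conversion could be
performed.

```c
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>

/* Hypothetical status codes, invented for this example. */
enum parse_status { PARSE_OK, PARSE_NO_DIGITS, PARSE_OUT_OF_RANGE };

/* Converts the initial portion of s in base 10.
   *rest receives the pointer that strtoimax() stores through endptr.
   *out is written only when some conversion was performed. */
static enum parse_status parse_intmax(const char *s, intmax_t *out,
                                      const char **rest)
{
    char *end;

    errno = 0;                        /* so ERANGE can be told apart from stale values */
    intmax_t v = strtoimax(s, &end, 10);
    *rest = end;

    if (end == s)                     /* empty or malformed subject sequence */
        return PARSE_NO_DIGITS;

    *out = v;                         /* INTMAX_MIN or INTMAX_MAX on a range error */
    if (errno == ERANGE)
        return PARSE_OUT_OF_RANGE;
    return PARSE_OK;
}
```

Checking errno is what distinguishes a clamped INTMAX_MAX from input
that really spells out INTMAX_MAX; the return value alone cannot.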



#386329

From: gazelle@shell.xmission.com (Kenny McCormack)
Date: 2024-06-21 18:43 +0000
Message-ID: <v54hkh$2h4ra$1@news.xmission.com>
In reply to: #386328
In article <v54hc0$39bpi$1@dont-email.me>,
James Kuyper  <jameskuyper@alumni.caltech.edu> wrote:
>On 6/21/24 11:53, Michael S wrote:
>> On Fri, 21 Jun 2024 18:28:39 +0300
>> Michael S <already5chosen@yahoo.com> wrote:
>> 
>>> On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
>>> gazelle@shell.xmission.com (Kenny McCormack) wrote:
>>>>
>>>> Yeah, now I get it.  You really only need strtoimax() and
>>>> strtoumax(). 
>>>
>>> Which are, unfortunately, not part of the C standard.
>
>They have been part of the C standard since C99.

To some people, "Standard C" means C89.

Everything after that is, like POSIX, just fluffy nonsense.

-- 
12% of Americans think that Joan of Arc was Noah's wife.



#386375

From: Michael S <already5chosen@yahoo.com>
Date: 2024-06-23 11:47 +0300
Message-ID: <20240623114756.0000546a@yahoo.com>
In reply to: #386329
On Fri, 21 Jun 2024 18:43:29 -0000 (UTC)
gazelle@shell.xmission.com (Kenny McCormack) wrote:

> In article <v54hc0$39bpi$1@dont-email.me>,
> James Kuyper  <jameskuyper@alumni.caltech.edu> wrote:
> >On 6/21/24 11:53, Michael S wrote:  
> >> On Fri, 21 Jun 2024 18:28:39 +0300
> >> Michael S <already5chosen@yahoo.com> wrote:
> >>   
> >>> On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
> >>> gazelle@shell.xmission.com (Kenny McCormack) wrote:  
> >>>>
> >>>> Yeah, now I get it.  You really only need strtoimax() and
> >>>> strtoumax().   
> >>>
> >>> Which are, unfortunately, not part of the C standard.
> >
> >They have been part of the C standard since C99.  
> 
> To some people, "Standard C" means C89.
> 

That is not my case.
I was sincerely mistaken.


> Everything after that is, like POSIX, just fluffy nonsense.
> 

I don't think that POSIX is fluffy nonsense. I do know, however, that
POSIX is irrelevant to the overwhelming majority of the C programming
that I do at work.
Newer C standards are significantly more relevant, especially their
language features.
For library features, the C Standard is relevant in the sense that if a
particular standard function exists in the library that I use, then it
very likely matches the semantics given in the standard.



#386354

From: Michael S <already5chosen@yahoo.com>
Date: 2024-06-22 21:18 +0300
Message-ID: <20240622211835.00004b62@yahoo.com>
In reply to: #386328
On Fri, 21 Jun 2024 14:38:56 -0400
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

> On 6/21/24 11:53, Michael S wrote:
> > On Fri, 21 Jun 2024 18:28:39 +0300
> > Michael S <already5chosen@yahoo.com> wrote:
> >   
> >> On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
> >> gazelle@shell.xmission.com (Kenny McCormack) wrote:  
> >>>
> >>> Yeah, now I get it.  You really only need strtoimax() and
> >>> strtoumax().   
> >>
> >> Which are, unfortunately, not part of the C standard.
> 
> They have been part of the C standard since C99.
> 
> > BTW, I don't know what The Standard says about out-of-range inputs,
> > but at least https://en.cppreference.com/w/c/string/byte/strtol
> > does not say anything certain, especially about what is stored in
> > *str_end.
> 
> "The strtoimax and strtoumax functions are equivalent to the strtol,
> strtoll, strtoul, and strtoull functions, except that the initial
> portion of the string is converted to intmax_t and uintmax_t
> representation, respectively." (7.8.2.3p2)
> 
> You need to go to the descriptions of those other functions to get the
> detailed specifications.
> 
> "If the correct value is outside the range of representable values,
> LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is
> returned (according to the return type and sign of the value, if any),
> and the value of the macro ERANGE is stored in errno."
> 
> As I understand it, that means that if the input string represents a
> value outside of the range of representable values, then strtoimax()
> should return INTMAX_MIN or INTMAX_MAX, depending upon the sign, and
> strtoumax() should return UINTMAX_MAX. Both of them should store the
> value of ERANGE in errno, to distinguish these results from what you
> would get if the string happened to represent those values.
> 

That is what my implementation does, but I cannot see how it follows
from the text, especially in the case of out-of-range negative input to
the strtou*() functions.
It creates a rather non-intuitive discontinuity:
  strtoull("-18446744073709551615") => 1
  strtoull("-18446744073709551616") => 18446744073709551615
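
A small sketch for reproducing this on a given implementation follows.
The helper names convert_ull and report_ull are made up for
illustration, and the two 20-digit strings above assume a 64-bit
unsigned long long.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: converts s with strtoull() in base 10 and records
   whether the implementation reported a range error through errno. */
static unsigned long long convert_ull(const char *s, int *range_error)
{
    errno = 0;                        /* strtoull() never resets errno itself */
    unsigned long long v = strtoull(s, NULL, 10);
    *range_error = (errno == ERANGE);
    return v;
}

/* Hypothetical helper: prints the input, the result, and an ERANGE marker. */
static void report_ull(const char *s)
{
    int range_error;
    unsigned long long v = convert_ull(s, &range_error);
    printf("%s -> %llu%s\n", s, v, range_error ? " (ERANGE)" : "");
}
```

Calling report_ull() on the two strings above makes the difference
visible: as long as the magnitude fits, the value is negated in the
unsigned type with no error (so "-1" gives ULLONG_MAX), whereas a
magnitude that does not fit gives ULLONG_MAX together with ERANGE.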

> 
> The C standard uses endptr rather than str_end in its description of
> these functions.
> 
> "... First, they decompose the input string into three parts: an
> initial, possibly empty, sequence of white-space characters, a subject
> sequence resembling an integer represented in some radix determined by
> the value of base, and a final string of one or more unrecognized
> characters, including the terminating null character of the input
> string. ..." (7.24.1.7p2).
> 
> That defines what the "final string" is.
> 
> "If the subject sequence has the expected form, ... A pointer to the
> final string is stored in the object pointed to by endptr, provided
> that endptr is not a null pointer." (7.24.1.7p5).
> 
> "If the subject sequence is empty or does not have the expected form
> ... the value of nptr is stored in the object pointed to by endptr,
> provided that endptr is not a null pointer." (7.24.1.7p7)
> 
> That seems very precise and unambiguous to me, aside from what "the
> expected form" is, which is described elsewhere.

Yes, this part of the description is good and unambiguous.
I wonder why cppreference.com chose the less clear wording "The
functions set the pointer pointed to by str_end to point to the
character past the last numeric character interpreted."