| Started by | gazelle@shell.xmission.com (Kenny McCormack) |
|---|---|
| First post | 2024-06-20 14:06 +0000 |
| Last post | 2024-06-23 17:39 +0100 |
| Articles | 20 on this page of 50 — 11 participants |
The difference between strtol() and strtoul() ? gazelle@shell.xmission.com (Kenny McCormack) - 2024-06-20 14:06 +0000
Re: The difference between strtol() and strtoul() ? scott@slp53.sl.home (Scott Lurndal) - 2024-06-20 14:46 +0000
Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-20 14:37 -0700
Re: The difference between strtol() and strtoul() ? Lew Pitcher <lew.pitcher@digitalfreehold.ca> - 2024-06-20 14:48 +0000
Re: The difference between strtol() and strtoul() ? Lew Pitcher <lew.pitcher@digitalfreehold.ca> - 2024-06-20 15:26 +0000
Re: The difference between strtol() and strtoul() ? Kaz Kylheku <643-408-1753@kylheku.com> - 2024-06-20 22:55 +0000
Re: The difference between strtol() and strtoul() ? gazelle@shell.xmission.com (Kenny McCormack) - 2024-06-20 23:35 +0000
Re: The difference between strtol() and strtoul() ? gazelle@shell.xmission.com (Kenny McCormack) - 2024-06-21 13:58 +0000
Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-21 18:28 +0300
Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-21 18:53 +0300
Re: The difference between strtol() and strtoul() ? scott@slp53.sl.home (Scott Lurndal) - 2024-06-21 16:14 +0000
Re: The difference between strtol() and strtoul() ? scott@slp53.sl.home (Scott Lurndal) - 2024-06-21 16:54 +0000
Re: The difference between strtol() and strtoul() ? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-22 06:44 +0000
Re: The difference between strtol() and strtoul() ? scott@slp53.sl.home (Scott Lurndal) - 2024-06-22 15:16 +0000
Re: The difference between strtol() and strtoul() ? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-22 23:21 +0000
Re: The difference between strtol() and strtoul() ? James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-06-22 20:10 -0400
Re: The difference between strtol() and strtoul() ? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-21 18:15 +0100
Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-23 12:19 +0300
Re: The difference between strtol() and strtoul() ? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-23 12:38 +0100
Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-23 15:32 +0300
Re: The difference between strtol() and strtoul() ? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-23 16:30 +0100
Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-23 18:47 +0300
Re: The difference between strtol() and strtoul() ? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-23 10:58 -0700
Re: The difference between strtol() and strtoul() ? scott@slp53.sl.home (Scott Lurndal) - 2024-06-23 21:19 +0000
Re: The difference between strtol() and strtoul() ? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-23 22:28 -0700
Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-23 16:01 -0700
Re: The difference between strtol() and strtoul() ? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-24 00:49 +0100
Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-23 17:49 -0700
Re: The difference between strtol() and strtoul() ? Kaz Kylheku <643-408-1753@kylheku.com> - 2024-06-24 02:29 +0000
Re: The difference between strtol() and strtoul() ? Kaz Kylheku <643-408-1753@kylheku.com> - 2024-06-24 02:31 +0000
Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-23 20:12 -0700
Re: The difference between strtol() and strtoul() ? Kaz Kylheku <643-408-1753@kylheku.com> - 2024-06-24 06:05 +0000
Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-23 20:11 -0700
Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-24 13:19 +0300
Re: The difference between strtol() and strtoul() ? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-23 22:30 -0700
Re: The difference between strtol() and strtoul() ? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-24 00:48 +0000
Re: The difference between strtol() and strtoul() ? James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-06-21 14:38 -0400
Re: The difference between strtol() and strtoul() ? gazelle@shell.xmission.com (Kenny McCormack) - 2024-06-21 18:43 +0000
Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-23 11:47 +0300
Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-22 21:18 +0300
Re: The difference between strtol() and strtoul() ? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-21 18:02 +0100
Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-21 10:38 -0700
Re: The difference between strtol() and strtoul() ? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-22 06:43 +0000
Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-22 21:04 +0300
Re: The difference between strtol() and strtoul() ? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-22 23:22 +0000
Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-22 16:43 -0700
Re: The difference between strtol() and strtoul() ? Michael S <already5chosen@yahoo.com> - 2024-06-21 19:00 +0300
Re: The difference between strtol() and strtoul() ? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-21 10:50 -0700
Re: The difference between strtol() and strtoul() ? Kaz Kylheku <643-408-1753@kylheku.com> - 2024-06-22 22:07 +0000
Re: The difference between strtol() and strtoul() ? Richard Kettlewell <invalid@invalid.invalid> - 2024-06-23 17:39 +0100
| From | Ben Bacarisse <ben@bsb.me.uk> |
|---|---|
| Date | 2024-06-23 16:30 +0100 |
| Message-ID | <87jzifpth6.fsf@bsb.me.uk> |
| In reply to | #386385 |
Michael S <already5chosen@yahoo.com> writes:

> On Sun, 23 Jun 2024 12:38:51 +0100
> Ben Bacarisse <ben@bsb.me.uk> wrote:
>
>> Michael S <already5chosen@yahoo.com> writes:
>>
>> > On Fri, 21 Jun 2024 18:15:07 +0100
>> > Ben Bacarisse <ben@bsb.me.uk> wrote:
>> >
>> >> Michael S <already5chosen@yahoo.com> writes:
>> >>
>> >> > On Fri, 21 Jun 2024 18:28:39 +0300
>> >> > Michael S <already5chosen@yahoo.com> wrote:
>> >> >
>> >> >> On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
>> >> >> gazelle@shell.xmission.com (Kenny McCormack) wrote:
>> >> >> >
>> >> >> > Yeah, now I get it. You really only need strtoimax() and
>> >> >> > strtoumax().
>> >> >>
>> >> >> Which are, unfortunately, not part of C standard.
>> >> >>
>> >> >> > A result of any smaller type can be obtained by calling one of
>> >> >> > these functions and storing the result in an object of the
>> >> >> > smaller type.
>> >> >>
>> >> >> Or check for range and handle out of range values as
>> >> >> appropriate by situation.
>> >> >
>> >> > BTW, I don't know what The Standard says about out-of-range
>> >> > inputs, but at least
>> >> > https://en.cppreference.com/w/c/string/byte/strtol does not say
>> >> > anything certain. especially about what stored in *str_end.
>> >>
>> >> It says what value should be returned. That's something certain!
>> >>
>> >
>> > In case of strtol, yes.
>> > In case of strtoul it also says what value should be returned, but
>> > plain reading of cppreference.com text (at least *my* plain reading)
>> > does not match observed behaviour. The text on cppreference.com
>> > resembles Standard text, but does not match it.
>>
>> Ah. What's the discrepancy you see?
>
> IMHO, the Standard texts allows for more interpretations (and
> misinterpretations) than cppreference.com text

I was hoping for an example. As I've used these functions for
decades, I find it hard to see where the alternative interpretations
might lie.

>> > Also, at least to me, Standard text itself appear very far from
>> > clear and way too open to interpretations.
>> > My own interpretation would be that for any negative input strtoul()
>> > should return ULONG_MAX and set errno to ERANGE. None of the actual
>> > implementation that I tested behaves in this manner.
>>
>> I don't get that from the text. There is, after all, no "negative
>> input". There is a "subject sequence" which, if it starts with a
>> minus sign, causes the "value resulting from the conversion is
>> negated (in the return type)" which seems clear enough.
>
> I find it less than clear.
> The most non-clear part is that for strtouxx() as long as "subject
> sequence" is in range,

I think it helps to be precise here: the subject sequence has to be of
the right form, not in the right range.

> it is first converted and then negated. However
> when "subject sequence" is out of range it is converted, then clipped
> and then *not* negated.

If the conversion (before negation) is out of range the result will be
ULONG_MAX and errno will be set to ERANGE. Calling this "clipping" is
possibly confusing.

For what it's worth, I'm just describing what happens. I am not saying
it is crystal clear.

I think there /is/ something problematic with the wording about the
negation. It happens "in the return type" but how can
9223372036854775808 be negated in the type long long int? OK, the
negated value can be /represented/ in the type long long int but that's
not quite the same thing. On the other hand, for the unsigned return
types, the negation "in the return type" is what produces ULONG_MAX for
"-1" when the negated value, -1, can't be /represented/ in the return
type. It's a case where, over the years, I've just got used to what's
happening.

--
Ben.
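The two strtoul() cases described in this post can be checked with a small sketch. It is only an illustration: the helper name `parse_ul` is hypothetical, and base 10 is an assumption made for simplicity.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): convert s as a base-10
   unsigned long and report whether strtoul() flagged a range error. */
static unsigned long parse_ul(const char *s, int *range_error)
{
    errno = 0;                          /* strtoul() never clears errno itself */
    unsigned long v = strtoul(s, NULL, 10);
    *range_error = (errno == ERANGE);   /* 1 if the converted value was out of range */
    return v;
}
```

With this helper, `parse_ul("-1", &e)` should give ULONG_MAX with `e == 0` (the converted value 1 is in range and is then negated in the return type), while a digit string far too long for unsigned long should also give ULONG_MAX but with `e == 1`. The returned value alone cannot tell the two cases apart; only errno can.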
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2024-06-23 18:47 +0300 |
| Message-ID | <20240623184710.00003c36@yahoo.com> |
| In reply to | #386392 |
On Sun, 23 Jun 2024 16:30:13 +0100
Ben Bacarisse <ben@bsb.me.uk> wrote:

> Michael S <already5chosen@yahoo.com> writes:
>
>
> As I've used these functions for
> decades, I find it hard to see where the alternative interpretations
> might lie.
>

I have also used them for decades, but until last Thursday I never paid
attention to what happens when they are fed OOR inputs.
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2024-06-23 10:58 -0700 |
| Message-ID | <864j9jh77d.fsf@linuxsc.com> |
| In reply to | #386392 |
Ben Bacarisse <ben@bsb.me.uk> writes:

[range questions for strtol(), etc]

> I think there /is/ something problematic with the wording about the
> negation. It happens "in the return type" but how can
> 9223372036854775808 be negated in the type long long int? OK, the
> negated value can be /represented/ in the type long long int but that's
> not quite the same thing. On the other hand, for the unsigned return
> types, the negation "in the return type" is what produces ULONG_MAX for
> "-1" when the negated value, -1, can't be /represented/ in the return
> type. It's a case where, over the years, I've just got used to what's
> happening.

I understand what these functions do, but their specification in the
C standard is a little off. To my way of thinking the impact is
minimal, but the specified behavior is either unequivocally wrong or
there are some cases that give rise to undefined behavior.
| From | scott@slp53.sl.home (Scott Lurndal) |
|---|---|
| Date | 2024-06-23 21:19 +0000 |
| Message-ID | <Xj0eO.6172$ZwRb.2109@fx38.iad> |
| In reply to | #386396 |
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>Ben Bacarisse <ben@bsb.me.uk> writes:
>
>[range questions for strtol(), etc]
>
>> I think there /is/ something problematic with the wording about the
>> negation. It happens "in the return type" but how can
>> 9223372036854775808 be negated in the type long long int? OK, the
>> negated value can be /represented/ in the type long long int but that's
>> not quite the same thing. On the other hand, for the unsigned return
>> types, the negation "in the return type" is what produces ULONG_MAX for
>> "-1" when the negated value, -1, can't be /represented/ in the return
>> type. It's a case where, over the years, I've just got used to what's
>> happening.
>
>I understand what these functions do, but their specification in the
>C standard is a little off. To my way of thinking the impact is
>minimal, but the specified behavior is either unequivocally wrong or
>there are some cases that give rise to undefined behavior.

I think you're both overthinking it.
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2024-06-23 22:28 -0700 |
| Message-ID | <86r0cmgb96.fsf@linuxsc.com> |
| In reply to | #386402 |
scott@slp53.sl.home (Scott Lurndal) writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Ben Bacarisse <ben@bsb.me.uk> writes:
>>
>> [range questions for strtol(), etc]
>>
>>> I think there /is/ something problematic with the wording about the
>>> negation. It happens "in the return type" but how can
>>> 9223372036854775808 be negated in the type long long int? OK, the
>>> negated value can be /represented/ in the type long long int but that's
>>> not quite the same thing. On the other hand, for the unsigned return
>>> types, the negation "in the return type" is what produces ULONG_MAX for
>>> "-1" when the negated value, -1, can't be /represented/ in the return
>>> type. It's a case where, over the years, I've just got used to what's
>>> happening.
>>
>> I understand what these functions do, but their specification in the
>> C standard is a little off. To my way of thinking the impact is
>> minimal, but the specified behavior is either unequivocally wrong or
>> there are some cases that give rise to undefined behavior.
>
> I think you're both overthinking it.

You aren't saying anything. Do you have something to say
that actually has positive information content?
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2024-06-23 16:01 -0700 |
| Message-ID | <87r0cntga9.fsf@nosuchdomain.example.com> |
| In reply to | #386396 |
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> Ben Bacarisse <ben@bsb.me.uk> writes:
> [range questions for strtol(), etc]
>
>> I think there /is/ something problematic with the wording about the
>> negation. It happens "in the return type" but how can
>> 9223372036854775808 be negated in the type long long int? OK, the
>> negated value can be /represented/ in the type long long int but that's
>> not quite the same thing. On the other hand, for the unsigned return
>> types, the negation "in the return type" is what produces ULONG_MAX for
>> "-1" when the negated value, -1, can't be /represented/ in the return
>> type. It's a case where, over the years, I've just got used to what's
>> happening.
>
> I understand what these functions do, but their specification in the
> C standard is a little off. To my way of thinking the impact is
> minimal, but the specified behavior is either unequivocally wrong or
> there are some cases that give rise to undefined behavior.
Can you give an example where the specified behavior causes undefined
behavior?
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
| From | Ben Bacarisse <ben@bsb.me.uk> |
|---|---|
| Date | 2024-06-24 00:49 +0100 |
| Message-ID | <87o77rnrt2.fsf@bsb.me.uk> |
| In reply to | #386411 |
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>> Ben Bacarisse <ben@bsb.me.uk> writes:
>> [range questions for strtol(), etc]
>>
>>> I think there /is/ something problematic with the wording about the
>>> negation. It happens "in the return type" but how can
>>> 9223372036854775808 be negated in the type long long int? OK, the
>>> negated value can be /represented/ in the type long long int but that's
>>> not quite the same thing. On the other hand, for the unsigned return
>>> types, the negation "in the return type" is what produces ULONG_MAX for
>>> "-1" when the negated value, -1, can't be /represented/ in the return
>>> type. It's a case where, over the years, I've just got used to what's
>>> happening.
>>
>> I understand what these functions do, but their specification in the
>> C standard is a little off. To my way of thinking the impact is
>> minimal, but the specified behavior is either unequivocally wrong or
>> there are some cases that give rise to undefined behavior.
>
> Can you give an example where the specified behavior causes undefined
> behavior?
I don't want to pre-empt Tim's answer, but the wording that bothers me
is
"If the subject sequence begins with a minus sign, the value resulting
from the conversion is negated (in the return type)."
For strtoll("-9223372036854775808", 0, 0) the value resulting from the
conversion is 9223372036854775808 which can not even be represented in
the return type, so how can it be negated "in the return type"?
--
Ben.
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2024-06-23 17:49 -0700 |
| Message-ID | <87ikxztbb6.fsf@nosuchdomain.example.com> |
| In reply to | #386417 |
Ben Bacarisse <ben@bsb.me.uk> writes:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>> Ben Bacarisse <ben@bsb.me.uk> writes:
>>> [range questions for strtol(), etc]
>>>
>>>> I think there /is/ something problematic with the wording about the
>>>> negation. It happens "in the return type" but how can
>>>> 9223372036854775808 be negated in the type long long int? OK, the
>>>> negated value can be /represented/ in the type long long int but that's
>>>> not quite the same thing. On the other hand, for the unsigned return
>>>> types, the negation "in the return type" is what produces ULONG_MAX for
>>>> "-1" when the negated value, -1, can't be /represented/ in the return
>>>> type. It's a case where, over the years, I've just got used to what's
>>>> happening.
>>>
>>> I understand what these functions do, but their specification in the
>>> C standard is a little off. To my way of thinking the impact is
>>> minimal, but the specified behavior is either unequivocally wrong or
>>> there are some cases that give rise to undefined behavior.
>>
>> Can you give an example where the specified behavior causes undefined
>> behavior?
>
> I don't want to pre-empt Tim's answer, but the wording that bothers me
> is
>
> "If the subject sequence begins with a minus sign, the value resulting
> from the conversion is negated (in the return type)."
>
> For strtoll("-9223372036854775808", 0, 0) the value resulting from the
> conversion is 9223372036854775808 which can not even be represented in
> the return type, so how can it be negated "in the return type"?
Understanding the significance of your example requires recognizing
that number, which I didn't immediately.
I'll assume in the following that long long and intmax_t are 64 bits,
2's-complement, no padding bits.
9223372036854775808 is 2**63, and is mathematically equal to
LLONG_MAX+1.
-9223372036854775808 is mathematically equal to LLONG_MIN,
but the behavior of the strtoll() call is specified in
terms of computing 9223372036854775808 (outside the range of
long long) and then negating it. It's obvious (I think) that
strtoll("-9223372036854775808", 0, 0) *should* return LLONG_MIN and
not set errno to ERANGE (which it does in every implementation I've
tried), but the way the standard describes it involves a semantically
impossible operation.
-9223372036854775808 is the mathematical value of LLONG_MIN, but
it's not a valid C expression (so <limits.h> typically has to use
some workaround like (-LLONG_MAX-1)) -- but we expect strtoll to
handle it in the obvious way.
Beyond this example, the wording is also problematic
for out-of-range values with a leading '-' sign, such as
strtoll("-9999999999999999999", 0, 0). The result should be
LLONG_MIN with errno==ERANGE, but again the standard says "the value
resulting from the conversion is negated (in the return type)",
which is not actually possible. The same applies to strtoull().
It's not surprising that implementers have inferred the intent
even if the standard doesn't precisely state it. Still, I'd like
to see the wording made more precise.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
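The strtoll() cases in this post can be tried with a short sketch. It assumes, as the post does, a 64-bit two's-complement long long; the helper name `parse_ll` is hypothetical and base 10 is chosen only for simplicity.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): convert s as a base-10
   long long and report whether strtoll() flagged a range error.
   The cases discussed assume a 64-bit long long. */
static long long parse_ll(const char *s, int *range_error)
{
    errno = 0;                          /* so a stale ERANGE is not misread */
    long long v = strtoll(s, NULL, 10);
    *range_error = (errno == ERANGE);   /* 1 if the value did not fit in long long */
    return v;
}
```

Under that assumption, `parse_ll("-9223372036854775808", &e)` is expected to return LLONG_MIN with `e == 0`, whereas `parse_ll("-9999999999999999999", &e)` is expected to return LLONG_MIN with `e == 1`. The errno check is therefore the only way to distinguish a genuine LLONG_MIN from an out-of-range negative input.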
| From | Kaz Kylheku <643-408-1753@kylheku.com> |
|---|---|
| Date | 2024-06-24 02:29 +0000 |
| Message-ID | <20240623191452.334@kylheku.com> |
| In reply to | #386417 |
On 2024-06-23, Ben Bacarisse <ben@bsb.me.uk> wrote:
> I don't want to pre-empt Tim's answer, but the wording that bothers me
> is
>
> "If the subject sequence begins with a minus sign, the value
> resulting from the conversion is negated (in the return type)."
>
> For strtoll("-9223372036854775808", 0, 0) the value resulting from the
> conversion is 9223372036854775808 which can not even be represented in
> the return type, so how can it be negated "in the return type"?
We have to trust that the specification wants the functions to perform
error checking, rather than precipitate into undefined behavior or
implementation-defined results.
If the negation, which is a positive value, cannot be represented in the
type, that implies it is out of range. The required behavior for a
positive out-of-range value is to return LLONG_MAX and set errno to
ERANGE.
The "in the return type" wording sounds like it may be written that way
to cover the unsigned case, strtoull.
I see in the N3220 draft that the signed and unsigned functions are
lumped together and the wording is now:
"If the subject sequence begins with a minus sign, the resulting value
is the negative of the converted value; for functions whose return type
is an unsigned integer type this action is performed in the return
type."
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
| From | Kaz Kylheku <643-408-1753@kylheku.com> |
|---|---|
| Date | 2024-06-24 02:31 +0000 |
| Message-ID | <20240623192959.369@kylheku.com> |
| In reply to | #386429 |
On 2024-06-24, Kaz Kylheku <643-408-1753@kylheku.com> wrote:
> If the negation, which is a positive value, cannot be represented in the
> type, that implies it is out of range. The required behavior for a
> positive out-of-range value is to return LLONG_MAX and set errno to
> ERANGE.

Errr, what am I saying! The negation, which is a negative value,
cannot be represented in the type, so the required behavior is to
return LLONG_MIN and set errno to negative.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2024-06-23 20:12 -0700 |
| Message-ID | <87a5jbt4o7.fsf@nosuchdomain.example.com> |
| In reply to | #386430 |
Kaz Kylheku <643-408-1753@kylheku.com> writes:
> On 2024-06-24, Kaz Kylheku <643-408-1753@kylheku.com> wrote:
>> If the negation, which is a positive value, cannot be represented in the
>> type, that implies it is out of range. The required behavior for a
>> positive out-of-range value is to return LLONG_MAX and set errno to
>> ERANGE.
>
> Errr, what am I saying! The negation, which is a negative value,
> cannot be represented in the type, so the required behavior is to
> return LLONG_MIN and set errno to negative.
You mean "and set errno to ERANGE".
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
| From | Kaz Kylheku <643-408-1753@kylheku.com> |
|---|---|
| Date | 2024-06-24 06:05 +0000 |
| Message-ID | <20240623230509.919@kylheku.com> |
| In reply to | #386435 |
On 2024-06-24, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
> Kaz Kylheku <643-408-1753@kylheku.com> writes:
>> On 2024-06-24, Kaz Kylheku <643-408-1753@kylheku.com> wrote:
>>> If the negation, which is a positive value, cannot be represented in the
>>> type, that implies it is out of range. The required behavior for a
>>> positive out-of-range value is to return LLONG_MAX and set errno to
>>> ERANGE.
>>
>> Errr, what am I saying! The negation, which is a negative value,
>> cannot be represented in the type, so the required behavior is to
>> return LLONG_MIN and set errno to negative.
>
> You mean "and set errno to ERANGE".

Once you screw up and start correcting yourself, there is no end to the
long tail of erors.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2024-06-23 20:11 -0700 |
| Message-ID | <87ed8nt4qa.fsf@nosuchdomain.example.com> |
| In reply to | #386429 |
Kaz Kylheku <643-408-1753@kylheku.com> writes:
> On 2024-06-23, Ben Bacarisse <ben@bsb.me.uk> wrote:
>> I don't want to pre-empt Tim's answer, but the wording that bothers me
>> is
>>
>> "If the subject sequence begins with a minus sign, the value
>> resulting from the conversion is negated (in the return type)."
>>
>> For strtoll("-9223372036854775808", 0, 0) the value resulting from the
>> conversion is 9223372036854775808 which can not even be represented in
>> the return type, so how can it be negated "in the return type"?
>
> We have to trust that the specification wants the functions to perform
> error checking, rather than precipitate into undefined behavior or
> implementation-defined results.
>
> If the negation, which is a positive value, cannot be represented in the
> type, that implies it is out of range. The required behavior for a
> positive out-of-range value is to return LLONG_MAX and set errno to
> ERANGE.
>
> The "in the return type" wording sounds like it may be written that way
> to cover the unsigned case, strtoull.
>
> I see in the N3220 draft that the signed and unsigned functions are
> lumped together and the wording is now:
>
> "If the subject sequence begins with a minus sign, the resulting value
> is the negative of the converted value; for functions whose return type
> is an unsigned integer type this action is performed in the return
> type."
I should have checked the C23 draft before. I see that the wording has
been improved.
(Note that N3220 is actually an early draft of C26. The latest public
pre-C23 draft is N3149. The content should be very close; I don't
believe N3220 includes any post-C23 proposed changes.)
It's fairly clear that the "value" referred to in the quoted text is a
mathematical value, which might be outside the representable range of
any C type. The paragraph describing the returned value confirms this:
"If the correct value is outside the range of representable values ...".
So for strtoll("-9223372036854775808", NULL, 10) the "converted value"
of 9223372036854775808 exceeds LLONG_MAX, but that's ok. That value is
negated (mathematically) yielding -9223372036854775808, which is equal
to LLONG_MIN.
There's still some ambiguity for strtoull("-9999999999999999999", NULL,
10) (that's well outside the range of a 64-bit integer). For that to
work as expected, we have to assume that the determination that "the
correct value is outside the range of representable values" happens
*before* the negation "is performed in the return type". It's not clear
that this problem is worth fixing (doing so would likely make that
section longer and perhaps more confusing).
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
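A sketch for experimenting with the strtoull() ordering question raised here. It assumes a 64-bit unsigned long long; the helper name `parse_ull` is hypothetical. Under that assumption 9999999999999999999 is below 2**64, so it exceeds LLONG_MAX but still fits in unsigned long long, and a longer digit string is needed to force the out-of-range path.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): convert s as a base-10
   unsigned long long and report whether strtoull() flagged a range
   error. The cases discussed assume a 64-bit unsigned long long. */
static unsigned long long parse_ull(const char *s, int *range_error)
{
    errno = 0;                          /* strtoull() only sets errno on error */
    unsigned long long v = strtoull(s, NULL, 10);
    *range_error = (errno == ERANGE);   /* 1 if the converted value was out of range */
    return v;
}
```

With a 64-bit unsigned long long, `parse_ull("-9999999999999999999", &e)` should yield 8446744073709551617 (2**64 minus the converted value) with `e == 0`, because the converted value is in range before the negation is applied. A string such as `"-99999999999999999999999999999999"` should yield ULLONG_MAX with `e == 1`, which matches the reading that the range check happens before the negation.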
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2024-06-24 13:19 +0300 |
| Message-ID | <20240624131941.000057ee@yahoo.com> |
| In reply to | #386434 |
On Sun, 23 Jun 2024 20:11:09 -0700
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
>
> There's still some ambiguity for strtoull("-9999999999999999999",
> NULL, 10) (that's well outside the range of a 64-bit integer). For
> that to work as expected, we have to assume that the determination
> that "the correct value is outside the range of representable values"
> happens *before* the negation "is performed in the return type".
> It's not clear that this problem is worth fixing (doing so would
> likely make that section longer and perhaps more confusing).
>
There is nothing wrong with longer sections.
Personally I would prefer for each strtoxxx() function to have
its own description fully independent of all others. It would make
each of them easier to follow.
DRY is a good principle for programming, not necessarily for writing
Standards.
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2024-06-23 22:30 -0700 |
| Message-ID | <86msnagb5w.fsf@linuxsc.com> |
| In reply to | #386411 |
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Ben Bacarisse <ben@bsb.me.uk> writes:
>> [range questions for strtol(), etc]
>>
>>> I think there /is/ something problematic with the wording about the
>>> negation. It happens "in the return type" but how can
>>> 9223372036854775808 be negated in the type long long int? OK, the
>>> negated value can be /represented/ in the type long long int but that's
>>> not quite the same thing. On the other hand, for the unsigned return
>>> types, the negation "in the return type" is what produces ULONG_MAX for
>>> "-1" when the negated value, -1, can't be /represented/ in the return
>>> type. It's a case where, over the years, I've just got used to what's
>>> happening.
>>
>> I understand what these functions do, but their specification in the
>> C standard is a little off. To my way of thinking the impact is
>> minimal, but the specified behavior is either unequivocally wrong or
>> there are some cases that give rise to undefined behavior.
>
> Can you give an example where the specified behavior causes undefined
> behavior?

Ben gave a good answer. (My thanks to Ben for both the
content and the style of his answer.)
| From | Lawrence D'Oliveiro <ldo@nz.invalid> |
|---|---|
| Date | 2024-06-24 00:48 +0000 |
| Message-ID | <v5afob$j1nj$5@dont-email.me> |
| In reply to | #386392 |
On Sun, 23 Jun 2024 16:30:13 +0100, Ben Bacarisse wrote:
> I think there /is/ something problematic with the wording about the
> negation. It happens "in the return type" but how can
> 9223372036854775808 be negated in the type long long int? OK, the
> negated value can be /represented/ in the type long long int but that's
> not quite the same thing. On the other hand, for the unsigned return
> types, the negation "in the return type" is what produces ULONG_MAX for
> "-1" when the negated value, -1, can't be /represented/ in the return
> type. It's a case where, over the years, I've just got used to what's
> happening.
In the C23 spec, section 7.24.1.7, “The strtol, strtoll, strtoul, and
strtoull functions”, paragraph 5 begins:
    If the subject sequence has the expected form and the value of
    base is zero, the sequence of characters starting with the first
    digit is interpreted as an integer constant according to the rules
    of 6.4.4.2.
Note this is excluding any sign. So if the non-negated value cannot be
represented in the desired type, then there is no valid value to apply
negation to, so according to paragraph 8, zero is returned.
| From | James Kuyper <jameskuyper@alumni.caltech.edu> |
|---|---|
| Date | 2024-06-21 14:38 -0400 |
| Message-ID | <v54hc0$39bpi$1@dont-email.me> |
| In reply to | #386319 |
On 6/21/24 11:53, Michael S wrote:
> On Fri, 21 Jun 2024 18:28:39 +0300
> Michael S <already5chosen@yahoo.com> wrote:
>
>> On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
>> gazelle@shell.xmission.com (Kenny McCormack) wrote:
>>>
>>> Yeah, now I get it. You really only need strtoimax() and
>>> strtoumax().
>>
>> Which are, unfortunately, not part of C standard.

They have been part of the C standard since C99.

> BTW, I don't know what The Standard says about out-of-range inputs, but
> at least https://en.cppreference.com/w/c/string/byte/strtol does not
> say anything certain, especially about what is stored in *str_end.

"The strtoimax and strtoumax functions are equivalent to the strtol,
strtoll, strtoul, and strtoull functions, except that the initial
portion of the string is converted to intmax_t and uintmax_t
representation, respectively." (7.8.2.3p2)

You need to go to the descriptions of those other functions to get the
detailed specifications.

"If the correct value is outside the range of representable values,
LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is
returned (according to the return type and sign of the value, if any),
and the value of the macro ERANGE is stored in errno."

As I understand it, that means that if the input string represents a
value outside of the range of representable values, then strtoimax()
should return INTMAX_MIN or INTMAX_MAX, depending upon the sign, and
strtoumax() should return UINTMAX_MAX. Both of them should store the
value of ERANGE in errno, to distinguish these results from what you
would get if the string happened to represent those values.

The C standard uses endptr rather than str_end in its description of
these functions.

"... First, they decompose the input string into three parts: an
initial, possibly empty, sequence of white-space characters, a subject
sequence resembling an integer represented in some radix determined by
the value of base, and a final string of one or more unrecognized
characters, including the terminating null character of the input
string. ..." (7.24.1.7p2).

That defines what the "final string" is.

"If the subject sequence has the expected form, ... A pointer to the
final string is stored in the object pointed to by endptr, provided
that endptr is not a null pointer." (7.24.1.7p5).

"If the subject sequence is empty or does not have the expected form
... the value of nptr is stored in the object pointed to by endptr,
provided that endptr is not a null pointer." (7.24.1.7p7)

That seems very precise and unambiguous to me, aside from what "the
expected form" is, which is described elsewhere.
| From | gazelle@shell.xmission.com (Kenny McCormack) |
|---|---|
| Date | 2024-06-21 18:43 +0000 |
| Message-ID | <v54hkh$2h4ra$1@news.xmission.com> |
| In reply to | #386328 |
In article <v54hc0$39bpi$1@dont-email.me>,
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
>On 6/21/24 11:53, Michael S wrote:
>> On Fri, 21 Jun 2024 18:28:39 +0300
>> Michael S <already5chosen@yahoo.com> wrote:
>>
>>> On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
>>> gazelle@shell.xmission.com (Kenny McCormack) wrote:
>>>>
>>>> Yeah, now I get it. You really only need strtoimax() and
>>>> strtoumax().
>>>
>>> Which are, unfortunately, not part of C standard.
>
>They have been part of the C standard since C99.

To some people, "Standard C" means C89.

Everything after that is, like POSIX, just fluffy nonsense.

--
12% of Americans think that Joan of Arc was Noah's wife.
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2024-06-23 11:47 +0300 |
| Message-ID | <20240623114756.0000546a@yahoo.com> |
| In reply to | #386329 |
On Fri, 21 Jun 2024 18:43:29 -0000 (UTC)
gazelle@shell.xmission.com (Kenny McCormack) wrote:
> In article <v54hc0$39bpi$1@dont-email.me>,
> James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
> >On 6/21/24 11:53, Michael S wrote:
> >> On Fri, 21 Jun 2024 18:28:39 +0300
> >> Michael S <already5chosen@yahoo.com> wrote:
> >>
> >>> On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
> >>> gazelle@shell.xmission.com (Kenny McCormack) wrote:
> >>>>
> >>>> Yeah, now I get it. You really only need strtoimax() and
> >>>> strtoumax().
> >>>
> >>> Which are, unfortunately, not part of C standard.
> >
> >They have been part of the C standard since C99.
>
> To some people, "Standard C" means C89.
>
That is not my case. I was sincerely mistaken.

> Everything after that is, like POSIX, just fluffy nonsense.
>
I don't think that POSIX is fluffy nonsense. I do know, however, that
POSIX is irrelevant to the overwhelming majority of the C programming
that I do at work.
Newer C standards are significantly more relevant, especially the
language features. For library features, the C Standard is relevant in
the sense that if a particular standard function exists in the library
that I use, then it very likely matches the semantics of the standard.
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2024-06-22 21:18 +0300 |
| Message-ID | <20240622211835.00004b62@yahoo.com> |
| In reply to | #386328 |
On Fri, 21 Jun 2024 14:38:56 -0400
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
> On 6/21/24 11:53, Michael S wrote:
> > On Fri, 21 Jun 2024 18:28:39 +0300
> > Michael S <already5chosen@yahoo.com> wrote:
> >
> >> On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
> >> gazelle@shell.xmission.com (Kenny McCormack) wrote:
> >>>
> >>> Yeah, now I get it. You really only need strtoimax() and
> >>> strtoumax().
> >>
> >> Which are, unfortunately, not part of C standard.
>
> They have been part of the C standard since C99.
>
> > BTW, I don't know what The Standard says about out-of-range inputs,
> > but at least https://en.cppreference.com/w/c/string/byte/strtol
> > does not say anything certain, especially about what is stored in
> > *str_end.
>
> "The strtoimax and strtoumax functions are equivalent to the strtol,
> strtoll, strtoul, and strtoull functions, except that the initial
> portion of the string is converted to intmax_t and uintmax_t
> representation, respectively." (7.8.2.3p2)
>
> You need to go to the descriptions of those other functions to get the
> detailed specifications.
>
> "If the correct value is outside the range of representable values,
> LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is
> returned (according to the return type and sign of the value, if any),
> and the value of the macro ERANGE is stored in errno."
>
> As I understand it, that means that if the input string represents a
> value outside of the range of representable values, then strtoimax()
> should return INTMAX_MIN or INTMAX_MAX, depending upon the sign, and
> strtoumax() should return UINTMAX_MAX. Both of them should store the
> value of ERANGE in errno, to distinguish these results from what you
> would get if the string happened to represent those values.
>
That is what my implementation does, but I cannot understand how it
follows from the text, especially in the case of out-of-range negative
input to the strtou**() functions.
It creates a rather non-intuitive discontinuity.
strtoull("-18446744073709551615") => 1
strtoull("-18446744073709551616") => 18446744073709551615
>
> The C standard uses endptr rather than str_end in its description of
> these functions.
>
> "... First, they decompose the input string into three parts: an
> initial, possibly empty, sequence of white-space characters, a subject
> sequence resembling an integer represented in some radix determined by
> the value of base, and a final string of one or more unrecognized
> characters, including the terminating null character of the input
> string. ..." (7.24.1.7p2).
>
> That defines what the "final string" is.
>
> "If the subject sequence has the expected form, ... A pointer to the
> final string is stored in the object pointed to by endptr, provided
> that endptr is not a null pointer." (7.24.1.7p5).
>
> "If the subject sequence is empty or does not have the expected form
> ... the value of nptr is stored in the object pointed to by endptr,
> provided that endptr is not a null pointer." (7.24.1.7p7)
>
> That seems very precise and unambiguous to me, aside from what "the
> expected form" is, which is described elsewhere.
Yes, this part of description is good and unambiguous.
I wonder why cppreference.com chose the less clear wording "The
functions set the pointer pointed to by str_end to point to the
character past the last numeric character interpreted."