Libraries using longjmp for error handling (was: Re: More on NNTP testing)

Started by	Blue-Maned_Hawk <bluemanedhawk@invalid.invalid>
First post	2023-09-27 22:52 +0000
Last post	2023-09-30 09:37 +0100
Articles	20 on this page of 62 — 12 participants

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.

  Libraries using longjmp for error handling (was: Re: More on NNTP testing) Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> - 2023-09-27 22:52 +0000
    Re: Libraries using longjmp for error handling (was: Re: More on NNTP testing) Johann 'Myrkraverk' Oskarsson <johann@myrkraverk.invalid> - 2023-09-28 08:19 +0800
      Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-28 02:18 +0100
        Re: Libraries using longjmp for error handling Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> - 2023-09-28 22:30 +0000
          Re: Libraries using longjmp for error handling scott@slp53.sl.home (Scott Lurndal) - 2023-09-28 22:33 +0000
            Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-09-29 01:41 +0300
              Re: Libraries using longjmp for error handling scott@slp53.sl.home (Scott Lurndal) - 2023-09-28 23:45 +0000
            Re: Libraries using longjmp for error handling Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2023-09-28 17:35 -0700
              Re: Libraries using longjmp for error handling David Brown <david.brown@hesbynett.no> - 2023-09-29 16:49 +0200
        Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-28 18:24 -0700
          Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-29 03:19 +0100
            Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-29 07:46 -0700
              Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-29 21:02 +0100
                Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-29 21:56 -0700
                  Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-10-01 00:06 +0100
                    Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-30 20:35 -0700
                      Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-10-01 14:44 +0300
                      Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-10-01 15:45 +0100
                        Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-10-01 09:28 -0700
                          Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-10-01 20:49 +0100
                            Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-10-02 02:31 -0700
                              Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-10-02 23:55 +0100
                                Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-10-03 05:01 -0700
              Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-09-29 23:58 +0300
                Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-10-01 02:52 -0700
                  Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-10-01 14:33 +0300
    Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-28 01:48 +0100
    Re: Libraries using longjmp for error handling (was: Re: More on NNTP testing) Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-28 05:11 +0000
      Re: Libraries using longjmp for error handling (was: Re: More on NNTP testing) Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-28 06:26 +0000
    Re: Libraries using longjmp for error handling (was: Re: More on NNTP testing) Anton Shepelev <anton.txt@gmail.moc> - 2023-09-28 16:02 +0300
      Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-28 14:44 +0100
        Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-09-28 18:31 +0300
          Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-28 16:43 +0000
            Re: Libraries using longjmp for error handling scott@slp53.sl.home (Scott Lurndal) - 2023-09-28 17:00 +0000
              Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-28 17:16 +0000
              Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-28 19:38 +0100
            Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-28 16:05 -0700
              Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-29 00:26 +0000
                Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-28 23:20 -0700
              Re: Libraries using longjmp for error handling Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2023-09-28 18:31 -0700
                Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-29 03:24 +0100
                  Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-29 03:12 +0000
                Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-28 22:45 -0700
                  Re: Libraries using longjmp for error handling Spiros Bousbouras <spibou@gmail.com> - 2023-09-29 12:02 +0000
                    Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-29 08:10 -0700
                      Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-29 17:03 +0100
                        Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-29 10:15 -0700
                        Re: Libraries using longjmp for error handling gazelle@shell.xmission.com (Kenny McCormack) - 2023-09-29 17:17 +0000
                      Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-29 18:53 +0000
                    Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-09-29 18:11 +0300
                      Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-29 10:20 -0700
                        Re: Libraries using longjmp for error handling Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2023-09-29 11:30 -0700
                  Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-29 18:31 +0000
          Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-28 19:27 +0100
            Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-28 20:24 +0000
              Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-28 22:23 +0100
                Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-28 15:43 -0700
                Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-28 23:11 +0000
                  Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-09-29 17:23 +0300
                    Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-29 08:19 -0700
          Re: Libraries using longjmp for error handling David Brown <david.brown@hesbynett.no> - 2023-09-28 21:57 +0200
    Re: Libraries using longjmp for error handling Richard Kettlewell <invalid@invalid.invalid> - 2023-09-30 09:37 +0100

#176581 — Libraries using longjmp for error handling (was: Re: More on NNTP testing)

From	Blue-Maned_Hawk <bluemanedhawk@invalid.invalid>
Date	2023-09-27 22:52 +0000
Subject	Libraries using longjmp for error handling (was: Re: More on NNTP testing)
Message-ID	<pan$6f4ce$6edf891e$2b5c40c1$f8f989c1@invalid.invalid>

Richard Kettlewell wrote:

> It’s more than 20 years since I last had to integrate a C library which
> reported errors via longjmp() and I’m still bitter about it.

I have never encountered a library which does that.  Which library was 
that?

> As a matter of API design, I’d rather C library communicated errors via
> return values (and pointer parameters, where more complex error
> information is required).

Personally, i think that, at least for a library, an error should _only_ 
be communicated by return value.  If more complex information is required, 
then the return value can be made more complex.  I don't think i've ever 
used a library that communicates information via a pointer parameter.

One thing i've experienced in multiple libraries is a system a bit like 
what errno.h offers, but done via a pair of subroutines that retrieve and 
assign to some hidden global variable.  I don't like this for the same 
reason i don't like subroutines that use errno (unless they're syscall 
wrappers), but in at least two of the cases the library has also come with 
a way to set a callback subroutine to automatically deal with errors 
instead.  This is nice, since it means that the code doesn't get all 
obfuscated with error checking after every subroutine call, but it's 
annoying that each library needs to come with its own unique subroutine 
for this, and i do worry about it being overly general in treating all 
errors lethally.
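
A minimal sketch of the pattern described above, assuming a made-up library called "mylib": a hidden error variable with get/set subroutines, plus an optional callback. Every name here is hypothetical and not taken from any real library. The callback in this sketch only counts errors, to show that a handler need not treat every error as lethal.

```c
#include <stddef.h>

/* Hypothetical library "mylib"; all names are illustrative only. */
typedef enum { MYLIB_OK = 0, MYLIB_E_RANGE, MYLIB_E_NOMEM } mylib_error;
typedef void (*mylib_error_handler)(mylib_error err, void *user_data);

static mylib_error last_error = MYLIB_OK;   /* the hidden global */
static mylib_error_handler handler = NULL;  /* optional user callback */
static void *handler_data = NULL;

mylib_error mylib_get_error(void) { return last_error; }

/* Record the error and, if a handler is installed, notify it. */
void mylib_set_error(mylib_error err)
{
    last_error = err;
    if (err != MYLIB_OK && handler != NULL)
        handler(err, handler_data);
}

void mylib_set_error_handler(mylib_error_handler h, void *user_data)
{
    handler = h;
    handler_data = user_data;
}

/* Example operation: halve an even, non-negative number. */
int mylib_half(int n)
{
    if (n < 0 || n % 2 != 0) {
        mylib_set_error(MYLIB_E_RANGE);
        return -1;
    }
    mylib_set_error(MYLIB_OK);
    return n / 2;
}

/* A non-lethal handler: it only counts how many errors occurred. */
void count_errors(mylib_error err, void *user_data)
{
    (void)err;
    ++*(int *)user_data;
}
```

With count_errors installed, a caller can make a run of mylib_half() calls without checking each one and inspect the counter afterwards.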

-- 
Blue-Maned_Hawk│shortens to 
Hawk│/
blu.mɛin.dʰak/
│he/him/his/himself/Mr. bluemanedhawk.github.io
Warning:  Low flying owls.  Lost chihuahua.

#176591

From	Johann 'Myrkraverk' Oskarsson <johann@myrkraverk.invalid>
Date	2023-09-28 08:19 +0800
Message-ID	<AE3RM.402236$kCld.195109@fx08.ams4>
In reply to	#176581

On 9/28/2023 6:52 AM, Blue-Maned_Hawk wrote:
> Richard Kettlewell wrote:
> 
>> It’s more than 20 years since I last had to integrate a C library which
>> reported errors via longjmp() and I’m still bitter about it.

I am not surprised.

> I have never encountered a library which does that.  Which library was
> that?
> 
>> As a matter of API design, I’d rather C library communicated errors via
>> return values (and pointer parameters, where more complex error
>> information is required).
> 
> Personally, i think that, at least for a library, an error should _only_
> be communicated by return value.  If more complex information is required,
> then the return value can be made more complex.  I don't think i've ever
> used a library that communicates information via a pointer parameter.
> 
> One thing i've experienced in multiple libraries is a system a bit like
> what errno.h offers, but done via a pair of subroutines that retrieve and
> assign to some hidden global variable.  I don't like this for the same
> reason i don't like subroutines that use errno (unless they're syscall
> wrappers), but in at least two of the cases the library has also come with
> a way to set a callback subroutine to automatically deal with errors
> instead.  This is nice, since it means that the code doesn't get all
> obfuscated with error checking after every subroutine call, but it's
> annoying that each library needs to come with its own unique subroutine
> for this, and i do worry about it being overly general in treating all
> errors lethally.

Error handling in libraries is a thorny subject, and I could go on a
long rant on it.  The short summary is simply this: error handling is
rife with incompetence, and incompetent designs.  Even in languages that
ostensibly provide better mechanisms than C, the details are usually
poorly documented and under-specified [1].

Rest assured that my error handling strategy is /sane/, though for now
I'm not explaining my code, nor design.

[1] Like in Common Lisp, when you're given a handler that supposedly
can handle it in-situ, but then the handler doesn't get enough arguments
to do much more than log a generic error, or -- this was a personal
favorite -- you have to read the source code to figure out what the
error handler gets.

-- 
Johann | email: invalid -> com | www.myrkraverk.com/blog/
I'm not from the Internet, I just work there. | twitter: @myrkraverk

#176596 — Re: Libraries using longjmp for error handling

From	Ben Bacarisse <ben.usenet@bsb.me.uk>
Date	2023-09-28 02:18 +0100
Subject	Re: Libraries using longjmp for error handling
Message-ID	<871qejf62x.fsf@bsb.me.uk>
In reply to	#176591

Johann 'Myrkraverk' Oskarsson <johann@myrkraverk.invalid> writes:

> On 9/28/2023 6:52 AM, Blue-Maned_Hawk wrote:
>> Personally, i think that, at least for a library, an error should _only_
>> be communicated by return value.  If more complex information is required,
>> then the return value can be made more complex.
...
> Error handling in libraries is a thorny subject, and I could go on a
> long rant on it.  The short summary is simply this: error handling is
> rife with incompetence, and incompetent designs.  Even in languages that
> ostensibly provide better mechanisms than C, the details are usually
> poorly documented and under-specified [1].

If it's done well, the result is not even "error handling" -- it's just
what the function returns.  For example, a lookup in a table of integers
in Haskell returns a type that is, in effect, "maybe an integer".  The
return will be either "Nothing" or something like "Just 42".

Haskell does not always get it right (particularly some of the older
APIs) but the trend is to provide return type rich enough to include
either a correct result or an explanation of the fault.

-- 
Ben.

#176673 — Re: Libraries using longjmp for error handling

From	Blue-Maned_Hawk <bluemanedhawk@invalid.invalid>
Date	2023-09-28 22:30 +0000
Subject	Re: Libraries using longjmp for error handling
Message-ID	<pan$b83bb$766eb4cb$b79b4acf$fdd8021c@invalid.invalid>
In reply to	#176596

[This article is a resend because my first try seems to not have worked; 
if this is duplicated, that's the explanation.]

Ben Bacarisse wrote:

> Haskell does not always get it right (particularly some of the older
> APIs) but the trend is to provide return type rich enough to include
> either a correct result or an explanation of the fault.

Things like that are pretty much what i was referring to earlier when i
referred to making the return type more complex to handle more complex
situations.  Obviously, it would have to be done differently in C, since C 
doesn't support tagged unions (at least not natively—i know of a couple 
libraries that use macro magic to implement them).
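
For illustration, a hand-rolled tagged union in plain C might look roughly like the sketch below. The names int_result, parse_int and the PARSE_* codes are invented for this example. Nothing in the language stops a caller from reading the wrong union member, which is the part that native support would enforce.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

typedef enum { RESULT_OK, RESULT_ERR } result_tag;
typedef enum { PARSE_NO_DIGITS, PARSE_JUNK, PARSE_RANGE } parse_error;

/* Either an int or an explanation of the fault, selected by tag. */
typedef struct {
    result_tag tag;
    union {
        int value;          /* meaningful only when tag == RESULT_OK  */
        parse_error error;  /* meaningful only when tag == RESULT_ERR */
    } u;
} int_result;

/* Parse a whole string as a decimal int. */
int_result parse_int(const char *s)
{
    int_result r;
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s) {                 /* nothing was converted */
        r.tag = RESULT_ERR;
        r.u.error = PARSE_NO_DIGITS;
    } else if (*end != '\0') {      /* trailing characters */
        r.tag = RESULT_ERR;
        r.u.error = PARSE_JUNK;
    } else if (errno == ERANGE || v < INT_MIN || v > INT_MAX) {
        r.tag = RESULT_ERR;
        r.u.error = PARSE_RANGE;
    } else {
        r.tag = RESULT_OK;
        r.u.value = (int)v;
    }
    return r;
}
```

The caller is expected to switch on r.tag before touching r.u.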

-- 
Blue-Maned_Hawk│shortens to 
Hawk│/
blu.mɛin.dʰak/
│he/him/his/himself/Mr. bluemanedhawk.github.io
A flamethrower will not interfere with WiFi unless you aim it directly at 
the router.

#176674 — Re: Libraries using longjmp for error handling

From	scott@slp53.sl.home (Scott Lurndal)
Date	2023-09-28 22:33 +0000
Subject	Re: Libraries using longjmp for error handling
Message-ID	<PanRM.9358$Sn81.8479@fx08.iad>
In reply to	#176673

Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> writes:
>[This article is a resend because my first try seems to not have worked; 
>if this is duplicated, that's the explanation.]
>
>Ben Bacarisse wrote:
>
>> Haskell does not always get it right (particularly some of the older
>> APIs) but the trend is to provide return type rich enough to include
>> either a correct result or an explanation of the fault.
>
>Things like that are pretty much what i was referring to earlier when i
>referred to making the return type more complex to handle more complex
>situations.  Obviously, it would have to be done differently in C, since C 
>doesn't support tagged unions (at least not natively—i know of a couple 
>libraries that use macro magic to implement them).

In C++, a std::pair<bool, return-type> is used in that context.   If the sizeof
the return type is 64-bits or less, most modern ABI's will return it in a pair
of registers.

#176675 — Re: Libraries using longjmp for error handling

From	Anton Shepelev <anton.txt@gmail.moc>
Date	2023-09-29 01:41 +0300
Subject	Re: Libraries using longjmp for error handling
Message-ID	<20230929014147.db836fd718d5d67557fc138c@gmail.moc>
In reply to	#176674

Scott Lurndal:

> In C++, a std::pair<bool, return-type> is used in that
> context.   If the sizeof the return type is 64-bits or
> less, most modern ABI's will return it in a pair of
> registers.

This groups the error flag and return value, the error
message having to be passed somewhere else.  In C, one can
group the error flag, error code, error message plus any
additional error information, in order to return the actual
return value.
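
One possible reading of this, sketched with invented names (error_info, safe_divide): the error flag, code and message live together in one struct filled in through a pointer, so the function's own return value stays free for the actual result.

```c
#include <stdio.h>

/* All error details grouped in one place (hypothetical layout). */
typedef struct {
    int failed;        /* error flag: nonzero on failure */
    int code;          /* error code: 0 means no error   */
    char message[64];  /* human-readable description     */
} error_info;

/* Returns the quotient; error details go through *err. */
double safe_divide(double num, double den, error_info *err)
{
    if (den == 0.0) {
        err->failed = 1;
        err->code = 1;
        snprintf(err->message, sizeof err->message,
                 "division of %g by zero", num);
        return 0.0;  /* placeholder; caller must check err->failed */
    }
    err->failed = 0;
    err->code = 0;
    err->message[0] = '\0';
    return num / den;
}
```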

-- 
()  ascii ribbon campaign -- against html e-mail
/\  www.asciiribbon.org   -- against proprietary attachments

#176685 — Re: Libraries using longjmp for error handling

From	scott@slp53.sl.home (Scott Lurndal)
Date	2023-09-28 23:45 +0000
Subject	Re: Libraries using longjmp for error handling
Message-ID	<8eoRM.229644$vMO8.204792@fx16.iad>
In reply to	#176675

Anton Shepelev <anton.txt@gmail.moc> writes:
>Scott Lurndal:
>
>> In C++, a std::pair<bool, return-type> is used in that
>> context.   If the sizeof the return type is 64-bits or
>> less, most modern ABI's will return it in a pair of
>> registers.
>
>This groups the error flag and return value, the error
>message having to be passed somewhere else.  In C, one can
>group the error flag, error code, error message plus any
>additional error information, in order to return the actual
>return value.

In C++ you can group any number of objects into a struct
and return that, if you need to.  I wouldn't return a
message from the function, an error code that maps into
a message is often sufficient.

One could take a leaf from the VMS book and replace the bool
with an error code (which maps externally into a locale-specific
string), where "SS$_NORMAL" indicates that the operation was
successful, and the range less than MAX(defined errno) is
reserved for strerror() messages and the range above some
value is reserved to the application.

#176687 — Re: Libraries using longjmp for error handling

From	Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date	2023-09-28 17:35 -0700
Subject	Re: Libraries using longjmp for error handling
Message-ID	<87edihstna.fsf@nosuchdomain.example.com>
In reply to	#176674

scott@slp53.sl.home (Scott Lurndal) writes:
> Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> writes:
>>Ben Bacarisse wrote:
>>> Haskell does not always get it right (particularly some of the older
>>> APIs) but the trend is to provide return type rich enough to include
>>> either a correct result or an explanation of the fault.
>>
>>Things like that are pretty much what i was referring to earlier when i
>>referred to making the return type more complex to handle more complex
>>situations.  Obviously, it would have to be done differently in C, since C 
>>doesn't support tagged unions (at least not natively—i know of a couple 
>>libraries that use macro magic to implement them).
>
> In C++, a std::pair<bool, return-type> is used in that context.   If the sizeof
> the return type is 64-bits or less, most modern ABI's will return it in a pair
> of registers.

<OT>
A std::pair<bool, return-type> gives you either a true value and a value
of the return type or a false value and a value of the return type.
There's no indication in the type itself that the second member is
meaningless if the first is false.

std::optional<return-type> is a better fit -- but it isn't available if
you have to deal with pre-C++17 compilers.
</OT>

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */

#176737 — Re: Libraries using longjmp for error handling

From	David Brown <david.brown@hesbynett.no>
Date	2023-09-29 16:49 +0200
Subject	Re: Libraries using longjmp for error handling
Message-ID	<uf6o5c$ark0$1@dont-email.me>
In reply to	#176687

On 29/09/2023 02:35, Keith Thompson wrote:
> scott@slp53.sl.home (Scott Lurndal) writes:
>> Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> writes:
>>> Ben Bacarisse wrote:
>>>> Haskell does not always get it right (particularly some of the older
>>>> APIs) but the trend is to provide return type rich enough to include
>>>> either a correct result or an explanation of the fault.
>>>
>>> Things like that are pretty much what i was referring to earlier when i
>>> referred to making the return type more complex to handle more complex
>>> situations.  Obviously, it would have to be done differently in C, since C
>>> doesn't support tagged unions (at least not natively—i know of a couple
>>> libraries that use macro magic to implement them).
>>
>> In C++, a std::pair<bool, return-type> is used in that context.   If the sizeof
>> the return type is 64-bits or less, most modern ABI's will return it in a pair
>> of registers.
> 
> <OT>
> A std::pair<bool, return-type> gives you either a true value and a value
> of the return type or a false value and a value of the return type.
> There's no indication in the type itself that the second member is
> meaningless if the first is false.
> 
> std::optional<return-type> is a better fit -- but it isn't available if
> you have to deal with pre-C++17 compilers.
> </OT>
> 

<OT>

std::optional<> is a standard library template that is based around a 
struct containing a bool and a return type, just like a std::pair<bool, 
return-type> would be.  I don't think it needs C++17 to implement it - I 
think you could make a reasonable bash at making an "optional" template 
even in pre-C++11 C++.  (I believe boost::optional works with C++03.)

But there are probably a number of nuances that make it better in modern 
C++, and of course it is far more convenient when it is already in the 
library.  And certainly if you are using C++17, it's a better choice 
here than a bare pair.

</OT>

#176692 — Re: Libraries using longjmp for error handling

From	Tim Rentsch <tr.17687@z991.linuxsc.com>
Date	2023-09-28 18:24 -0700
Subject	Re: Libraries using longjmp for error handling
Message-ID	<86y1gphiur.fsf@linuxsc.com>
In reply to	#176596

Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

> Johann 'Myrkraverk' Oskarsson <johann@myrkraverk.invalid> writes:
>
>> On 9/28/2023 6:52 AM, Blue-Maned_Hawk wrote:
>>
>>> Personally, i think that, at least for a library, an error
>>> should _only_ be communicated by return value.  If more complex
>>> information is required, then the return value can be made more
>>> complex.
>
> ...
>
>> Error handling in libraries is a thorny subject, and I could go
>> on a long rant on it.  The short summary is simply this:  error
>> handling is rife with incompetence, and incompetent designs.
>> Even in languages that ostensibly provide better mechanisms than
>> C, the details are usually poorly documented and under-specified
>> [1].
>
> If it's done well, the result is not even "error handling" -- it's
> just what the function returns.  For example, a lookup in a table
> of integers in Haskell returns a type that is, in effect "maybe an
> integer" The return will be either "Nothing" or something like
> "Just 42".

In languages that support it this approach is a good way to deal
with cases like this, I agree.  But to me it falls in a different
category than error handling.  An error condition is something
like an allocation (eg malloc()) failure, or a broken pipe.  Not
finding something in a table means whoever was responsible for
adding things to the table didn't add it - in other words it was
entirely a consequence of events that are under the program's
control.  The issue is how to deal with an unpredictable, and
usually very rare, event that is not something the program has
control over.  Typically these kinds of situations need a
different sort of mechanism than loading up the return value
of every function call.

#176698 — Re: Libraries using longjmp for error handling

From	Ben Bacarisse <ben.usenet@bsb.me.uk>
Date	2023-09-29 03:19 +0100
Subject	Re: Libraries using longjmp for error handling
Message-ID	<877co9d8m2.fsf@bsb.me.uk>
In reply to	#176692

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>
>> Johann 'Myrkraverk' Oskarsson <johann@myrkraverk.invalid> writes:
>>
>>> On 9/28/2023 6:52 AM, Blue-Maned_Hawk wrote:
>>>
>>>> Personally, i think that, at least for a library, an error
>>>> should _only_ be communicated by return value.  If more complex
>>>> information is required, then the return value can be made more
>>>> complex.
>>
>> ...
>>
>>> Error handling in libraries is a thorny subject, and I could go
>>> on a long rant on it.  The short summary is simply this:  error
>>> handling is rife with incompetence, and incompetent designs.
>>> Even in languages that ostensibly provide better mechanisms than
>>> C, the details are usually poorly documented and under-specified
>>> [1].
>>
>> If it's done well, the result is not even "error handling" -- it's
>> just what the function returns.  For example, a lookup in a table
>> of integers in Haskell returns a type that is, in effect "maybe an
>> integer" The return will be either "Nothing" or something like
>> "Just 42".
>
> In languages that support it this approach is a good way to deal
> with cases like this, I agree.  But to me it falls in a different
> category than error handling.  An error condition is something
> like an allocation (eg malloc()) failure, or a broken pipe.  Not
> finding something in a table means whoever was responsible for
> adding things to the table didn't add it - in other words it was
> entirely a consequence of events that are under the program's
> control.  The issue is how to deal with an unpredictable, and
> usually very rare, event that is not something the program has
> control over.  Typically these kinds of situations need a
> different sort of mechanism than loading up the return value
> of every function call.

I agree that table lookup is not a compelling example, but I don't agree
that anything different is needed.  I intended the idea to be
generalised to have return values used for all error conditions.  There
may be situations where something different is /better/, but I remain
sceptical of even this claim.  Note I am not talking about situations
where asynchronous errors need to be reported.

C uses null pointers for this all the time.  Returning a null pointer is
just a way to signal an error through the return value.  In a language
like C without null pointers I would want malloc to return a 'Maybe'
type.  And for when more information is wanted (like your example of a
broken pipe) the type returned from write should be a number of bytes or
an error value of some sort.

Obviously it's hard to recommend this in C since the language can't
really express the patterns needed to make it convenient, but I think we
are talking more generally here.

How would you prefer these sorts of thing to be signalled?

BTW, have you come across Icon?  It takes an intriguing approach where
every operation succeeds or fails as well as having a value.  This idea
is heavily built upon to produce a very comfortable scripting language.

-- 
Ben.

#176735 — Re: Libraries using longjmp for error handling

From	Tim Rentsch <tr.17687@z991.linuxsc.com>
Date	2023-09-29 07:46 -0700
Subject	Re: Libraries using longjmp for error handling
Message-ID	<86bkdlghqr.fsf@linuxsc.com>
In reply to	#176698

Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>

[..how should errors be handled..]

>>> If it's done well, the result is not even "error handling" -- it's
>>> just what the function returns.  For example, a lookup in a table
>>> of integers in Haskell returns a type that is, in effect "maybe an
>>> integer" The return will be either "Nothing" or something like
>>> "Just 42".
>>
>> In languages that support it this approach is a good way to deal
>> with cases like this, I agree.  But to me it falls in a different
>> category than error handling.  An error condition is something
>> like an allocation (eg malloc()) failure, or a broken pipe.  Not
>> finding something in a table means whoever was responsible for
>> adding things to the table didn't add it - in other words it was
>> entirely a consequence of events that are under the program's
>> control.  The issue is how to deal with an unpredictable, and
>> usually very rare, event that is not something the program has
>> control over.  Typically these kinds of situations need a
>> different sort of mechanism than loading up the return value
>> of every function call.
>
> I agree that table lookup is not a compelling example, but I don't
> agree that anything different is needed.  I intended the idea to
> be generalised to have return values used for all error
> conditions.  There may be situations where something different is
> /better/, but I remain sceptical of even this claim.  Note I am
> not talking about situations where asynchronous errors need to be
> reported.
>
> C uses null pointers for this all the time.  Returning a null
> pointer is just a way to signal an error through the return value.
> In a language like C without null pointers I would want malloc to
> return a 'Maybe' type.  And for when more information is wanted
> (like your example of a broken pipe) the type returned from write
> should be a number of bytes or an error value of some sort.
>
> Obviously it's hard to recommend this in C since the language
> can't really express the patterns needed to make it convenient,
> but I think we are talking more generally here.

I acknowledge your statement about not considering asynchronous
errors.  My own comments are likewise.

I agree that all unusual result circumstances (which might be
labeled "error conditions", without meaning to prejudice the term)
can be handled locally via either compound return values or multiple
return values (eg, by using pointer parameters), or a combination of
the two.  In languages that provide good support for compound return
values, which I would say C does not, probably a single return value
(which is perhaps a compound value) always suffices, but I haven't
thought very carefully about that.  But I think we are in general
agreement that it's always possible to handle error conditions via
local return values.

Where we might not agree is whether using direct return values is
always a good choice.  In my view, sometimes it is, sometimes it
isn't - it depends on the particular situation.  That is definitely
the case for C, but I think also for other programming languages
that have better support for multiple return values or compound
return values, or both.  And certainly if needing other mechanisms
is true for languages in general then we would expect it is true
for C.

> How would you prefer these sorts of thing to be signalled?

Let me give just one example.

I wrote some code not long ago to add a value to a set of values,
where the set is represented using a recursive binary tree structure
(like red-black trees, but a little different).  Usually we may
expect that a request to add a value would offer a value not already
included in the set, but certainly it can happen that there is a
call to add a value that is already included, in which case the top
level return value should be the original tree structure.

I thought it would be easier to write a simple recursive routine
that assumes the to-be-added value is not yet included in the set,
and if that assumption is violated raise an exception that is caught
at the outermost level and simply returns the original argument tree
value.  Certainly the code could have been written to handle the
already-present situation at each level of the recursion, but it was
much easier and much cleaner to handle it by raising an exception.

That example doesn't translate to C in any obvious way, because C
code to add items to a set would very likely be written rather
differently.  I think it's harder to make use of non-local returns
in C than in many other languages, because C requires more thought
and discipline to set that up.  At the same time, when I look at C
code that tries to handle all exceptional conditions locally, I
can't help but think the code would be simplified overall if code
for dealing with exceptional conditions could be removed in the code
generally and instead used a different mechanism, in fewer places,
and perhaps similar to raising an exception, to handle such cases.

> BTW, have you come across Icon?

I have, and it's a fascinating language.

> It takes an intriguing approach
> where every operation succeeds or fails as well as having a value.

I have heard Icon characterized as saying "it tries to succeed",
and I think that's a fair description.  The idea of integrating
a backtracking engine into the language semantics deserves more
attention than I think it has been given, which is unfortunate.

> This idea is heavily built upon to produce a very comfortable
> scripting language.

My experience with Icon is purely academic, which is to say I have
read about it but never used it.  I have more experience with
Prolog, which is similar in some ways (notably backtracking) but of
course very different in other ways.

I'm intrigued by your comment that Icon makes a good scripting
language.  Maybe I don't understand what you think makes a good
scripting language, or even what "scripting language" means.
That subject is not topical in comp.lang.c, but I also peruse
comp.lang.misc and comp.programming if you wouldn't mind
continuing there.  Also you are welcome to send me an email
at this address if you would rather do that.

#176759 — Re: Libraries using longjmp for error handling

From	Ben Bacarisse <ben.usenet@bsb.me.uk>
Date	2023-09-29 21:02 +0100
Subject	Re: Libraries using longjmp for error handling
Message-ID	<8734ywbvek.fsf@bsb.me.uk>
In reply to	#176735

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>
>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>
>
> [..how should errors be handled..]
>
>>>> If it's done well, the result is not even "error handling" -- it's
>>>> just what the function returns.  For example, a lookup in a table
>>>> of integers in Haskell returns a type that is, in effect "maybe an
>>>> integer".  The return will be either "Nothing" or something like
>>>> "Just 42".
>>>
>>> In languages that support it this approach is a good way to deal
>>> with cases like this, I agree.  But to me it falls in a different
>>> category than error handling.  An error condition is something
>>> like an allocation (eg malloc()) failure, or a broken pipe.  Not
>>> finding something in a table means whoever was responsible for
>>> adding things to the table didn't add it - in other words it was
>>> entirely a consequence of events that are under the program's
>>> control.  The issue is how to deal with an unpredictable, and
>>> usually very rare, event that is not something the program has
>>> control over.  Typically these kinds of situations need a
>>> different sort of mechanism than loading up the return value
>>> of every function call.
>>
>> I agree that table lookup is not a compelling example, but I don't
>> agree that anything different is needed.  I intended the idea to
>> be generalised to have return values used for all error
>> conditions.  There may be situations where something different is
>> /better/, but I remain sceptical of even this claim.  Note I am
>> not talking about situations where asynchronous errors need to be
>> reported.
>>
>> C uses null pointers for this all the time.  Returning a null
>> pointer is just a way to signal an error through the return value.
>> In a language like C without null pointers I would want malloc to
>> return a 'Maybe' type.  And for when more information is wanted
>> (like your example of a broken pipe) the type returned from write
>> should be a number of bytes or an error value of some sort.
>>
>> Obviously it's hard to recommend this in C since the language
>> can't really express the patterns needed to make it convenient,
>> but I think we are talking more generally here.
>
> I acknowledge your statement about not considering asynchronous
> errors.  My own comments are likewise.
>
> I agree that all unusual result circumstances (which might be
> labeled "error conditions", without meaning to prejudice the term)
> can be handled locally via either compound return values or multiple
> return values (eg, by using pointer parameters), or a combination of
> the two.  In languages that provide good support for compound return
> values, which I would say C does not, probably a single return value
> (which is perhaps a compound value) always suffices, but I haven't
> thought very carefully about that.  But I think we are in general
> agreement that it's always possible to handle error conditions via
> local return values.
>
> Where we might not agree is whether using direct return values is
> always a good choice.  In my view, sometimes it is, sometimes it
> isn't - it depends on the particular situation.  That is definitely
> the case for C, but I think also for other programming languages
> that have better support for multiple return values or compound
> return values, or both.  And certainly if needing other mechanisms
> is true for languages in general then we would expect it is true
> for C.
>
>> How would you prefer these sorts of thing to be signalled?
>
> Let me give just one example.
>
> I wrote some code not long ago to add a value to a set of values,
> where the set is represented using a recursive binary tree structure
> (like red-black trees, but a little different).  Usually we may
> expect that a request to add a value would offer a value not already
> included in the set, but certainly it can happen that there is a
> call to add a value that is already included, in which case the top
> level return value should be the original tree structure.
>
> I thought it would be easier to write a simple recursive routine
> that assumes the to-be-added value is not yet included in the set,
> and if that assumption is violated raise an exception that is caught
> at the outermost level and simply returns the original argument tree
> value.  Certainly the code could have been written to handle the
> already-present situation at each level of the recursion, but it was
> much easier and much cleaner to handle it by raising an exception.

This is one of those cases where I just have to take your word for it,
(and I am happy to do that).

> That example doesn't translate to C in any obvious way, because C
> code to add items to a set would very likely be written rather
> differently.  I think it's harder to make use of non-local returns
> in C than in many other languages, because C requires more thought
> and discipline to set that up.  At the same time, when I look at C
> code that tries to handle all exceptional conditions locally, I
> can't help but think the code would be simplified overall if code
> for dealing with exceptional conditions could be removed in the code
> generally and instead used a different mechanism, in fewer places,
> and perhaps similar to raising an exception, to handle such cases.

You may be right, but I don't like exceptions.  More specifically, I
don't like exceptions that are not handled internally in some API or
other (so your tree example sounds like a sane use for them), but I
worry about their wider use.  You either have to wrap every call in a
"try" (in which case why not use the multiple value return method) or
the reasoning about correctness starts to get more and more
complicated.

>> BTW, have you come across Icon?
>
> I have, and it's a fascinating language.
>
>> It takes an intriguing approach
>> where every operation succeeds or fails as well as having a value.
>
> I have heard Icon characterized as saying "it tries to succeed",
> and I think that's a fair description.  The idea of integrating
> a backtracking engine into the language semantics deserves more
> attention than I think it has been given, which is unfortunate.
>
>> This idea is heavily built upon to produce a very comfortable
>> scripting language.
>
> My experience with Icon is purely academic, which is to say I have
> read about it but never used it.  I have more experience with
> Prolog, which is similar in some ways (notably backtracking) but of
> course very different in other ways.
>
> I'm intrigued by your comment that Icon makes a good scripting
> language.  Maybe I don't understand what you think makes a good
> scripting language, or even what "scripting language" means.
> That subject is not topical in comp.lang.c, but I also peruse
> comp.lang.misc and comp.programming if you wouldn't mind
> continuing there.  Also you are welcome to send me an email
> at this address if you would rather do that.

It was a careless remark with no significant meaning.

-- 
Ben.

#176778 — Re: Libraries using longjmp for error handling

From	Tim Rentsch <tr.17687@z991.linuxsc.com>
Date	2023-09-29 21:56 -0700
Subject	Re: Libraries using longjmp for error handling
Message-ID	<86cyy0fee5.fsf@linuxsc.com>
In reply to	#176759

Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>
>> [..how should errors be handled..]
>>
>>> How would you prefer these sorts of thing to be signalled?
>>
>> Let me give just one example.
>>
>> I wrote some code not long ago to add a value to a set of values,
>> where the set is represented using a recursive binary tree structure
>> (like red-black trees, but a little different).  Usually we may
>> expect that a request to add a value would offer a value not already
>> included in the set, but certainly it can happen that there is a
>> call to add a value that is already included, in which case the top
>> level return value should be the original tree structure.
>>
>> I thought it would be easier to write a simple recursive routine
>> that assumes the to-be-added value is not yet included in the set,
>> and if that assumption is violated raise an exception that is caught
>> at the outermost level and simply returns the original argument tree
>> value.  Certainly the code could have been written to handle the
>> already-present situation at each level of the recursion, but it was
>> much easier and much cleaner to handle it by raising an exception.
>
> This is one of those cases where I just have to take your word for it,
> (and I am happy to do that).

Responding here to just this one part.

Here is a simpler version of the code, for an ordinary
binary tree, and without any rebalancing:

  module TreeSet = struct
    type 'a treeset = Empty | Node of 'a treeset * 'a * 'a treeset
    exception Present

    let add tree element =
      let rec add t e =
        match t with
        | Empty -> Node( Empty, e, Empty )
        | Node( a, v, b ) ->
            if  e < v  then  Node( add a e, v, b       )  else
            if  e > v  then  Node( a,       v, add b e )  else
            raise Present
      in
      try  add tree element  with  Present -> tree

  end

Transliterating that into C, the central functions might look like
this (I have left out the type definition and the constructor
function used to build a new node value):

  #include <setjmp.h>

  static Tree add( jmp_buf, Tree, double );
  static Tree raise_present( jmp_buf );
  static Tree new_node( Tree, double, Tree );

  Tree
  add_element( Tree tree, double new_element ){
    jmp_buf jb;
    if(  setjmp( jb )  )  return  tree;

    return  add( jb, tree, new_element );
  }

  Tree
  add( jmp_buf jb, Tree t, double d ){
    return
        ! t       ?  new_node( 0, d, 0 )
    :   d < t->v  ?  new_node( add( jb, t->a, d ), t->v,          t->b      )
    :   d > t->v  ?  new_node(          t->a,      t->v, add( jb, t->b, d ) )
    :   /*********/  raise_present( jb );
  }

  Tree
  raise_present( jmp_buf jb ){
    longjmp( jb, 1 );
  }

Of course it's unlikely that sets would be implemented this way
in C, but that's not the point;  the point is that exceptions
can result in code that is cleaner and easier to understand
than an alternate approach where return values would have to
be checked at every level.

(Incidentally, I want to acknowledge your comment about taking my
word for the result.  I still thought it would be good to give
a concrete example, even if it is a very simplified one.)

#176842 — Re: Libraries using longjmp for error handling

From	Ben Bacarisse <ben.usenet@bsb.me.uk>
Date	2023-10-01 00:06 +0100
Subject	Re: Libraries using longjmp for error handling
Message-ID	<87ttrbclcj.fsf@bsb.me.uk>
In reply to	#176778

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>
>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>
>>> [..how should errors be handled..]
>>>
>>>> How would you prefer these sorts of thing to be signalled?
>>>
>>> Let me give just one example.
>>>
>>> I wrote some code not long ago to add a value to a set of values,
>>> where the set is represented using a recursive binary tree structure
>>> (like red-black trees, but a little different).  Usually we may
>>> expect that a request to add a value would offer a value not already
>>> included in the set, but certainly it can happen that there is a
>>> call to add a value that is already included, in which case the top
>>> level return value should be the original tree structure.
>>>
>>> I thought it would be easier to write a simple recursive routine
>>> that assumes the to-be-added value is not yet included in the set,
>>> and if that assumption is violated raise an exception that is caught
>>> at the outermost level and simply returns the original argument tree
>>> value.  Certainly the code could have been written to handle the
>>> already-present situation at each level of the recursion, but it was
>>> much easier and much cleaner to handle it by raising an exception.
>>
>> This is one of those cases where I just have to take your word for it,
>> (and I am happy to do that).
>
> Responding here to just this one part.
>
> Here is a simpler version of the code, for an ordinary
> binary tree, and without any rebalancing:
>
>   module TreeSet = struct
>     type 'a treeset = Empty | Node of 'a treeset * 'a * 'a treeset
>     exception Present
>
>     let add tree element =
>       let rec add t e =
>         match t with
>         | Empty -> Node( Empty, e, Empty )
>         | Node( a, v, b ) ->
>             if  e < v  then  Node( add a e, v, b       )  else
>             if  e > v  then  Node( a,       v, add b e )  else
>             raise Present
>       in
>       try  add tree element  with  Present -> tree
>
>   end

I'm not getting it.  I'd write 'else t' rather than 'else raise Present'
and I could then do away with the wrapping 'add' function.

I worry we are getting away from C, but the point is likely to be much
simpler to make in OCaml (or Caml).

-- 
Ben.

#176850 — Re: Libraries using longjmp for error handling

From	Tim Rentsch <tr.17687@z991.linuxsc.com>
Date	2023-09-30 20:35 -0700
Subject	Re: Libraries using longjmp for error handling
Message-ID	<86r0mfdng1.fsf@linuxsc.com>
In reply to	#176842

Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>
>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>
>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>
>>>> [..how should errors be handled..]
>>>>
>>>>> How would you prefer these sorts of thing to be signalled?
>>>>
>>>> Let me give just one example.
>>>>
>>>> I wrote some code not long ago to add a value to a set of values,
>>>> where the set is represented using a recursive binary tree structure
>>>> (like red-black trees, but a little different).  Usually we may
>>>> expect that a request to add a value would offer a value not already
>>>> included in the set, but certainly it can happen that there is a
>>>> call to add a value that is already included, in which case the top
>>>> level return value should be the original tree structure.
>>>>
>>>> I thought it would be easier to write a simple recursive routine
>>>> that assumes the to-be-added value is not yet included in the set,
>>>> and if that assumption is violated raise an exception that is caught
>>>> at the outermost level and simply returns the original argument tree
>>>> value.  Certainly the code could have been written to handle the
>>>> already-present situation at each level of the recursion, but it was
>>>> much easier and much cleaner to handle it by raising an exception.
>>>
>>> This is one of those cases where I just have to take your word for it,
>>> (and I am happy to do that).
>>
>> Responding here to just this one part.
>>
>> Here is a simpler version of the code, for an ordinary
>> binary tree, and without any rebalancing:
>>
>>   module TreeSet = struct
>>     type 'a treeset = Empty | Node of 'a treeset * 'a * 'a treeset
>>     exception Present
>>
>>     let add tree element =
>>       let rec add t e =
>>         match t with
>>         | Empty -> Node( Empty, e, Empty )
>>         | Node( a, v, b ) ->
>>             if  e < v  then  Node( add a e, v, b       )  else
>>             if  e > v  then  Node( a,       v, add b e )  else
>>             raise Present
>>       in
>>       try  add tree element  with  Present -> tree
>>
>>   end
>
> I'm not getting it.  I'd write 'else t' rather than 'else raise
> Present' and I could then do away with the wrapping 'add' function.

Doing that would result in returning a valid tree, but it would also
needlessly replicate all nodes starting from the point where the
match was found and going up to the root.  The consequence is more
cycles used and more demand on the memory allocator.  The technique
of raising an exception avoids those shortcomings (and also it
preserves tree identity, which can be important in some situations).

> I worry we are getting away from C, but the point is likely to be
> much simpler to make in OCaml (or Caml).

I have another example.  The example doesn't include any C code, but
I will talk about some concerns that pertain to C.

Suppose we are writing a compiler for a large programming language.
The compiler should make use of available resources, especially
memory resources, when it can, but also should be able to compile
large programs even when the amount of memory available is much more
limited.  How much memory is available is not known in advance;  we
find out when a call to malloc() or realloc() returns NULL.  I note
in passing that the PL/I(F) compiler from IBM could translate full
PL/I even if limited to only 44 K bytes of user memory;  to do that
it needed 110 physical passes over source text or structures created
and stored in disk files.

The idea in essence is to have two compilers in one:  one taking a
conventional approach using regular dynamic memory allocation (in
other words malloc() etc), and the other using an IBM-style multiple
passes scheme that stores intermediate data in disk files.  To
compile a program we start the first compiler under an umbrella of
setjmp(), which going forward assumes all memory allocations will
succeed.  To allocate memory we call wrapper functions for malloc()
or realloc(), which will do a longjmp() if an allocation fails.
Upon receipt of the longjmp() "exception" the outer setjmp() call
starts over using the second approach, which is of course much
slower but also much less demanding of main memory.

To make this work we might have to track malloc()'s and free()'s so
that any not-yet-released memory can be reclaimed before beginning
the alternate system, but I think that is straightforward and not in
need of any further explanation.
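
The control flow described above could be sketched roughly as follows.
Everything in it is a hypothetical stand-in: xmalloc, fast_compile,
slow_compile, release_all and the alloc_budget counter (which only
simulates running out of memory) are invented for illustration, and the
allocation tracking is a plain linked list.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

static jmp_buf out_of_memory;      /* set by compile(), used by xmalloc() */
static long    alloc_budget = -1;  /* test hook: < 0 means "no limit"     */

/* Header placed in front of every block so it can be reclaimed later.
   The max_align_t member (C11) keeps the user data suitably aligned. */
union block { union block *next; max_align_t align; };
static union block *live_blocks;

/* Wrapper for malloc() that never returns NULL: it longjmps instead. */
static void *
xmalloc(size_t n)
{
    union block *b = NULL;
    if (alloc_budget != 0)
        b = malloc(sizeof *b + n);
    if (b == NULL)
        longjmp(out_of_memory, 1);
    if (alloc_budget > 0)
        alloc_budget--;
    b->next = live_blocks;
    live_blocks = b;
    return b + 1;
}

/* Free every block handed out by xmalloc() so far. */
static void
release_all(void)
{
    while (live_blocks != NULL) {
        union block *next = live_blocks->next;
        free(live_blocks);
        live_blocks = next;
    }
}

/* Stand-in for the conventional compiler: no allocation checks at all. */
static int
fast_compile(int units)
{
    for (int i = 0; i < units; i++) {
        int *node = xmalloc(sizeof *node);
        *node = i;
    }
    return 1;   /* 1 = compiled by the fast path */
}

/* Stand-in for the multi-pass, disk-based compiler: no heap use here. */
static int
slow_compile(int units)
{
    (void)units;
    return 2;   /* 2 = compiled by the slow path */
}

int
compile(int units)
{
    assert(units >= 0);
    if (setjmp(out_of_memory) != 0) {
        release_all();              /* reclaim what the fast path left */
        return slow_compile(units);
    }
    int how = fast_compile(units);
    release_all();
    return how;
}
```

Note that compile() reads only units after the jump, and units is never
modified after setjmp(), so no volatile qualification is needed in this
particular sketch.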

I admit this example isn't very compelling given that even small
computers today have multiple gigabytes of RAM.  But I think it
isn't hard to imagine an analogous problem in a more limited
environment, for example running in a VM on a co-located server.
(I have run out of "RAM" when running a program on my colo server
here.  It was quite confusing when it first happened.)

#176863 — Re: Libraries using longjmp for error handling

From	Anton Shepelev <anton.txt@gmail.moc>
Date	2023-10-01 14:44 +0300
Subject	Re: Libraries using longjmp for error handling
Message-ID	<20231001144405.3f6863a9af57ee632caaf74d@gmail.moc>
In reply to	#176850

Tim Rentsch:

> The idea in essence is to have two compilers in one:  one
> taking a conventional approach using regular dynamic
> memory allocation (in other words malloc() etc), and the
> other using an IBM-style multiple passes scheme that
> stores intermediate data in disk files.  To compile a
> program we start the first compiler under an umbrella of
> setjmp(), which going forward assumes all memory
> allocations will succeed.  To allocate memory we call
> wrapper functions for malloc() or realloc(), which will do
> a longjmp() if an allocation fails.  Upon receipt of the
> longjmp() "exception" the outer setjmp() call starts over
> using the second approach, which is of course much slower
> but also much less demanding of main memory.
> [...]
> I admit this example isn't very compelling given that even
> small computers today have multiple gigabytes of RAM.

No, the availability of lots of RAM in modern computers is
merely incidental to your example and in no way affects it.

How about this example: a recursive-descent parser for a
deeply nested grammar.  How does one handle and report an
error encountered ten invocations underground?  Is it a
simpler example of the same problem?
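
For concreteness, a toy version of that parser in C might look like the
sketch below.  It is full of assumptions: the grammar (single digits,
'+', and parentheses) and the names struct parser, parse_atom,
parse_sum, fail and evaluate are all made up here.  An error found at
any nesting depth records a message and jumps straight back to the
entry point, leaving the position of the offending character in pos.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

struct parser {
    const char *text;     /* input being parsed                 */
    size_t      pos;      /* index of the next unread character */
    const char *error;    /* message set by fail(), else NULL   */
    jmp_buf     on_error; /* where fail() jumps back to         */
};

static long parse_sum(struct parser *p);

/* Report an error from any depth; no intermediate caller checks for it. */
static void
fail(struct parser *p, const char *message)
{
    p->error = message;
    longjmp(p->on_error, 1);
}

/* atom := digit | '(' sum ')' */
static long
parse_atom(struct parser *p)
{
    char c = p->text[p->pos];
    if (c >= '0' && c <= '9') {
        p->pos++;
        return c - '0';
    }
    if (c == '(') {
        p->pos++;
        long v = parse_sum(p);
        if (p->text[p->pos] != ')')
            fail(p, "expected ')'");
        p->pos++;
        return v;
    }
    fail(p, "expected digit or '('");
    return 0;   /* not reached */
}

/* sum := atom { '+' atom } */
static long
parse_sum(struct parser *p)
{
    long total = parse_atom(p);
    while (p->text[p->pos] == '+') {
        p->pos++;
        total += parse_atom(p);
    }
    return total;
}

/* Entry point: returns 1 and stores the value, or 0 with p->error set. */
int
evaluate(struct parser *p, const char *text, long *value)
{
    assert(p != NULL && text != NULL && value != NULL);
    p->text = text;
    p->pos = 0;
    p->error = NULL;
    if (setjmp(p->on_error) != 0)
        return 0;
    long v = parse_sum(p);
    if (p->text[p->pos] != '\0')
        fail(p, "trailing characters");
    *value = v;
    return 1;
}
```

The parser state lives in a struct reached through a pointer, not in
automatic variables of evaluate(), so its contents are well defined
after the longjmp().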

-- 
()  ascii ribbon campaign -- against html e-mail
/\  www.asciiribbon.org   -- against proprietary attachments

#176866 — Re: Libraries using longjmp for error handling

From	Ben Bacarisse <ben.usenet@bsb.me.uk>
Date	2023-10-01 15:45 +0100
Subject	Re: Libraries using longjmp for error handling
Message-ID	<87il7qcsfz.fsf@bsb.me.uk>
In reply to	#176850

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>
>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>
>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>>
>>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>>
>>>>> [..how should errors be handled..]
>>>>>
>>>>>> How would you prefer these sorts of thing to be signalled?
>>>>>
>>>>> Let me give just one example.
>>>>>
>>>>> I wrote some code not long ago to add a value to a set of values,
>>>>> where the set is represented using a recursive binary tree structure
>>>>> (like red-black trees, but a little different).  Usually we may
>>>>> expect that a request to add a value would offer a value not already
>>>>> included in the set, but certainly it can happen that there is a
>>>>> call to add a value that is already included, in which case the top
>>>>> level return value should be the original tree structure.
>>>>>
>>>>> I thought it would be easier to write a simple recursive routine
>>>>> that assumes the to-be-added value is not yet included in the set,
>>>>> and if that assumption is violated raise an exception that is caught
>>>>> at the outermost level and simply returns the original argument tree
>>>>> value.  Certainly the code could have been written to handle the
>>>>> already-present situation at each level of the recursion, but it was
>>>>> much easier and much cleaner to handle it by raising an exception.
>>>>
>>>> This is one of those cases where I just have to take your word for it,
>>>> (and I am happy to do that).
>>>
>>> Responding here to just this one part.
>>>
>>> Here is a simpler version of the code, for an ordinary
>>> binary tree, and without any rebalancing:
>>>
>>>   module TreeSet = struct
>>>     type 'a treeset = Empty | Node of 'a treeset * 'a * 'a treeset
>>>     exception Present
>>>
>>>     let add tree element =
>>>       let rec add t e =
>>>         match t with
>>>         | Empty -> Node( Empty, e, Empty )
>>>         | Node( a, v, b ) ->
>>>             if  e < v  then  Node( add a e, v, b       )  else
>>>             if  e > v  then  Node( a,       v, add b e )  else
>>>             raise Present
>>>       in
>>>       try  add tree element  with  Present -> tree
>>>
>>>   end
>>
>> I'm not getting it.  I'd write 'else t' rather than 'else raise
>> Present' and I could then do away with the wrapping 'add' function.
>
> Doing that would result in returning a valid tree, but it would also
> needlessly replicate all nodes starting from the point where the
> match was found and going up to the root.  The consequence is more
> cycles used and more demand on the memory allocator.

How can OCaml avoid making the allocations until it knows the exception
won't be raised?  Are they not all still being done?

> The technique
> of raising an exception avoids those shortcomings (and also it
> preserves tree identity, which can be important in some situations).

I agree about the identity.  I am not used to relying on the identity in
situations like this (probably because of Haskell!) but I might find
some time to work out how to preserve identity without an exception.  I
fear it will be messier than your simple solution.
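
One way it might be done, sketched in C to stay on topic: have the
recursive function return the very same pointer it was given when
nothing changed, and let each level compare the returned subtree with
the original before building a node.  The struct node layout, new_node
and add_checked are assumptions made for this sketch, and allocation
failure simply aborts.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed representation: a Tree is a pointer to an immutable node. */
typedef struct node *Tree;
struct node { Tree a; double v; Tree b; };

/* Assumed constructor: allocates a fresh node, aborting on failure. */
static Tree
new_node(Tree a, double v, Tree b)
{
    Tree t = malloc(sizeof *t);
    if (t == NULL)
        abort();
    t->a = a;
    t->v = v;
    t->b = b;
    return t;
}

/* Returns t itself (the same pointer) when d is already in the set;
   otherwise returns a new tree sharing the untouched subtrees. */
Tree
add_checked(Tree t, double d)
{
    if (t == NULL)
        return new_node(NULL, d, NULL);
    if (d < t->v) {
        Tree a = add_checked(t->a, d);
        assert(a != NULL);              /* adding never yields an empty tree */
        return a == t->a ? t : new_node(a, t->v, t->b);
    }
    if (d > t->v) {
        Tree b = add_checked(t->b, d);
        assert(b != NULL);
        return b == t->b ? t : new_node(t->a, t->v, b);
    }
    return t;   /* equal: nothing to add */
}
```

This preserves identity and allocates nothing when the element is
present, at the price of one extra comparison per level on the way
back up.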

>> I worry we are getting away from C, but the point is likely to be
>> much simpler to make in OCaml (or Caml).
>
> I have another example.  The example doesn't include any C code, but
> I will talk about some concerns that pertain to C.
>
> Suppose we are writing a compiler for a large programming language.
> The compiler should make use of available resources, especially
> memory resources, when it can, but also should be able to compile
> large programs even when the amount of memory available is much more
> limited.  How much memory is available is not known in advance;  we
> find out when a call to malloc() or realloc() returns NULL.  I note
> in passing that the PL/I(F) compiler from IBM could translate full
> PL/I even if limited to only 44 K bytes of user memory;  to do that
> it needed 110 physical passes over source text or structures created
> and stored in disk files.
>
> The idea in essence is to have two compilers in one:  one taking a
> conventional approach using regular dynamic memory allocation (in
> other words malloc() etc), and the other using an IBM-style multiple
> passes scheme that stores intermediate data in disk files.  To
> compile a program we start the first compiler under an umbrella of
> setjmp(), which going forward assumes all memory allocations will
> succeed.  To allocate memory we call wrapper functions for malloc()
> or realloc(), which will do a longjmp() if an allocation fails.
> Upon receipt of the longjmp() "exception" the outer setjmp() call
> starts over using the second approach, which is of course much
> slower but also much less demanding of main memory.
>
> To make this work we might have to track malloc()'s and free()'s so
> that any not-yet-released memory can be reclaimed before beginning
> the alternate system, but I think that is straightforward and not in
> need of any further explanation.

I am not convinced it's neater to use longjmp here.  I prefer to write
code where allocation failures just propagate up the call stack.  That
would give me a top-level failure return from the try_fast_compile()
function call.
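
A sketch of that shape, for comparison with the longjmp version: each
level checks its callee and passes failure upward, so the decision to
fall back is made from an ordinary return value.  Only the name
try_fast_compile comes from the paragraph above; build_node,
build_list, free_list, slow_compile, compile and the fail_after test
hook are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static long fail_after = -1;   /* test hook: < 0 means "never fail" */

struct item { struct item *next; int value; };

/* Lowest level: returns NULL on (real or simulated) allocation failure. */
static struct item *
build_node(int value, struct item *next)
{
    if (fail_after == 0)
        return NULL;               /* simulated malloc() failure */
    if (fail_after > 0)
        fail_after--;
    struct item *it = malloc(sizeof *it);
    if (it == NULL)
        return NULL;
    it->value = value;
    it->next = next;
    return it;
}

static void
free_list(struct item *list)
{
    while (list != NULL) {
        struct item *next = list->next;
        free(list);
        list = next;
    }
}

/* Middle level: cleans up its own partial work, then reports failure. */
static bool
build_list(int n, struct item **out)
{
    assert(out != NULL);
    struct item *list = NULL;
    for (int i = 0; i < n; i++) {
        struct item *it = build_node(i, list);
        if (it == NULL) {
            free_list(list);
            return false;
        }
        list = it;
    }
    *out = list;
    return true;
}

/* Top level of the fast path: failure is just a return value. */
static bool
try_fast_compile(int units)
{
    struct item *list;
    if (!build_list(units, &list))
        return false;
    free_list(list);
    return true;
}

/* Stand-in for the slow, low-memory path. */
static bool
slow_compile(int units)
{
    (void)units;
    return true;
}

/* Returns 1 if the fast path worked, 2 if the fallback was used,
   and 0 if both failed. */
int
compile(int units)
{
    if (try_fast_compile(units))
        return 1;
    return slow_compile(units) ? 2 : 0;
}
```

Cleanup happens where the partial work was created, so no global
record of outstanding allocations is needed in this version.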

> I admit this example isn't very compelling given that even small
> computers today have multiple gigabytes of RAM.  But I think it
> isn't hard to imagine an analogous problem in a more limited
> environment, for example running in a VM on a co-located server.
> (I have run out of "RAM" when running a program on my colo server
> here.  It was quite confusing when it first happened.)

-- 
Ben.

#176876 — Re: Libraries using longjmp for error handling

From	Tim Rentsch <tr.17687@z991.linuxsc.com>
Date	2023-10-01 09:28 -0700
Subject	Re: Libraries using longjmp for error handling
Message-ID	<86ediee28o.fsf@linuxsc.com>
In reply to	#176866

Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>
>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>
>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>
>>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>>>
>>>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>>>
>>>>>> [..how should errors be handled..]
>>>>>>
>>>>>>> How would you prefer these sorts of thing to be signalled?
>>>>>>
>>>>>> Let me give just one example.
>>>>>>
>>>>>> I wrote some code not long ago to add a value to a set of values,
>>>>>> where the set is represented using a recursive binary tree structure
>>>>>> (like red-black trees, but a little different).  Usually we may
>>>>>> expect that a request to add a value would offer a value not already
>>>>>> included in the set, but certainly it can happen that there is a
>>>>>> call to add a value that is already included, in which case the top
>>>>>> level return value should be the original tree structure.
>>>>>>
>>>>>> I thought it would be easier to write a simple recursive routine
>>>>>> that assumes the to-be-added value is not yet included in the set,
>>>>>> and if that assumption is violated raise an exception that is caught
>>>>>> at the outermost level and simply returns the original argument tree
>>>>>> value.  Certainly the code could have been written to handle the
>>>>>> already-present situation at each level of the recursion, but it was
>>>>>> much easier and much cleaner to handle it by raising an exception.
>>>>>
>>>>> This is one of those cases where I just have to take your word for it,
>>>>> (and I am happy to do that).
>>>>
>>>> Responding here to just this one part.
>>>>
>>>> Here is a simpler version of the code, for an ordinary
>>>> binary tree, and without any rebalancing:
>>>>
>>>>   module TreeSet = struct
>>>>     type 'a treeset = Empty | Node of 'a treeset * 'a * 'a treeset
>>>>     exception Present
>>>>
>>>>     let add tree element =
>>>>       let rec add t e =
>>>>         match t with
>>>>         | Empty -> Node( Empty, e, Empty )
>>>>         | Node( a, v, b ) ->
>>>>             if  e < v  then  Node( add a e, v, b       )  else
>>>>             if  e > v  then  Node( a,       v, add b e )  else
>>>>             raise Present
>>>>       in
>>>>       try  add tree element  with  Present -> tree
>>>>
>>>>   end
>>>
>>> I'm not getting it.  I'd write 'else t' rather than 'else raise
>>> Present' and I could then do away with the wrapping 'add' function.
>>
>> Doing that would result in returning a valid tree, but it would also
>> needlessly replicate all nodes starting from the point where the
>> match was found and going up to the root.  The consequence is more
>> cycles used and more demand on the memory allocator.
>
> How can OCaml avoid making the allocations until it knows the exception
> won't be raised?  Are they not all still being done?

No, no allocations are done if the to-be-added element is found
(and so an exception is raised).  Consider one of the lines where a
Node is being constructed (which is synonymous with an allocation
occurring):

     if  e < v  then  Node( add a e, v, b       )  else

The call to the Node() constructor doesn't take place until the
recursive call to 'add' returns.  But if an equal value is found,
then an exception is raised, and the call to 'add' does not /ever/
return.  No return from 'add' means no Node() is constructed.

To say this another way, looking for a point where an insertion
should take place, or where there is an equal value, happens going
/down/ the call chain.  Node() allocations, however, happen only
when we are coming back /up/ the call chain.  Raising an exception
"short circuits" all of those pending returns:  they never happen,
and so no Node()s are constructed.
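
For readers who think in C rather than OCaml, the same effect can
be sketched with setjmp()/longjmp().  This is only an illustration,
not code from the program described above:  the names node, mk,
add_rec, tree_add and the allocs counter are all made up for the
example, elements are assumed to be plain ints, and nodes are never
freed (OCaml has a garbage collector, this sketch does not).

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical names throughout; nothing here comes from the thread. */
typedef struct node {
    const struct node *left;
    int value;
    const struct node *right;
} node;

jmp_buf present;          /* plays the role of "exception Present" */
unsigned long allocs;     /* counts how many nodes have been constructed */

/* Plays the role of the Node() constructor: one allocation per call. */
const node *mk(const node *l, int v, const node *r)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();          /* running out of memory is out of scope here */
    n->left = l;
    n->value = v;
    n->right = r;
    allocs++;
    return n;
}

/* Assumes e is not yet in the tree.  The recursive call is an argument
   of mk(), so it is evaluated going /down/; mk() itself runs only when
   that call returns, that is, coming back /up/. */
const node *add_rec(const node *t, int e)
{
    if (t == NULL)
        return mk(NULL, e, NULL);
    if (e < t->value)
        return mk(add_rec(t->left, e), t->value, t->right);
    if (e > t->value)
        return mk(t->left, t->value, add_rec(t->right, e));
    assert(e == t->value);
    longjmp(present, 1);  /* abandon every pending return, and every mk() */
}

/* Outermost level: "try add tree element with Present -> tree". */
const node *tree_add(const node *tree, int e)
{
    if (setjmp(present) != 0)
        return tree;      /* same pointer back, so identity is preserved */
    return add_rec(tree, e);
}
```

Calling tree_add() with a value that is already present should give
back the very same pointer and leave allocs unchanged, whereas adding
a new value allocates one node per level on the path to the insertion
point.  The single global jmp_buf makes the sketch non-reentrant;
real code would want to pass the buffer down or keep it per thread.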


>> The technique
>> of raising an exception avoids those shortcomings (and also it
>> preserves tree identity, which can be important in some situations).
>
> I agree about the identity.  I am not used to relying on the identity in
> situations like this (probably because of Haskell!) but I might find
> some time to work out how to preserve identity without an exception.  I
> fear it will be messier than your simple solution.

SPOILER ALERT - at the end of this posting I am including code
for a version of add (called add') that does not use exceptions
and preserves identity when the "new" element is already present.

>>> I worry we are getting away from C, but the point is likely to be
>>> much simpler to make in OCaml (or Caml).
>>
>> I have another example.  The example doesn't include any C code, but
>> I will talk about some concerns that pertain to C.
>>
>> Suppose we are writing a compiler for a large programming language.
>> The compiler should make use of available resources, especially
>> memory resources, when it can, but also should be able to compile
>> large programs even when the amount of memory available is much more
>> limited.  How much memory is available is not known in advance;  we
>> find out when a call to malloc() or realloc() returns NULL.  I note
>> in passing that the PL/I(F) compiler from IBM could translate full
>> PL/I even if limited to only 44 K bytes of user memory;  to do that
>> it needed 110 physical passes over source text or structures created
>> and stored in disk files.
>>
>> The idea in essence is to have two compilers in one:  one taking a
>> conventional approach using regular dynamic memory allocation (in
>> other words malloc() etc), and the other using an IBM-style multiple
>> passes scheme that stores intermediate data in disk files.  To
>> compile a program we start the first compiler under an umbrella of
>> setjmp(), which going forward assumes all memory allocations will
>> succeed.  To allocate memory we call wrapper functions for malloc()
>> or realloc(), which will do a longjmp() if an allocation fails.
>> Upon receipt of the longjmp() "exception" the outer setjmp() call
>> starts over using the second approach, which is of course much
>> slower but also much less demanding of main memory.
>>
>> To make this work we might have to track malloc()'s and free()'s so
>> that any not-yet-released memory can be reclaimed before beginning
>> the alternate system, but I think that is straightforward and not in
>> need of any further explanation.
>
> I am not convinced it's neater to use longjmp here.  I prefer to write
> code where allocation failures just propagate up the call stack.  That
> would give me a top-level failure return from the try_fast_compile()
> function call.

There was a time when I would normally handle unusual conditions
locally, and pooh-poohed using exceptions (and setjmp()/longjmp()).
Working in OCaml has caused me (I speculate) to revise my thoughts
on the question.
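
The two-compilers-in-one scheme quoted above might look roughly
like this in C.  It is a sketch under stated assumptions, not a
real compiler:  xmalloc, reclaim_all, fast_compile, slow_compile,
compile and the block_limit knob are hypothetical names, the two
"compilers" are stand-ins that do no real work, and block_limit
exists only so that an allocation failure can be simulated.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* All names are hypothetical; none of them come from the thread. */
enum { MAX_TRACKED = 1024 };
enum { FAST_PATH = 1, SLOW_PATH = 2 };

jmp_buf out_of_memory;             /* target of the longjmp() "exception" */
void *tracked[MAX_TRACKED];        /* blocks handed out and not yet reclaimed */
size_t n_tracked;
size_t block_limit = MAX_TRACKED;  /* lower this to simulate a small machine */

/* malloc() wrapper: callers may assume the result is never NULL. */
void *xmalloc(size_t size)
{
    void *p = NULL;
    assert(size > 0);              /* malloc(0) may legitimately return NULL */
    if (n_tracked < block_limit && n_tracked < MAX_TRACKED)
        p = malloc(size);
    if (p == NULL)
        longjmp(out_of_memory, 1); /* give up on the fast compiler */
    tracked[n_tracked++] = p;
    return p;
}

/* Release every block that is still being tracked. */
void reclaim_all(void)
{
    while (n_tracked > 0)
        free(tracked[--n_tracked]);
}

/* Stand-in for the conventional compiler: allocates without checking. */
void fast_compile(size_t units)
{
    for (size_t i = 0; i < units; i++) {
        char *buf = xmalloc(4096);
        buf[0] = '\0';             /* pretend to use the memory */
    }
}

/* Stand-in for the multi-pass compiler that keeps its data on disk. */
void slow_compile(size_t units)
{
    (void)units;
}

/* The umbrella: try the fast compiler, start over with the slow one. */
int compile(size_t units)
{
    if (setjmp(out_of_memory) != 0) {
        reclaim_all();             /* drop what the failed attempt held */
        slow_compile(units);
        return SLOW_PATH;
    }
    fast_compile(units);
    reclaim_all();
    return FAST_PATH;
}
```

With the default limit, compile(10) should take the fast path;
after setting block_limit to 3 the same call should fall back to
the slow path, with nothing left in the tracking table either way.
A fuller version would also need a free() wrapper that removes a
block from the table, which is omitted here.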

SPOILER ALERT - continue scrolling only when ready to see the code
for an exception-free version of the add function...








































Here it is...

  let add' tree element =
    let rec add t e =
      match t with
      | Empty -> Some( Node( Empty, e, Empty ) )
      | Node( a, v, b ) ->
          if  e < v  then (
            match  add a e  with
            | None -> None
            | Some a' -> Some( Node( a', v, b ) )
          ) else if  e > v  then (
            match  add b e  with
            | None -> None
            | Some b' -> Some( Node( a, v, b' ) )
          ) else None
    in
    match add tree element with
    | Some t -> t
    | None   -> tree


The original version is 10 lines, the revised version 18 lines.

That ratio is higher than what I would expect for an analogous
change to the full version (which is much longer than the simple
non-rebalancing version, because it has more cases to deal with,
and needs to keep the tree balanced), but not a lot higher - maybe
only 40 or 50% longer instead of 80% longer.


#176889 — Re: Libraries using longjmp for error handling

From	Ben Bacarisse <ben.usenet@bsb.me.uk>
Date	2023-10-01 20:49 +0100
Subject	Re: Libraries using longjmp for error handling
Message-ID	<874jjaced5.fsf@bsb.me.uk>
In reply to	#176876

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>
>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>
>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>>
>>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>>
>>>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>>>>
>>>>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>>>>
>>>>>>> [..how should errors be handled..]
>>>>>>>
>>>>>>>> How would you prefer these sorts of thing to be signalled?
>>>>>>>
>>>>>>> Let me give just one example.
>>>>>>>
>>>>>>> I wrote some code not long ago to add a value to a set of values,
>>>>>>> where the set is represented using a recursive binary tree structure
>>>>>>> (like red-black trees, but a little different).  Usually we may
>>>>>>> expect that a request to add a value would offer a value not already
>>>>>>> included in the set, but certainly it can happen that there is a
>>>>>>> call to add a value that is already included, in which case the top
>>>>>>> level return value should be the original tree structure.
>>>>>>>
>>>>>>> I thought it would be easier to write a simple recursive routine
>>>>>>> that assumes the to-be-added value is not yet included in the set,
>>>>>>> and if that assumption is violated raise an exception that is caught
>>>>>>> at the outermost level and simply returns the original argument tree
>>>>>>> value.  Certainly the code could have been written to handle the
>>>>>>> already-present situation at each level of the recursion, but it was
>>>>>>> much easier and much cleaner to handle it by raising an exception.
>>>>>>
>>>>>> This is one of those cases where I just have to take your word for it,
>>>>>> (and I am happy to do that).
>>>>>
>>>>> Responding here to just this one part.
>>>>>
>>>>> Here is a simpler version of the code, for an ordinary
>>>>> binary tree, and without any rebalancing:
>>>>>
>>>>>   module TreeSet = struct
>>>>>     type 'a treeset = Empty | Node of 'a treeset * 'a * 'a treeset
>>>>>     exception Present
>>>>>
>>>>>     let add tree element =
>>>>>       let rec add t e =
>>>>>         match t with
>>>>>         | Empty -> Node( Empty, e, Empty )
>>>>>         | Node( a, v, b ) ->
>>>>>             if  e < v  then  Node( add a e, v, b       )  else
>>>>>             if  e > v  then  Node( a,       v, add b e )  else
>>>>>             raise Present
>>>>>       in
>>>>>       try  add tree element  with  Present -> tree
>>>>>
>>>>>   end
>>>>
>>>> I'm not getting it.  I'd write 'else t' rather than 'else raise
>>>> Present' and I could then do away with the wrapping 'add' function.
>>>
>>> Doing that would result in returning a valid tree, but it would also
>>> needlessly replicate all nodes starting from the point where the
>>> match was found and going up to the root.  The consequence is more
>>> cycles used and more demand on the memory allocator.
>>
>> How can OCaml avoid making the allocations until it knows the exception
>> won't be raised?  Are they not all still being done?
>
> No, no allocations are done if the to-be-added element is found
> (and so an exception is raised).  Consider one of the lines where a
> Node is being constructed (which is synonymous with an allocation
> occurring):
>
>      if  e < v  then  Node( add a e, v, b       )  else
>
> The call to the Node() constructor doesn't take place until the
> recursive call to 'add' returns.  But if an equal value is found,
> then an exception is raised, and the call to 'add' does not /ever/
> return.  No return from 'add' means no Node() is constructed.

Ah, yes.  I was thinking in a lazy language (Haskell) whilst looking at
code in a strict one.

-- 
Ben.
