| Started by | Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> |
|---|---|
| First post | 2023-09-27 22:52 +0000 |
| Last post | 2023-09-30 09:37 +0100 |
| Articles | 20 on this page of 62 — 12 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Libraries using longjmp for error handling (was: Re: More on NNTP testing) Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> - 2023-09-27 22:52 +0000
Re: Libraries using longjmp for error handling (was: Re: More on NNTP testing) Johann 'Myrkraverk' Oskarsson <johann@myrkraverk.invalid> - 2023-09-28 08:19 +0800
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-28 02:18 +0100
Re: Libraries using longjmp for error handling Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> - 2023-09-28 22:30 +0000
Re: Libraries using longjmp for error handling scott@slp53.sl.home (Scott Lurndal) - 2023-09-28 22:33 +0000
Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-09-29 01:41 +0300
Re: Libraries using longjmp for error handling scott@slp53.sl.home (Scott Lurndal) - 2023-09-28 23:45 +0000
Re: Libraries using longjmp for error handling Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2023-09-28 17:35 -0700
Re: Libraries using longjmp for error handling David Brown <david.brown@hesbynett.no> - 2023-09-29 16:49 +0200
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-28 18:24 -0700
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-29 03:19 +0100
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-29 07:46 -0700
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-29 21:02 +0100
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-29 21:56 -0700
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-10-01 00:06 +0100
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-30 20:35 -0700
Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-10-01 14:44 +0300
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-10-01 15:45 +0100
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-10-01 09:28 -0700
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-10-01 20:49 +0100
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-10-02 02:31 -0700
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-10-02 23:55 +0100
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-10-03 05:01 -0700
Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-09-29 23:58 +0300
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-10-01 02:52 -0700
Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-10-01 14:33 +0300
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-28 01:48 +0100
Re: Libraries using longjmp for error handling (was: Re: More on NNTP testing) Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-28 05:11 +0000
Re: Libraries using longjmp for error handling (was: Re: More on NNTP testing) Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-28 06:26 +0000
Re: Libraries using longjmp for error handling (was: Re: More on NNTP testing) Anton Shepelev <anton.txt@gmail.moc> - 2023-09-28 16:02 +0300
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-28 14:44 +0100
Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-09-28 18:31 +0300
Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-28 16:43 +0000
Re: Libraries using longjmp for error handling scott@slp53.sl.home (Scott Lurndal) - 2023-09-28 17:00 +0000
Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-28 17:16 +0000
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-28 19:38 +0100
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-28 16:05 -0700
Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-29 00:26 +0000
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-28 23:20 -0700
Re: Libraries using longjmp for error handling Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2023-09-28 18:31 -0700
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-29 03:24 +0100
Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-29 03:12 +0000
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-28 22:45 -0700
Re: Libraries using longjmp for error handling Spiros Bousbouras <spibou@gmail.com> - 2023-09-29 12:02 +0000
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-29 08:10 -0700
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-29 17:03 +0100
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-29 10:15 -0700
Re: Libraries using longjmp for error handling gazelle@shell.xmission.com (Kenny McCormack) - 2023-09-29 17:17 +0000
Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-29 18:53 +0000
Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-09-29 18:11 +0300
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-29 10:20 -0700
Re: Libraries using longjmp for error handling Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2023-09-29 11:30 -0700
Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-29 18:31 +0000
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-28 19:27 +0100
Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-28 20:24 +0000
Re: Libraries using longjmp for error handling Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-28 22:23 +0100
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-28 15:43 -0700
Re: Libraries using longjmp for error handling Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-28 23:11 +0000
Re: Libraries using longjmp for error handling Anton Shepelev <anton.txt@gmail.moc> - 2023-09-29 17:23 +0300
Re: Libraries using longjmp for error handling Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-29 08:19 -0700
Re: Libraries using longjmp for error handling David Brown <david.brown@hesbynett.no> - 2023-09-28 21:57 +0200
Re: Libraries using longjmp for error handling Richard Kettlewell <invalid@invalid.invalid> - 2023-09-30 09:37 +0100
| From | Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> |
|---|---|
| Date | 2023-09-27 22:52 +0000 |
| Subject | Libraries using longjmp for error handling (was: Re: More on NNTP testing) |
| Message-ID | <pan$6f4ce$6edf891e$2b5c40c1$f8f989c1@invalid.invalid> |
Richard Kettlewell wrote:

> It’s more than 20 years since I last had to integrate a C library which
> reported errors via longjmp() and I’m still bitter about it.

I have never encountered a library which does that. Which library was
that?

> As a matter of API design, I’d rather C library communicated errors via
> return values (and pointer parameters, where more complex error
> information is required).

Personally, i think that, at least for a library, an error should _only_
be communicated by return value. If more complex information is required,
then the return value can be made more complex. I don't think i've ever
used a library that communicates information via a pointer parameter.

One thing i've experienced in multiple libraries is a system a bit like
what errno.h offers, but done via a pair of subroutines that retrieve and
assign to some hidden global variable. I don't like this for the same
reason i don't like subroutines that use errno (unless they're syscall
wrappers), but in at least two of the cases the library has also come with
a way to set a callback subroutine to automatically deal with errors
instead. This is nice, since it means that the code doesn't get all
obfuscated with error checking after every subroutine call, but it's
annoying that each library needs to come with its own unique subroutine
for this, and i do worry about it being overly general in treating all
errors lethally.

--
Blue-Maned_Hawk | shortens to Hawk | he/him/his/himself/Mr.
bluemanedhawk.github.io
Warning: Low flying owls. Lost chihuahua.
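For readers who have not met the errno-like design this post describes, here is a minimal sketch of what such a library interface might look like. It is not taken from any real library: every name (`lib_get_error`, `lib_set_error`, `lib_set_error_handler`, `lib_digit_value`, `demo_counting_handler`) is hypothetical, and the sketch assumes a single-threaded program.

```c
#include <stddef.h>

/* All names here (lib_*, demo_*) are hypothetical, for illustration only. */
enum { LIB_ERR_NONE = 0, LIB_ERR_NOT_DIGIT = 1 };

typedef void (*lib_error_handler)(int code, const char *what);

static int lib_last_error = LIB_ERR_NONE;     /* the hidden global */
static lib_error_handler lib_handler = NULL;  /* NULL: only record the code */

/* The pair of subroutines that read and assign the hidden variable. */
int lib_get_error(void) { return lib_last_error; }
void lib_set_error(int code) { lib_last_error = code; }

/* Install a callback; returns the previous one so callers can restore it. */
lib_error_handler lib_set_error_handler(lib_error_handler h)
{
    lib_error_handler old = lib_handler;
    lib_handler = h;
    return old;
}

/* Internal helper: record the error, then notify the callback if one is set. */
static void lib_fail(int code, const char *what)
{
    lib_last_error = code;
    if (lib_handler != NULL)
        lib_handler(code, what);
}

/* Example operation: value of a decimal digit, or -1 on error. */
int lib_digit_value(char c)
{
    if (c < '0' || c > '9') {
        lib_fail(LIB_ERR_NOT_DIGIT, "lib_digit_value: not a digit");
        return -1;
    }
    return c - '0';
}

/* A sample callback that only counts how often it is invoked. */
int demo_calls = 0;
void demo_counting_handler(int code, const char *what)
{
    (void)code;
    (void)what;
    demo_calls++;
}
```

With a handler installed, callers can skip the check after each call; whether the handler logs, counts, or terminates the program is the application's choice, which is where the worry about treating every error as lethal comes in.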
| From | Johann 'Myrkraverk' Oskarsson <johann@myrkraverk.invalid> |
|---|---|
| Date | 2023-09-28 08:19 +0800 |
| Message-ID | <AE3RM.402236$kCld.195109@fx08.ams4> |
| In reply to | #176581 |
On 9/28/2023 6:52 AM, Blue-Maned_Hawk wrote: > Richard Kettlewell wrote: > >> It’s more than 20 years since I last had to integrate a C library which >> reported errors via longjmp() and I’m still bitter about it. I am not surprised. > I have never encountered a library which does that. Which library was > that? > >> As a matter of API design, I’d rather C library communicated errors via >> return values (and pointer parameters, where more complex error >> information is required). > > Personally, i think that, at least for a library, an error should _only_ > be communicated by return value. If more complex information is required, > then the return value can be made more complex. I don't think i've ever > used a library that communicates information via a pointer parameter. > > One thing i've experienced in multiple libraries is a system a bit like > what errno.h offers, but done via a pair of subroutines that retrieve and > assign to some hidden global variable. I don't like this for the same > reason i don't like subroutines that use errno (unless they're syscall > wrappers), but in at least two of the cases the library has also come with > a way to set a callback subroutine to automatically deal with errors > instead. This is nice, since it means that the code doesn't get all > obfuscated with error checking after every subroutine call, but it's > annoying that each library needs to come with its own unique subroutine > for this, and i do worry about it being overly general in treating all > errors lethally. Error handling in libraries is a thorny subject, and I could go on a long rant on it. The short summary is simply this: error handling is rife with incompetence, and incompetent designs. Even in languages that ostensibly provide better mechanisms than C, the details are usually poorly documented and under-specified [1]. Rest assured that my error handling strategy is /sane/, though for now I'm not explaining my code, nor design. 
[1] Like in Common Lisp, when you're given a handler that supposedly can handle it in-situ, but then the handler doesn't get enough arguments to do much more than log a generic error, or -- this was a personal favorite -- you have to read the source code to figure out what the error handler gets. -- Johann | email: invalid -> com | www.myrkraverk.com/blog/ I'm not from the Internet, I just work there. | twitter: @myrkraverk
| From | Ben Bacarisse <ben.usenet@bsb.me.uk> |
|---|---|
| Date | 2023-09-28 02:18 +0100 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <871qejf62x.fsf@bsb.me.uk> |
| In reply to | #176591 |
Johann 'Myrkraverk' Oskarsson <johann@myrkraverk.invalid> writes: > On 9/28/2023 6:52 AM, Blue-Maned_Hawk wrote: >> Personally, i think that, at least for a library, an error should _only_ >> be communicated by return value. If more complex information is required, >> then the return value can be made more complex. ... > Error handling in libraries is a thorny subject, and I could go on a > long rant on it. The short summary is simply this: error handling is > rife with incompetence, and incompetent designs. Even in languages that > ostensibly provide better mechanisms than C, the details are usually > poorly documented and under-specified [1]. If it's done well, the result is not even "error handling" -- it's just what the function returns. For example, a lookup in a table of integers in Haskell returns a type that is, in effect "maybe an integer" The return will be either "Nothing" or something like "Just 42". Haskell does not always get it right (particularly some of the older APIs) but the trend is to provide return type rich enough to include either a correct result or an explanation of the fault. -- Ben.
| From | Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> |
|---|---|
| Date | 2023-09-28 22:30 +0000 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <pan$b83bb$766eb4cb$b79b4acf$fdd8021c@invalid.invalid> |
| In reply to | #176596 |
[This article is a resend because my first try seems to not have worked;
if this is duplicated, that's the explanation.]

Ben Bacarisse wrote:

> Haskell does not always get it right (particularly some of the older
> APIs) but the trend is to provide return type rich enough to include
> either a correct result or an explanation of the fault.

Things like that are pretty much what i was referring to earlier when i
referred to making the return type more complex to handle more complex
situations. Obviously, it would have to be done differently in C, since C
doesn't support tagged unions (at least not natively—i know of a couple
libraries that use macro magic to implement them).

--
Blue-Maned_Hawk | shortens to Hawk | he/him/his/himself/Mr.
bluemanedhawk.github.io
A flamethrower will not interfere with WiFi unless you aim it directly at
the router.
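As a rough illustration of the "more complex return type" idea in plain C, the sketch below hand-rolls a tagged union: a struct carrying a tag plus a union whose valid member depends on that tag. The names (`parse_result`, `parse_int`, `PARSE_OK`, `PARSE_ERR`) are hypothetical and chosen only for this example; the language does nothing to stop a caller from reading the wrong member, which is the gap the post alludes to.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical names, for illustration only. */
enum parse_tag { PARSE_OK, PARSE_ERR };

struct parse_result {
    enum parse_tag tag;        /* says which union member is valid */
    union {
        int value;             /* valid only when tag == PARSE_OK  */
        const char *reason;    /* valid only when tag == PARSE_ERR */
    } u;
};

/* Parse a whole string as a decimal int; the result carries either the
   value or an explanation of the fault, never both. */
struct parse_result parse_int(const char *text)
{
    struct parse_result r;
    char *end;
    long n;

    errno = 0;
    n = strtol(text, &end, 10);
    if (end == text || *end != '\0') {
        r.tag = PARSE_ERR;
        r.u.reason = "not a decimal integer";
    } else if (errno == ERANGE || n < INT_MIN || n > INT_MAX) {
        r.tag = PARSE_ERR;
        r.u.reason = "out of range for int";
    } else {
        r.tag = PARSE_OK;
        r.u.value = (int)n;
    }
    return r;
}
```

A caller would switch on `tag` before touching `u`; that check is by convention only, so this is weaker than a Haskell `Maybe` or `Either`.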
| From | scott@slp53.sl.home (Scott Lurndal) |
|---|---|
| Date | 2023-09-28 22:33 +0000 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <PanRM.9358$Sn81.8479@fx08.iad> |
| In reply to | #176673 |
Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> writes: >[This article is a resend because my first try seems to not have worked; >if this is duplicated, that's the explanation.] > >Ben Bacarisse wrote: > >> Haskell does not always get it right (particularly some of the older >> APIs) but the trend is to provide return type rich enough to include >> either a correct result or an explanation of the fault. > >Things like that are pretty much what i was referring to earlier when i >referred to making the return type more complex to handle more complex >situations. Obviously, it would have to be done differently in C, since C >doesn't support tagged unions (at least not natively—i know of a couple >libraries that use macro magic to implement them). In C++, a std::pair<bool, return-type> is used in that context. If the sizeof the return type is 64-bits or less, most modern ABI's will return it in a pair of registers.
| From | Anton Shepelev <anton.txt@gmail.moc> |
|---|---|
| Date | 2023-09-29 01:41 +0300 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <20230929014147.db836fd718d5d67557fc138c@gmail.moc> |
| In reply to | #176674 |
Scott Lurndal: > In C++, a std::pair<bool, return-type> is used in that > context. If the sizeof the return type is 64-bits or > less, most modern ABI's will return it in a pair of > registers. This groups the error flag and return value, the error message having to be passed somewhere else. In C, one can group the error flag, error code, error message plus any additional error information, in order to return the actual return value. -- () ascii ribbon campaign -- against html e-mail /\ www.asciiribbon.org -- against proprietary attachments
| From | scott@slp53.sl.home (Scott Lurndal) |
|---|---|
| Date | 2023-09-28 23:45 +0000 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <8eoRM.229644$vMO8.204792@fx16.iad> |
| In reply to | #176675 |
Anton Shepelev <anton.txt@gmail.moc> writes: >Scott Lurndal: > >> In C++, a std::pair<bool, return-type> is used in that >> context. If the sizeof the return type is 64-bits or >> less, most modern ABI's will return it in a pair of >> registers. > >This groups the error flag and return value, the error >message having to be passed somewhere else. In C, one can >group the error flag, error code, error message plus any >additional error information, in order to return the actual >return value. In C++ you can group any number of objects into a struct and return that, if you need to. I wouldn't return a message from the function, an error code that maps into a message is often sufficient. One could take a leaf from the VMS book and replace the bool with an error code (which maps externally into a locale-specific string), where "SS$_NORMAL" indicates that the operation was successful, and the range less than MAX(defined errno) is reserved for strerror() messages and the range above some value is reserved to the application.
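The "error code that maps into a message" approach can be sketched in C roughly as follows. This is only an illustration under assumed names (`lib_status`, `lib_strerror`, `lib_divide` are hypothetical); it does not reproduce the VMS scheme or the reserved code ranges mentioned in the post, and the messages are fixed English strings rather than locale-specific ones.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical status codes; zero means success. */
enum lib_status { LIB_OK = 0, LIB_EINVAL, LIB_ERANGE, LIB_STATUS_COUNT };

/* Map a status code to a message, outside the function that failed. */
const char *lib_strerror(enum lib_status s)
{
    static const char *const messages[LIB_STATUS_COUNT] = {
        [LIB_OK]     = "success",
        [LIB_EINVAL] = "invalid argument",
        [LIB_ERANGE] = "result out of range",
    };

    if ((int)s < 0 || (int)s >= LIB_STATUS_COUNT)
        return "unknown error";
    return messages[s];
}

/* The status is the return value; the quotient goes through a pointer. */
enum lib_status lib_divide(int num, int den, int *out)
{
    if (out == NULL || den == 0)
        return LIB_EINVAL;
    if (num == INT_MIN && den == -1)
        return LIB_ERANGE;      /* the quotient would overflow int */
    *out = num / den;
    return LIB_OK;
}
```

The function itself never builds a message; a caller that wants text calls `lib_strerror` on the returned code, and a caller that does not pays nothing for it.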
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2023-09-28 17:35 -0700 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <87edihstna.fsf@nosuchdomain.example.com> |
| In reply to | #176674 |
scott@slp53.sl.home (Scott Lurndal) writes:
> Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> writes:
>>Ben Bacarisse wrote:
>>> Haskell does not always get it right (particularly some of the older
>>> APIs) but the trend is to provide return type rich enough to include
>>> either a correct result or an explanation of the fault.
>>
>>Things like that are pretty much what i was referring to earlier when i
>>referred to making the return type more complex to handle more complex
>>situations. Obviously, it would have to be done differently in C, since C
>>doesn't support tagged unions (at least not natively—i know of a couple
>>libraries that use macro magic to implement them).
>
> In C++, a std::pair<bool, return-type> is used in that context. If the sizeof
> the return type is 64-bits or less, most modern ABI's will return it in a pair
> of registers.
<OT>
A std::pair<bool, return-type> gives you either a true value and a value
of the return type or a false value and a value of the return type.
There's no indication in the type itself that the second member is
meaningless if the first is false.
std::optional<return-type> is a better fit -- but it isn't available if
you have to deal with pre-C++17 compilers.
</OT>
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2023-09-29 16:49 +0200 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <uf6o5c$ark0$1@dont-email.me> |
| In reply to | #176687 |
On 29/09/2023 02:35, Keith Thompson wrote: > scott@slp53.sl.home (Scott Lurndal) writes: >> Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> writes: >>> Ben Bacarisse wrote: >>>> Haskell does not always get it right (particularly some of the older >>>> APIs) but the trend is to provide return type rich enough to include >>>> either a correct result or an explanation of the fault. >>> >>> Things like that are pretty much what i was referring to earlier when i >>> referred to making the return type more complex to handle more complex >>> situations. Obviously, it would have to be done differently in C, since C >>> doesn't support tagged unions (at least not natively—i know of a couple >>> libraries that use macro magic to implement them). >> >> In C++, a std::pair<bool, return-type> is used in that context. If the sizeof >> the return type is 64-bits or less, most modern ABI's will return it in a pair >> of registers. > > <OT> > A std::pair<bool, return-type> gives you either a true value and a value > of the return type or a false value and a value of the return type. > There's no indication in the type itself that the second member is > meaningless if the first is false. > > std::optional<return-type> is a better fit -- but it isn't available if > you have to deal with pre-C++17 compilers. > </OT> > <OT> std::optional<> is a standard library template that is based around a struct containing a bool and a return type, just like a std::pair<bool, return-type> would be. I don't think it needs C++17 to implement it - I think you could make a reasonable bash at making an "optional" template even in pre-C++11 C++. (I believe boost::optional works with C++03.) But there are probably a number of nuances that make it better in modern C++, and of course it is far more convenient when it is already in the library. And certainly if you are using C++17, it's a better choice here than a bare pair. </OT>
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2023-09-28 18:24 -0700 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <86y1gphiur.fsf@linuxsc.com> |
| In reply to | #176596 |
Ben Bacarisse <ben.usenet@bsb.me.uk> writes: > Johann 'Myrkraverk' Oskarsson <johann@myrkraverk.invalid> writes: > >> On 9/28/2023 6:52 AM, Blue-Maned_Hawk wrote: >> >>> Personally, i think that, at least for a library, an error >>> should _only_ be communicated by return value. If more complex >>> information is required, then the return value can be made more >>> complex. > > ... > >> Error handling in libraries is a thorny subject, and I could go >> on a long rant on it. The short summary is simply this: error >> handling is rife with incompetence, and incompetent designs. >> Even in languages that ostensibly provide better mechanisms than >> C, the details are usually poorly documented and under-specified >> [1]. > > If it's done well, the result is not even "error handling" -- it's > just what the function returns. For example, a lookup in a table > of integers in Haskell returns a type that is, in effect "maybe an > integer" The return will be either "Nothing" or something like > "Just 42". In languages that support it this approach is a good way to deal with cases like this, I agree. But to me it falls in a different category than error handling. An error condition is something like an allocation (eg malloc()) failure, or a broken pipe. Not finding something in a table means whoever was responsible for adding things to the table didn't add it - in other words it was entirely a consequence of events that are under the program's control. The issue is how to deal with an unpredictable, and usually very rare, event that is not something the program has control over. Typically these kinds of situations need a different sort of mechanism than loading up the return value of every function call.
| From | Ben Bacarisse <ben.usenet@bsb.me.uk> |
|---|---|
| Date | 2023-09-29 03:19 +0100 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <877co9d8m2.fsf@bsb.me.uk> |
| In reply to | #176692 |
Tim Rentsch <tr.17687@z991.linuxsc.com> writes: > Ben Bacarisse <ben.usenet@bsb.me.uk> writes: > >> Johann 'Myrkraverk' Oskarsson <johann@myrkraverk.invalid> writes: >> >>> On 9/28/2023 6:52 AM, Blue-Maned_Hawk wrote: >>> >>>> Personally, i think that, at least for a library, an error >>>> should _only_ be communicated by return value. If more complex >>>> information is required, then the return value can be made more >>>> complex. >> >> ... >> >>> Error handling in libraries is a thorny subject, and I could go >>> on a long rant on it. The short summary is simply this: error >>> handling is rife with incompetence, and incompetent designs. >>> Even in languages that ostensibly provide better mechanisms than >>> C, the details are usually poorly documented and under-specified >>> [1]. >> >> If it's done well, the result is not even "error handling" -- it's >> just what the function returns. For example, a lookup in a table >> of integers in Haskell returns a type that is, in effect "maybe an >> integer" The return will be either "Nothing" or something like >> "Just 42". > > In languages that support it this approach is a good way to deal > with cases like this, I agree. But to me it falls in a different > category than error handling. An error condition is something > like an allocation (eg malloc()) failure, or a broken pipe. Not > finding something in a table means whoever was responsible for > adding things to the table didn't add it - in other words it was > entirely a consequence of events that are under the program's > control. The issue is how to deal with an unpredictable, and > usually very rare, event that is not something the program has > control over. Typically these kinds of situations need a > different sort of mechanism than loading up the return value > of every function call. I agree that table lookup is not a compelling example, but I don't agree that anything different is needed. 
I intended the idea to be generalised to have return values used for all error conditions. There may be situations where something different is /better/, but I remain sceptical of even this claim. Note I am not talking about situations where asynchronous errors need to be reported. C uses null pointers for this all the time. Returning a null pointer is just a way to signal an error through the return value. In a language like C without null pointers I would want malloc to return a 'Maybe' type. And for when more information is wanted (like your example of a broken pipe) the type returned from write should be a number of bytes or an error value of some sort. Obviously it's hard to recommend this in C since the language can't really express the patterns needed to make it convenient, but I think we are talking more generally here. How would you prefer these sorts of thing to be signalled? BTW, have you come across Icon? It takes an intriguing approach where every operation succeeds or fails as well as having a value. This idea is heavily built upon to produce a very comfortable scripting language. -- Ben.
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2023-09-29 07:46 -0700 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <86bkdlghqr.fsf@linuxsc.com> |
| In reply to | #176698 |
Ben Bacarisse <ben.usenet@bsb.me.uk> writes: > Tim Rentsch <tr.17687@z991.linuxsc.com> writes: > >> Ben Bacarisse <ben.usenet@bsb.me.uk> writes: >> [..how should errors be handled..] >>> If it's done well, the result is not even "error handling" -- it's >>> just what the function returns. For example, a lookup in a table >>> of integers in Haskell returns a type that is, in effect "maybe an >>> integer" The return will be either "Nothing" or something like >>> "Just 42". >> >> In languages that support it this approach is a good way to deal >> with cases like this, I agree. But to me it falls in a different >> category than error handling. An error condition is something >> like an allocation (eg malloc()) failure, or a broken pipe. Not >> finding something in a table means whoever was responsible for >> adding things to the table didn't add it - in other words it was >> entirely a consequence of events that are under the program's >> control. The issue is how to deal with an unpredictable, and >> usually very rare, event that is not something the program has >> control over. Typically these kinds of situations need a >> different sort of mechanism than loading up the return value >> of every function call. > > I agree that table lookup is not a compelling example, but I don't > agree that anything different is needed. I intended the idea to > be generalised to have return values used for all error > conditions. There may be situations where something different is > /better/, but I remain sceptical of even this claim. Note I am > not talking about situations where asynchronous errors need to be > reported. > > C uses null pointers for this all the time. Returning a null > pointer is just a way to signal an error through the return value. > In a language like C without null pointers I would want malloc to > return a 'Maybe' type. 
> And for when more information is wanted
> (like your example of a broken pipe) the type returned from write
> should be a number of bytes or an error value of some sort.
>
> Obviously it's hard to recommend this in C since the language
> can't really express the patterns needed to make it convenient,
> but I think we are talking more generally here.

I acknowledge your statement about not considering asynchronous
errors. My own comments are likewise.

I agree that all unusual result circumstances (which might be
labeled "error conditions", without meaning to prejudice the term)
can be handled locally via either compound return values or
multiple return values (eg, by using pointer parameters), or a
combination of the two. In languages that provide good support
for compound return values, which I would say C does not, probably
a single return value (which is perhaps a compound value) always
suffices, but I haven't thought very carefully about that. But I
think we are in general agreement that it's always possible to
handle error conditions via local return values.

Where we might not agree is whether using direct return values is
always a good choice. In my view, sometimes it is, sometimes it
isn't - it depends on the particular situation. That is
definitely the case for C, but I think also for other programming
languages that have better support for multiple return values or
compound return values, or both. And certainly if needing other
mechanisms is true for languages in general then we would expect
it is true for C.

> How would you prefer these sorts of thing to be signalled?

Let me give just one example. I wrote some code not long ago to
add a value to a set of values, where the set is represented using
a recursive binary tree structure (like red-black trees, but a
little different).
Usually we may expect that a request to add a value would offer a value not already included in the set, but certainly it can happen that there is a call to add a value that is already included, in which case the top level return value should be the original tree structure. I thought it would be easier to write a simple recursive routine that assumes the to-be-added value is not yet included in the set, and if that assumption is violated raise an exception that is caught at the outermost level and simply returns the original argument tree value. Certainly the code could have been written to handle the already-present situation at each level of the recursion, but it was much easier and much cleaner to handle it by raising an exception. That example doesn't translate to C in any obvious way, because C code to add items to a set would very likely be written rather differently. I think it's harder to make use of non-local returns in C than in many other languages, because C requires more thought and discipline to set that up. At the same time, when I look at C code that tries to handle all exceptional conditions locally, I can't help but think the code would be simplified overall if code for dealing with exceptional conditions could be removed in the code generally and instead used a different mechanism, in fewer places, and perhaps similar to raising an exception, to handle such cases. > BTW, have you come across Icon? I have, and it's a fascinating language. > It takes an intriguing approach > where every operation succeeds or fails as well as having a value. I have heard Icon characterized as saying "it tries to succeed", and I think that's a fair description. The idea of integrating a backtracking engine into the language semantics deserves more attention than I think it has been given, which is unfortunate. > This idea is heavily built upon to produce a very comfortable > scripting language. 
My experience with Icon is purely academic, which is to say I have read about it but never used it. I have more experience with Prolog, which is similar in some ways (notably backtracking) but of course very different in other ways. I'm intrigued by your comment that Icon makes a good scripting language. Maybe I don't understand what you think makes a good scripting language, or even what "scripting language" means. That subject is not topical in comp.lang.c, but I also peruse comp.lang.misc and comp.programming if you wouldn't mind continuing there. Also you are welcome to send me an email at this address if you would rather do that.
| From | Ben Bacarisse <ben.usenet@bsb.me.uk> |
|---|---|
| Date | 2023-09-29 21:02 +0100 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <8734ywbvek.fsf@bsb.me.uk> |
| In reply to | #176735 |
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>
>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>
>
> [..how should errors be handled..]
>
>>>> If it's done well, the result is not even "error handling" -- it's
>>>> just what the function returns. For example, a lookup in a table
>>>> of integers in Haskell returns a type that is, in effect "maybe an
>>>> integer" The return will be either "Nothing" or something like
>>>> "Just 42".
>>>
>>> In languages that support it this approach is a good way to deal
>>> with cases like this, I agree. But to me it falls in a different
>>> category than error handling. An error condition is something
>>> like an allocation (eg malloc()) failure, or a broken pipe. Not
>>> finding something in a table means whoever was responsible for
>>> adding things to the table didn't add it - in other words it was
>>> entirely a consequence of events that are under the program's
>>> control. The issue is how to deal with an unpredictable, and
>>> usually very rare, event that is not something the program has
>>> control over. Typically these kinds of situations need a
>>> different sort of mechanism than loading up the return value
>>> of every function call.
>>
>> I agree that table lookup is not a compelling example, but I don't
>> agree that anything different is needed. I intended the idea to
>> be generalised to have return values used for all error
>> conditions. There may be situations where something different is
>> /better/, but I remain sceptical of even this claim. Note I am
>> not talking about situations where asynchronous errors need to be
>> reported.
>>
>> C uses null pointers for this all the time. Returning a null
>> pointer is just a way to signal an error through the return value.
>> In a language like C without null pointers I would want malloc to
>> return a 'Maybe' type. And for when more information is wanted
>> (like your example of a broken pipe) the type returned from write
>> should be a number of bytes or an error value of some sort.
>>
>> Obviously it's hard to recommend this in C since the language
>> can't really express the patterns needed to make it convenient,
>> but I think we are talking more generally here.
>
> I acknowledge your statement about not considering asynchronous
> errors. My own comments are likewise.
>
> I agree that all unusual result circumstances (which might be
> labeled "error conditions", without meaning to prejudice the term)
> can be handled locally via either compound return values or multiple
> return values (eg, by using pointer parameters), or a combination of
> the two. In languages that provide good support for compound return
> values, which I would say C does not, probably a single return value
> (which is perhaps a compound value) always suffices, but I haven't
> thought very carefully about that. But I think we are in general
> agreement that it's always possible to handle error conditions via
> local return values.
>
> Where we might not agree is whether using direct return values is
> always a good choice. In my view, sometimes it is, sometimes it
> isn't - it depends on the particular situation. That is definitely
> the case for C, but I think also for other programming languages
> that have better support for multiple return values or compound
> return values, or both. And certainly if needing other mechanisms
> is true for languages in general then we would expect it is true
> for C.
>
>> How would you prefer these sorts of thing to be signalled?
>
> Let me give just one example.
>
> I wrote some code not long ago to add a value to a set of values,
> where the set is represented using a recursive binary tree structure
> (like red-black trees, but a little different). Usually we may
> expect that a request to add a value would offer a value not already
> included in the set, but certainly it can happen that there is a
> call to add a value that is already included, in which case the top
> level return value should be the original tree structure.
>
> I thought it would be easier to write a simple recursive routine
> that assumes the to-be-added value is not yet included in the set,
> and if that assumption is violated raise an exception that is caught
> at the outermost level and simply returns the original argument tree
> value. Certainly the code could have been written to handle the
> already-present situation at each level of the recursion, but it was
> much easier and much cleaner to handle it by raising an exception.
This is one of those cases where I just have to take your word for it,
(and I am happy to do that).
> That example doesn't translate to C in any obvious way, because C
> code to add items to a set would very likely be written rather
> differently. I think it's harder to make use of non-local returns
> in C than in many other languages, because C requires more thought
> and discipline to set that up. At the same time, when I look at C
> code that tries to handle all exceptional conditions locally, I
> can't help but think the code would be simplified overall if code
> for dealing with exceptional conditions could be removed in the code
> generally and instead used a different mechanism, in fewer places,
> and perhaps similar to raising an exception, to handle such cases.
You may be right, but I don't like exceptions. More specifically, I
don't like exceptions that are not handled internally in some API or
other (so your tree example sounds like a sane use for them), but I
worry about their wider use. You either have to wrap every call in a
"try" (in which case why not use the multiple value return method) or
the reasoning about correctness starts to get more and more complicated.
>> BTW, have you come across Icon?
>
> I have, and it's a fascinating language.
>
>> It takes an intriguing approach
>> where every operation succeeds or fails as well as having a value.
>
> I have heard Icon characterized as saying "it tries to succeed",
> and I think that's a fair description. The idea of integrating
> a backtracking engine into the language semantics deserves more
> attention than I think it has been given, which is unfortunate.
>
>> This idea is heavily built upon to produce a very comfortable
>> scripting language.
>
> My experience with Icon is purely academic, which is to say I have
> read about it but never used it. I have more experience with
> Prolog, which is similar in some ways (notably backtracking) but of
> course very different in other ways.
>
> I'm intrigued by your comment that Icon makes a good scripting
> language. Maybe I don't understand what you think makes a good
> scripting language, or even what "scripting language" means.
> That subject is not topical in comp.lang.c, but I also peruse
> comp.lang.misc and comp.programming if you wouldn't mind
> continuing there. Also you are welcome to send me an email
> at this address if you would rather do that.
It was a careless remark with no significant meaning.
--
Ben.
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2023-09-29 21:56 -0700 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <86cyy0fee5.fsf@linuxsc.com> |
| In reply to | #176759 |
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>
>> [..how should errors be handled..]
>>
>>> How would you prefer these sorts of thing to be signalled?
>>
>> Let me give just one example.
>>
>> I wrote some code not long ago to add a value to a set of values,
>> where the set is represented using a recursive binary tree structure
>> (like red-black trees, but a little different). Usually we may
>> expect that a request to add a value would offer a value not already
>> included in the set, but certainly it can happen that there is a
>> call to add a value that is already included, in which case the top
>> level return value should be the original tree structure.
>>
>> I thought it would be easier to write a simple recursive routine
>> that assumes the to-be-added value is not yet included in the set,
>> and if that assumption is violated raise an exception that is caught
>> at the outermost level and simply returns the original argument tree
>> value. Certainly the code could have been written to handle the
>> already-present situation at each level of the recursion, but it was
>> much easier and much cleaner to handle it by raising an exception.
>
> This is one of those cases where I just have to take your word for it,
> (and I am happy to do that).
Responding here to just this one part.
Here is a simpler version of the code, for an ordinary
binary tree, and without any rebalancing:
    module TreeSet = struct
      type 'a treeset = Empty | Node of 'a treeset * 'a * 'a treeset
      exception Present

      let add tree element =
        let rec add t e =
          match t with
          | Empty -> Node( Empty, e, Empty )
          | Node( a, v, b ) ->
              if e < v then Node( add a e, v, b ) else
              if e > v then Node( a, v, add b e ) else
              raise Present
        in
          try add tree element with Present -> tree

    end
Transliterating that into C, the central functions might look like
this (I have left out the type definition and the constructor
function used to build a new node value):
    static Tree add( jmp_buf, Tree, double );
    static Tree raise_present( jmp_buf );
    static Tree new_node( Tree, double, Tree );

    Tree
    add_element( Tree tree, double new_element ){
        jmp_buf jb;
        if( setjmp( jb ) ) return tree;
        return add( jb, tree, new_element );
    }

    Tree
    add( jmp_buf jb, Tree t, double d ){
        return
            ! t ? new_node( 0, d, 0 )
          : d < t->v ? new_node( add( jb, t->a, d ), t->v, t->b )
          : d > t->v ? new_node( t->a, t->v, add( jb, t->b, d ) )
          : /*********/ raise_present( jb );
    }

    Tree
    raise_present( jmp_buf jb ){
        longjmp( jb, 1 );
    }
Of course it's unlikely that sets would be implemented this way
in C, but that's not the point; the point is that exceptions
can result in code that is cleaner and easier to understand
than an alternate approach where return values would have to
be checked at every level.
(Incidentally, I want to acknowledge your comment about taking my
word for the result. I still thought it would be good to give
a concrete example, even if it is a very simplified one.)
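For anyone who wants to compile the transliteration above, here is a minimal self-contained sketch. The struct layout, the new_node() constructor (a plain malloc() that aborts on failure) and the demo_add_element() check are assumptions filled in for illustration, since the post leaves the type definition and constructor out. Nodes are never freed, so the sketch leaks on purpose.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

/* Assumed node layout: the post leaves the type definition out. */
typedef struct node *Tree;
struct node { Tree a; double v; Tree b; };

static Tree add( jmp_buf, Tree, double );
static Tree raise_present( jmp_buf );

/* Assumed constructor: a plain malloc() that aborts on failure. */
static Tree
new_node( Tree a, double v, Tree b ){
    Tree t = malloc( sizeof *t );
    if( !t ) abort();
    t->a = a;  t->v = v;  t->b = b;
    return t;
}

/* Returns the original tree, unchanged, if new_element is already present. */
Tree
add_element( Tree tree, double new_element ){
    jmp_buf jb;
    if( setjmp( jb ) ) return tree;     /* arrived here via longjmp() */
    return add( jb, tree, new_element );
}

static Tree
add( jmp_buf jb, Tree t, double d ){
    return
        ! t       ? new_node( 0, d, 0 )
      : d < t->v  ? new_node( add( jb, t->a, d ), t->v, t->b )
      : d > t->v  ? new_node( t->a, t->v, add( jb, t->b, d ) )
      :             raise_present( jb );
}

static Tree
raise_present( jmp_buf jb ){
    longjmp( jb, 1 );                   /* never returns */
}

/* Hypothetical usage check, not part of the original post. */
void
demo_add_element( void ){
    Tree one = add_element( 0, 2.0 );
    Tree two = add_element( one, 1.0 );
    assert( two != one );                       /* new root on the copied path */
    assert( add_element( two, 1.0 ) == two );   /* duplicate: same tree back */
}
```

The parameter tree is not modified between setjmp() and longjmp(), so reading it after the jump is well defined.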
| From | Ben Bacarisse <ben.usenet@bsb.me.uk> |
|---|---|
| Date | 2023-10-01 00:06 +0100 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <87ttrbclcj.fsf@bsb.me.uk> |
| In reply to | #176778 |
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>
>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>
>>> [..how should errors be handled..]
>>>
>>>> How would you prefer these sorts of thing to be signalled?
>>>
>>> Let me give just one example.
>>>
>>> I wrote some code not long ago to add a value to a set of values,
>>> where the set is represented using a recursive binary tree structure
>>> (like red-black trees, but a little different). Usually we may
>>> expect that a request to add a value would offer a value not already
>>> included in the set, but certainly it can happen that there is a
>>> call to add a value that is already included, in which case the top
>>> level return value should be the original tree structure.
>>>
>>> I thought it would be easier to write a simple recursive routine
>>> that assumes the to-be-added value is not yet included in the set,
>>> and if that assumption is violated raise an exception that is caught
>>> at the outermost level and simply returns the original argument tree
>>> value. Certainly the code could have been written to handle the
>>> already-present situation at each level of the recursion, but it was
>>> much easier and much cleaner to handle it by raising an exception.
>>
>> This is one of those cases where I just have to take your word for it,
>> (and I am happy to do that).
>
> Responding here to just this one part.
>
> Here is a simpler version of the code, for an ordinary
> binary tree, and without any rebalancing:
>
>     module TreeSet = struct
>       type 'a treeset = Empty | Node of 'a treeset * 'a * 'a treeset
>       exception Present
>
>       let add tree element =
>         let rec add t e =
>           match t with
>           | Empty -> Node( Empty, e, Empty )
>           | Node( a, v, b ) ->
>               if e < v then Node( add a e, v, b ) else
>               if e > v then Node( a, v, add b e ) else
>               raise Present
>         in
>           try add tree element with Present -> tree
>
>     end
I'm not getting it. I'd write 'else t' rather than 'else raise
Present' and I could then do away with the wrapping 'add' function.
I worry we are getting away from C, but the point is likely to be
much simpler to make in OCaml (or Caml).
--
Ben.
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2023-09-30 20:35 -0700 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <86r0mfdng1.fsf@linuxsc.com> |
| In reply to | #176842 |
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>
>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>
>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>
>>>> [..how should errors be handled..]
>>>>
>>>>> How would you prefer these sorts of thing to be signalled?
>>>>
>>>> Let me give just one example.
>>>>
>>>> I wrote some code not long ago to add a value to a set of values,
>>>> where the set is represented using a recursive binary tree structure
>>>> (like red-black trees, but a little different). Usually we may
>>>> expect that a request to add a value would offer a value not already
>>>> included in the set, but certainly it can happen that there is a
>>>> call to add a value that is already included, in which case the top
>>>> level return value should be the original tree structure.
>>>>
>>>> I thought it would be easier to write a simple recursive routine
>>>> that assumes the to-be-added value is not yet included in the set,
>>>> and if that assumption is violated raise an exception that is caught
>>>> at the outermost level and simply returns the original argument tree
>>>> value. Certainly the code could have been written to handle the
>>>> already-present situation at each level of the recursion, but it was
>>>> much easier and much cleaner to handle it by raising an exception.
>>>
>>> This is one of those cases where I just have to take your word for it,
>>> (and I am happy to do that).
>>
>> Responding here to just this one part.
>>
>> Here is a simpler version of the code, for an ordinary
>> binary tree, and without any rebalancing:
>>
>>     module TreeSet = struct
>>       type 'a treeset = Empty | Node of 'a treeset * 'a * 'a treeset
>>       exception Present
>>
>>       let add tree element =
>>         let rec add t e =
>>           match t with
>>           | Empty -> Node( Empty, e, Empty )
>>           | Node( a, v, b ) ->
>>               if e < v then Node( add a e, v, b ) else
>>               if e > v then Node( a, v, add b e ) else
>>               raise Present
>>         in
>>           try add tree element with Present -> tree
>>
>>     end
>
> I'm not getting it. I'd write 'else t' rather than 'else raise
> Present' and I could then do away with the wrapping 'add' function.
Doing that would result in returning a valid tree, but it would also
needlessly replicate all nodes starting from the point where the
match was found and going up to the root. The consequence is more
cycles used and more demand on the memory allocator. The technique
of raising an exception avoids those shortcomings (and also it
preserves tree identity, which can be important in some situations).
> I worry we are getting away from C, but the point is likely to be
> much simpler to make in OCaml (or Caml).
I have another example. The example doesn't include any C code, but
I will talk about some concerns that pertain to C.
Suppose we are writing a compiler for a large programming language.
The compiler should make use of available resources, especially
memory resources, when it can, but also should be able to compile
large programs even when the amount of memory available is much more
limited. How much memory is available is not known in advance; we
find out when a call to malloc() or realloc() returns NULL. I note
in passing that the PL/I(F) compiler from IBM could translate full
PL/I even if limited to only 44 K bytes of user memory; to do that
it needed 110 physical passes over source text or structures created
and stored in disk files.
The idea in essence is to have two compilers in one: one taking a
conventional approach using regular dynamic memory allocation (in
other words malloc() etc), and the other using an IBM-style multiple
passes scheme that stores intermediate data in disk files. To
compile a program we start the first compiler under an umbrella of
setjmp(), which going forward assumes all memory allocations will
succeed. To allocate memory we call wrapper functions for malloc()
or realloc(), which will do a longjmp() if an allocation fails.
Upon receipt of the longjmp() "exception" the outer setjmp() call
starts over using the second approach, which is of course much
slower but also much less demanding of main memory.
To make this work we might have to track malloc()'s and free()'s so
that any not-yet-released memory can be reclaimed before beginning
the alternate system, but I think that is straightforward and not in
need of any further explanation.
I admit this example isn't very compelling given that even small
computers today have multiple gigabytes of RAM. But I think it
isn't hard to imagine an analogous problem in a more limited
environment, for example running in a VM on a co-located server.
(I have run out of "RAM" when running a program on my colo server
here. It was quite confusing when it first happened.)
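The scheme described above can be sketched roughly as follows. Everything named here is hypothetical: xmalloc(), release_all(), the stand-in fast_compile() and slow_compile(), and the simulated byte budget that makes the failure path easy to trigger. It assumes C11 for max_align_t and is an illustration of the control flow only, not a real compiler.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

enum { USED_FAST = 1, USED_SLOW = 2 };

/* Every block carries a header so that it can be found and freed later. */
typedef union header { union header *next; max_align_t align; } Header;

static jmp_buf  out_of_memory;   /* target of the longjmp() "exception" */
static Header  *live;            /* all blocks not yet released */
static size_t   limit, used;     /* simulated memory budget, in bytes */

/* malloc() wrapper: returns usable memory or does not return at all. */
static void *
xmalloc( size_t n ){
    Header *h = n <= limit - used ? malloc( sizeof *h + n ) : NULL;
    if( !h ) longjmp( out_of_memory, 1 );
    used += n;
    h->next = live;
    live = h;
    return h + 1;
}

/* Reclaims every tracked block, whether or not the fast path finished. */
static void
release_all( void ){
    while( live ){ Header *h = live; live = h->next; free( h ); }
    used = 0;
}

/* Stand-in for the conventional compiler: allocates freely, checks nothing. */
static int
fast_compile( size_t program_size ){
    char *symbols = xmalloc( program_size );
    char *tree    = xmalloc( 4 * program_size );
    symbols[0] = tree[0] = 0;
    return USED_FAST;
}

/* Stand-in for the multi-pass, disk-based compiler. */
static int
slow_compile( size_t program_size ){
    (void) program_size;
    return USED_SLOW;
}

int
compile( size_t program_size, size_t memory_limit ){
    int result;
    assert( program_size > 0 );
    limit = memory_limit;
    used = 0;
    if( setjmp( out_of_memory ) ){      /* an allocation failed somewhere below */
        release_all();
        return slow_compile( program_size );
    }
    result = fast_compile( program_size );
    release_all();
    return result;
}
```

With these stand-ins, compile( 100, 1000 ) takes the fast path and compile( 300, 1000 ) falls back, because the second allocation exceeds the simulated budget.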
| From | Anton Shepelev <anton.txt@gmail.moc> |
|---|---|
| Date | 2023-10-01 14:44 +0300 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <20231001144405.3f6863a9af57ee632caaf74d@gmail.moc> |
| In reply to | #176850 |
Tim Rentsch:
> The idea in essence is to have two compilers in one: one
> taking a conventional approach using regular dynamic
> memory allocation (in other words malloc() etc), and the
> other using an IBM-style multiple passes scheme that
> stores intermediate data in disk files. To compile a
> program we start the first compiler under an umbrella of
> setjmp(), which going forward assumes all memory
> allocations will succeed. To allocate memory we call
> wrapper functions for malloc() or realloc(), which will do
> a longjmp() if an allocation fails. Upon receipt of the
> longjmp() "exception" the outer setjmp() call starts over
> using the second approach, which is of course much slower
> but also much less demanding of main memory.
> [...]
> I admit this example isn't very compelling given that even
> small computers today have multiple gigabytes of RAM.
No, the availability of lots of RAM in modern computers is
merely incidental to your example and in no way affects it.
How about this example: a recursive-descent parser for a
deeply nested grammar. How does one handle and report an
error encountered ten invocations underground? Is it a
simpler example of the same problem?
--
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments
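One possible shape for the parser case, as a minimal sketch: the toy grammar (single digits, '+', and parentheses), the Parser struct, and the names fail() and evaluate() are all invented for illustration. The error message and offset are written through the caller's pointers just before the jump, so nothing local to the function that called setjmp() has to be read afterwards.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical parser state for a toy grammar:
     expr := term { '+' term }
     term := digit | '(' expr ')'                                   */
typedef struct {
    const char  *src;       /* input text */
    size_t       pos;       /* current offset into src */
    const char **error;     /* caller's slot for the error message */
    size_t      *where;     /* caller's slot for the error offset */
    jmp_buf      on_error;  /* where fail() jumps to */
} Parser;

static long parse_expr( Parser * );

/* Records the error in the caller's objects, then unwinds in one step. */
static void
fail( Parser *p, const char *message ){
    *p->error = message;
    *p->where = p->pos;
    longjmp( p->on_error, 1 );
}

static long
parse_term( Parser *p ){
    char c = p->src[p->pos];
    if( c >= '0' && c <= '9' ){ p->pos++; return c - '0'; }
    if( c == '(' ){
        long v;
        p->pos++;
        v = parse_expr( p );
        if( p->src[p->pos] != ')' ) fail( p, "expected ')'" );
        p->pos++;
        return v;
    }
    fail( p, "expected digit or '('" );
    return 0;                                   /* not reached */
}

static long
parse_expr( Parser *p ){
    long v = parse_term( p );
    while( p->src[p->pos] == '+' ){ p->pos++; v += parse_term( p ); }
    return v;
}

/* Returns 0 and stores the sum on success; returns -1 on a syntax error,
   with *error and *where already filled in by fail(). */
int
evaluate( const char *text, long *value, const char **error, size_t *where ){
    Parser p = { text, 0, error, where };
    assert( text && value && error && where );
    if( setjmp( p.on_error ) ) return -1;
    *value = parse_expr( &p );
    if( p.src[p.pos] != '\0' ) fail( &p, "unexpected trailing input" );
    return 0;
}
```

None of the parsing functions checks a status code from the calls beneath it; an error ten parentheses deep reaches evaluate() directly.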
| From | Ben Bacarisse <ben.usenet@bsb.me.uk> |
|---|---|
| Date | 2023-10-01 15:45 +0100 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <87il7qcsfz.fsf@bsb.me.uk> |
| In reply to | #176850 |
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>
>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>
>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>>
>>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>>
>>>>> [..how should errors be handled..]
>>>>>
>>>>>> How would you prefer these sorts of thing to be signalled?
>>>>>
>>>>> Let me give just one example.
>>>>>
>>>>> I wrote some code not long ago to add a value to a set of values,
>>>>> where the set is represented using a recursive binary tree structure
>>>>> (like red-black trees, but a little different). Usually we may
>>>>> expect that a request to add a value would offer a value not already
>>>>> included in the set, but certainly it can happen that there is a
>>>>> call to add a value that is already included, in which case the top
>>>>> level return value should be the original tree structure.
>>>>>
>>>>> I thought it would be easier to write a simple recursive routine
>>>>> that assumes the to-be-added value is not yet included in the set,
>>>>> and if that assumption is violated raise an exception that is caught
>>>>> at the outermost level and simply returns the original argument tree
>>>>> value. Certainly the code could have been written to handle the
>>>>> already-present situation at each level of the recursion, but it was
>>>>> much easier and much cleaner to handle it by raising an exception.
>>>>
>>>> This is one of those cases where I just have to take your word for it,
>>>> (and I am happy to do that).
>>>
>>> Responding here to just this one part.
>>>
>>> Here is a simpler version of the code, for an ordinary
>>> binary tree, and without any rebalancing:
>>>
>>>     module TreeSet = struct
>>>       type 'a treeset = Empty | Node of 'a treeset * 'a * 'a treeset
>>>       exception Present
>>>
>>>       let add tree element =
>>>         let rec add t e =
>>>           match t with
>>>           | Empty -> Node( Empty, e, Empty )
>>>           | Node( a, v, b ) ->
>>>               if e < v then Node( add a e, v, b ) else
>>>               if e > v then Node( a, v, add b e ) else
>>>               raise Present
>>>         in
>>>           try add tree element with Present -> tree
>>>
>>>     end
>>
>> I'm not getting it. I'd write 'else t' rather than 'else raise
>> Present' and I could then do away with the wrapping 'add' function.
>
> Doing that would result in returning a valid tree, but it would also
> needlessly replicate all nodes starting from the point where the
> match was found and going up to the root. The consequence is more
> cycles used and more demand on the memory allocator.
How can OCaml avoid making the allocations until it knows the exception
won't be raised? Are they not all still being done?
> The technique
> of raising an exception avoids those shortcomings (and also it
> preserves tree identity, which can be important in some situations).
I agree about the identity. I am not used to relying on the identity in
situations like this (probably because of Haskell!) but I might find
some time to work out how to preserve identity without an exception. I
fear it will be messier than your simple solution.
>> I worry we are getting away from C, but the point is likely to be
>> much simpler to make in OCaml (or Caml).
>
> I have another example. The example doesn't include any C code, but
> I will talk about some concerns that pertain to C.
>
> Suppose we are writing a compiler for a large programming language.
> The compiler should make use of available resources, especially
> memory resources, when it can, but also should be able to compile
> large programs even when the amount of memory available is much more
> limited. How much memory is available is not known in advance; we
> find out when a call to malloc() or realloc() returns NULL. I note
> in passing that the PL/I(F) compiler from IBM could translate full
> PL/I even if limited to only 44 K bytes of user memory; to do that
> it needed 110 physical passes over source text or structures created
> and stored in disk files.
>
> The idea in essence is to have two compilers in one: one taking a
> conventional approach using regular dynamic memory allocation (in
> other words malloc() etc), and the other using an IBM-style multiple
> passes scheme that stores intermediate data in disk files. To
> compile a program we start the first compiler under an umbrella of
> setjmp(), which going forward assumes all memory allocations will
> succeed. To allocate memory we call wrapper functions for malloc()
> or realloc(), which will do a longjmp() if an allocation fails.
> Upon receipt of the longjmp() "exception" the outer setjmp() call
> starts over using the second approach, which is of course much
> slower but also much less demanding of main memory.
>
> To make this work we might have to track malloc()'s and free()'s so
> that any not-yet-released memory can be reclaimed before beginning
> the alternate system, but I think that is straightforward and not in
> need of any further explanation.
I am not convinced it's neater to use longjmp here. I prefer to write
code where allocation failures just propagate up the call stack. That
would give me a top-level failure return from the try_fast_compile()
function call.
> I admit this example isn't very compelling given that even small
> computers today have multiple gigabytes of RAM. But I think it
> isn't hard to imagine an analogous problem in a more limited
> environment, for example running in a VM on a co-located server.
> (I have run out of "RAM" when running a program on my colo server
> here. It was quite confusing when it first happened.)
--
Ben.
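For comparison, a minimal sketch of the propagate-up-the-call-stack style. Only the name try_fast_compile() comes from the post; the Budget type, take(), build_symbols(), build_tree(), slow_compile() and the simulated memory limit are assumptions made for illustration. Each level checks the result of the level below and cleans up what it owns before returning failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

enum { USED_FAST = 1, USED_SLOW = 2 };

/* Hypothetical byte budget standing in for "how much memory there is". */
typedef struct { size_t limit, used; } Budget;

/* Allocation that reports failure through its return value. */
static void *
take( Budget *m, size_t n ){
    void *p;
    assert( m->used <= m->limit );
    if( n > m->limit - m->used ) return NULL;
    p = malloc( n );
    if( p ) m->used += n;
    return p;
}

static bool
build_symbols( Budget *m, size_t n, char **symbols ){
    *symbols = take( m, n );
    return *symbols != NULL;
}

/* Every level checks the level below and undoes its own work on failure. */
static bool
build_tree( Budget *m, size_t n, char **symbols, char **tree ){
    if( !build_symbols( m, n, symbols ) ) return false;
    *tree = take( m, 4 * n );
    if( !*tree ){ free( *symbols ); *symbols = NULL; return false; }
    return true;
}

bool
try_fast_compile( size_t program_size, size_t memory_limit ){
    Budget m = { memory_limit, 0 };
    char *symbols = NULL, *tree = NULL;
    if( !build_tree( &m, program_size, &symbols, &tree ) ) return false;
    free( tree );
    free( symbols );
    return true;
}

/* Stand-in for the multi-pass, disk-based compiler. */
static int
slow_compile( size_t program_size ){
    (void) program_size;
    return USED_SLOW;
}

int
compile( size_t program_size, size_t memory_limit ){
    return try_fast_compile( program_size, memory_limit )
         ? USED_FAST
         : slow_compile( program_size );
}
```

No allocation tracking is needed here, since each function frees what it allocated before reporting failure; the cost is a check at every call site.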
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2023-10-01 09:28 -0700 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <86ediee28o.fsf@linuxsc.com> |
| In reply to | #176866 |
Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>
>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>
>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>
>>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>>>
>>>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>>>
>>>>>> [..how should errors be handled..]
>>>>>>
>>>>>>> How would you prefer these sorts of thing to be signalled?
>>>>>>
>>>>>> Let me give just one example.
>>>>>>
>>>>>> I wrote some code not long ago to add a value to a set of values,
>>>>>> where the set is represented using a recursive binary tree structure
>>>>>> (like red-black trees, but a little different). Usually we may
>>>>>> expect that a request to add a value would offer a value not already
>>>>>> included in the set, but certainly it can happen that there is a
>>>>>> call to add a value that is already included, in which case the top
>>>>>> level return value should be the original tree structure.
>>>>>>
>>>>>> I thought it would be easier to write a simple recursive routine
>>>>>> that assumes the to-be-added value is not yet included in the set,
>>>>>> and if that assumption is violated raise an exception that is caught
>>>>>> at the outermost level and simply returns the original argument tree
>>>>>> value. Certainly the code could have been written to handle the
>>>>>> already-present situation at each level of the recursion, but it was
>>>>>> much easier and much cleaner to handle it by raising an exception.
>>>>>
>>>>> This is one of those cases where I just have to take your word for it,
>>>>> (and I am happy to do that).
>>>>
>>>> Responding here to just this one part.
>>>>
>>>> Here is a simpler version of the code, for an ordinary
>>>> binary tree, and without any rebalancing:
>>>>
>>>>     module TreeSet = struct
>>>>       type 'a treeset = Empty | Node of 'a treeset * 'a * 'a treeset
>>>>       exception Present
>>>>
>>>>       let add tree element =
>>>>         let rec add t e =
>>>>           match t with
>>>>           | Empty -> Node( Empty, e, Empty )
>>>>           | Node( a, v, b ) ->
>>>>               if e < v then Node( add a e, v, b ) else
>>>>               if e > v then Node( a, v, add b e ) else
>>>>               raise Present
>>>>         in
>>>>           try add tree element with Present -> tree
>>>>
>>>>     end
>>>
>>> I'm not getting it. I'd write 'else t' rather than 'else raise
>>> Present' and I could then do away with the wrapping 'add' function.
>>
>> Doing that would result in returning a valid tree, but it would also
>> needlessly replicate all nodes starting from the point where the
>> match was found and going up to the root. The consequence is more
>> cycles used and more demand on the memory allocator.
>
> How can OCaml avoid making the allocations until it knows the exception
> won't be raised? Are they not all still being done?
No, no allocations are done if the to-be-added element is found
(and so an exception is raised). Consider one of the lines where a
Node is being constructed (which is synonymous with an allocation
occurring):
    if e < v then Node( add a e, v, b ) else
The call to the Node() constructor doesn't take place until the
recursive call to 'add' returns. But if an equal value is found,
then an exception is raised, and the call to 'add' does not /ever/
return. No return from 'add' means no Node() is constructed.
To say this another way, looking for a point where an insertion
should take place, or where there is an equal value, happens going
/down/ the call chain. Node() allocations, however, happen only
when we are coming back /up/ the call chain. Raising an exception
"short circuits" all of those pending returns: they never happen,
and so no Node()s are constructed.
>> The technique
>> of raising an exception avoids those shortcomings (and also it
>> preserves tree identity, which can be important in some situations).
>
> I agree about the identity. I am not used to relying on the identity in
> situations like this (probably because of Haskell!) but I might find
> some time to work out how to preserve identity without an exception. I
> fear it will be messier than your simple solution.
SPOILER ALERT - at the end of this posting I am including code
for a version of add (called add') that does not use exceptions
and preserves identity when the "new" element is already present.
>>> I worry we are getting away from C, but the point is likely to be
>>> much simpler to make in OCaml (or Caml).
>>
>> I have another example. The example doesn't include any C code, but
>> I will talk about some concerns that pertain to C.
>>
>> Suppose we are writing a compiler for a large programming language.
>> The compiler should make use of available resources, especially
>> memory resources, when it can, but also should be able to compile
>> large programs even when the amount of memory available is much more
>> limited. How much memory is available is not known in advance; we
>> find out when a call to malloc() or realloc() returns NULL. I note
>> in passing that the PL/I(F) compiler from IBM could translate full
>> PL/I even if limited to only 44 K bytes of user memory; to do that
>> it needed 110 physical passes over source text or structures created
>> and stored in disk files.
>>
>> The idea in essence is to have two compilers in one: one taking a
>> conventional approach using regular dynamic memory allocation (in
>> other words malloc() etc), and the other using an IBM-style multiple
>> passes scheme that stores intermediate data in disk files. To
>> compile a program we start the first compiler under an umbrella of
>> setjmp(), which going forward assumes all memory allocations will
>> succeed. To allocate memory we call wrapper functions for malloc()
>> or realloc(), which will do a longjmp() if an allocation fails.
>> Upon receipt of the longjmp() "exception" the outer setjmp() call
>> starts over using the second approach, which is of course much
>> slower but also much less demanding of main memory.
>>
>> To make this work we might have to track malloc()'s and free()'s so
>> that any not-yet-released memory can be reclaimed before beginning
>> the alternate system, but I think that is straightforward and not in
>> need of any further explanation.
>
> I am not convinced it's neater to use longjmp here. I prefer to write
> code where allocation failures just propagate up the call stack. That
> would give me a top-level failure return from the try_fast_compile()
> function call.
There was a time when I would normally handle unusual conditions
locally, and pooh-poohed using exceptions (and setjmp()/longjmp()).
Working in OCaml has caused me (I speculate) to revise my thoughts
on the question.
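The two-compilers-in-one scheme quoted above might look roughly like this in C. It is a minimal sketch under assumptions, not anything posted in the thread: xmalloc, release_all, fast_compile, slow_compile, compile and the fixed-size tracking table are hypothetical, and the two "compilers" are stand-ins that only report which path ran.

```c
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf out_of_memory;   /* set up by compile(), used by xmalloc() */

/* Hypothetical tracking table so unreleased blocks can be reclaimed. */
#define MAX_LIVE 1024
static void *live[MAX_LIVE];
static size_t n_live;

/* malloc() wrapper: callers may assume it succeeds, because failure
   (or a full tracking table) never returns here; it jumps instead. */
static void *xmalloc(size_t size)
{
    void *p = (n_live < MAX_LIVE) ? malloc(size) : NULL;
    if (p == NULL)
        longjmp(out_of_memory, 1);
    live[n_live++] = p;
    return p;
}

/* Free everything still recorded in the tracking table. */
static void release_all(void)
{
    while (n_live > 0)
        free(live[--n_live]);
}

/* Stand-in for the conventional in-memory compiler. */
static int fast_compile(size_t workspace)
{
    char *buf = xmalloc(workspace);  /* no NULL check, by design */
    (void)buf;                       /* a real compiler would use it */
    return 1;
}

/* Stand-in for the slow multi-pass compiler that keeps data on disk. */
static int slow_compile(size_t workspace)
{
    (void)workspace;
    return 2;
}

/* Returns 1 if the fast path completed, 2 if the fallback ran. */
int compile(size_t workspace)
{
    if (setjmp(out_of_memory) == 0) {
        int which = fast_compile(workspace);
        release_all();
        return which;
    }
    release_all();                   /* reclaim what the fast path left */
    return slow_compile(workspace);
}
```

The design choice on display is that fast_compile() and everything beneath it contain no failure-propagation code at all; the only places that know about running out of memory are xmalloc() and compile(). The alternative preferred in the quoted reply would instead have every allocating function return a failure indication up the call stack.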
SPOILER ALERT - continue scrolling only when ready to see the code
for an exception-free version of the add function...
Here it is...
    let add' tree element =
      let rec add t e =
        match t with
        | Empty -> Some (Node( Empty, e, Empty ))
        | Node( a, v, b ) ->
            if e < v then (
              match add a e with
              | None -> None
              | Some a' -> Some( Node( a', v, b ) )
            ) else if e > v then (
              match add b e with
              | None -> None
              | Some b' -> Some( Node( a, v, b' ) )
            ) else None
      in
      match add tree element with
      | Some t -> t
      | None -> tree
The original version is 10 lines, the revised version 18 lines.
That ratio is higher than what I would expect for an analogous
change to the full version (which is much longer than the simple
non-rebalancing version, because it has more cases to deal with,
and needs to keep the tree balanced), but not a lot higher - maybe
only 40 or 50% longer instead of 80% longer.
| From | Ben Bacarisse <ben.usenet@bsb.me.uk> |
|---|---|
| Date | 2023-10-01 20:49 +0100 |
| Subject | Re: Libraries using longjmp for error handling |
| Message-ID | <874jjaced5.fsf@bsb.me.uk> |
| In reply to | #176876 |
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>
>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>
>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>>
>>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>>
>>>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>>>>
>>>>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>>>>
>>>>>>> [..how should errors be handled..]
>>>>>>>
>>>>>>>> How would you prefer these sorts of thing to be signalled?
>>>>>>>
>>>>>>> Let me give just one example.
>>>>>>>
>>>>>>> I wrote some code not long ago to add a value to a set of values,
>>>>>>> where the set is represented using a recursive binary tree structure
>>>>>>> (like red-black trees, but a little different). Usually we may
>>>>>>> expect that a request to add a value would offer a value not already
>>>>>>> included in the set, but certainly it can happen that there is a
>>>>>>> call to add a value that is already included, in which case the top
>>>>>>> level return value should be the original tree structure.
>>>>>>>
>>>>>>> I thought it would be easier to write a simple recursive routine
>>>>>>> that assumes the to-be-added value is not yet included in the set,
>>>>>>> and if that assumption is violated raise an exception that is caught
>>>>>>> at the outermost level and simply returns the original argument tree
>>>>>>> value. Certainly the code could have been written to handle the
>>>>>>> already-present situation at each level of the recursion, but it was
>>>>>>> much easier and much cleaner to handle it by raising an exception.
>>>>>>
>>>>>> This is one of those cases where I just have to take your word for it,
>>>>>> (and I am happy to do that).
>>>>>
>>>>> Responding here to just this one part.
>>>>>
>>>>> Here is a simpler version of the code, for an ordinary
>>>>> binary tree, and without any rebalancing:
>>>>>
>>>>> module TreeSet = struct
>>>>>   type 'a treeset = Empty | Node of 'a treeset * 'a * 'a treeset
>>>>>   exception Present
>>>>>
>>>>>   let add tree element =
>>>>>     let rec add t e =
>>>>>       match t with
>>>>>       | Empty -> Node( Empty, e, Empty )
>>>>>       | Node( a, v, b ) ->
>>>>>           if e < v then Node( add a e, v, b ) else
>>>>>           if e > v then Node( a, v, add b e ) else
>>>>>           raise Present
>>>>>     in
>>>>>     try add tree element with Present -> tree
>>>>>
>>>>> end
>>>>
>>>> I'm not getting it. I'd write 'else t' rather than 'else raise
>>>> Present' and I could then do away with the wrapping 'add' function.
>>>
>>> Doing that would result in returning a valid tree, but it would also
>>> needlessly replicate all nodes starting from the point where the
>>> match was found and going up to the root. The consequence is more
>>> cycles used and more demand on the memory allocator.
>>
>> How can OCaml avoid making the allocations until it knows the exception
>> won't be raised? Are they not all still being done?
>
> No, no allocations are done if the to-be-added element is found
> (and so an exception is raised). Consider one of the lines where a
> Node is being constructed (which is synonymous with an allocation
> occurring):
>
>     if e < v then Node( add a e, v, b ) else
>
> The call to the Node() constructor doesn't take place until the
> recursive call to 'add' returns. But if an equal value is found,
> then an exception is raised, and the call to 'add' does not /ever/
> return. No return from 'add' means no Node() is constructed.
Ah, yes. I was thinking in a lazy language (Haskell) whilst looking at
code in a strict one.
-- 
Ben.