| Started by | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| First post | 2015-12-31 11:09 +1100 |
| Last post | 2015-12-31 17:17 +0100 |
| Articles | 20 on this page of 41 — 14 participants |
raise None Steven D'Aprano <steve@pearwood.info> - 2015-12-31 11:09 +1100
Re: raise None Paul Rubin <no.email@nospam.invalid> - 2015-12-30 16:19 -0800
Validation in Python (was: raise None) Ben Finney <ben+python@benfinney.id.au> - 2015-12-31 11:26 +1100
Re: raise None Chris Angelico <rosuav@gmail.com> - 2015-12-31 11:38 +1100
Re: raise None Steven D'Aprano <steve@pearwood.info> - 2015-12-31 12:26 +1100
Re: raise None Ben Finney <ben+python@benfinney.id.au> - 2015-12-31 12:44 +1100
Re: raise None Steven D'Aprano <steve@pearwood.info> - 2015-12-31 15:07 +1100
Re: raise None Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2015-12-31 12:19 +0000
Re: raise None Steven D'Aprano <steve@pearwood.info> - 2016-01-01 02:35 +1100
Re: raise None Chris Angelico <rosuav@gmail.com> - 2016-01-01 02:53 +1100
Re: raise None Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2015-12-31 16:46 +0000
Re: raise None Steven D'Aprano <steve@pearwood.info> - 2016-01-01 04:50 +1100
Re: raise None "Martin A. Brown" <martin@linux-ip.net> - 2015-12-31 09:30 -0800
Re: raise None Ben Finney <ben+python@benfinney.id.au> - 2016-01-01 07:18 +1100
Re: raise None Johannes Bauer <dfnsonfsduifb@gmx.de> - 2016-01-02 12:47 +0100
Re: raise None Chris Angelico <rosuav@gmail.com> - 2016-01-01 09:48 +1100
Re: raise None Steven D'Aprano <steve@pearwood.info> - 2016-01-04 16:19 +1100
Re: raise None Dan Sommers <dan@tombstonezero.net> - 2016-01-04 06:09 +0000
Re: raise None Rustom Mody <rustompmody@gmail.com> - 2016-01-03 22:39 -0800
Re: raise None Ben Finney <ben+python@benfinney.id.au> - 2016-01-01 10:27 +1100
Re: raise None Marko Rauhamaa <marko@pacujo.net> - 2016-01-01 02:29 +0200
Re: raise None Steven D'Aprano <steve@pearwood.info> - 2016-01-04 16:19 +1100
Re: raise None Rustom Mody <rustompmody@gmail.com> - 2016-01-03 21:53 -0800
Re: raise None Rustom Mody <rustompmody@gmail.com> - 2016-01-04 03:55 -0800
Re: raise None Rustom Mody <rustompmody@gmail.com> - 2016-01-03 21:53 -0800
Re: raise None Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-12-31 23:36 +0000
Re: raise None Chris Angelico <rosuav@gmail.com> - 2016-01-01 10:39 +1100
Re: raise None Chris Angelico <rosuav@gmail.com> - 2016-01-01 10:41 +1100
Re: raise None Rustom Mody <rustompmody@gmail.com> - 2016-01-03 19:04 -0800
Re: raise None Chris Angelico <rosuav@gmail.com> - 2016-01-04 14:31 +1100
Re: raise None Steven D'Aprano <steve@pearwood.info> - 2016-01-04 14:48 +1100
Re: raise None Chris Angelico <rosuav@gmail.com> - 2016-01-04 14:56 +1100
Re: raise None Rustom Mody <rustompmody@gmail.com> - 2016-01-03 20:46 -0800
Re: raise None Christian Gollwitzer <auriocus@gmx.de> - 2016-01-04 08:28 +0100
Re: raise None Chris Angelico <rosuav@gmail.com> - 2015-12-31 13:12 +1100
Re: raise None Cameron Simpson <cs@zip.com.au> - 2015-12-31 15:03 +1100
Re: raise None Steven D'Aprano <steve@pearwood.info> - 2015-12-31 16:12 +1100
Re: raise None Cameron Simpson <cs@zip.com.au> - 2015-12-31 16:45 +1100
Re: raise None Terry Reedy <tjreedy@udel.edu> - 2015-12-30 23:00 -0500
Re: raise None Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-12-31 15:58 +0000
Re: raise None Johannes Bauer <dfnsonfsduifb@gmx.de> - 2015-12-31 17:17 +0100
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2015-12-31 11:09 +1100 |
| Subject | raise None |
| Message-ID | <56847239$0$1590$c3e8da3$5496439d@news.astraweb.com> |
I have a lot of functions that perform the same argument checking each time:
    def spam(a, b):
        if condition(a) or condition(b): raise TypeError
        if other_condition(a) or something_else(b): raise ValueError
        if whatever(a): raise SomethingError
        ...

    def eggs(a, b):
        if condition(a) or condition(b): raise TypeError
        if other_condition(a) or something_else(b): raise ValueError
        if whatever(a): raise SomethingError
        ...
Since the code is repeated, I naturally pull it out into a function:
    def _validate(a, b):
        if condition(a) or condition(b): raise TypeError
        if other_condition(a) or something_else(b): raise ValueError
        if whatever(a): raise SomethingError

    def spam(a, b):
        _validate(a, b)
        ...

    def eggs(a, b):
        _validate(a, b)
        ...
But when the argument checking fails, the traceback shows the error
occurring in _validate, not eggs or spam. (Naturally, since that is where
the exception is raised.) That makes the traceback more confusing than it
need be.
So I can change the raise to return in the _validate function:
    def _validate(a, b):
        if condition(a) or condition(b): return TypeError
        if other_condition(a) or something_else(b): return ValueError
        if whatever(a): return SomethingError
and then write spam and eggs like this:
    def spam(a, b):
        ex = _validate(a, b)
        if ex is not None: raise ex
        ...
It's not much of a gain though. I save an irrelevant level in the traceback,
but only at the cost of an extra line of code everywhere I call the
argument checking function.
But suppose we allowed "raise None" to do nothing. Then I could rename
_validate to _if_error and write this:
    def spam(a, b):
        raise _if_error(a, b)
        ...
and have the benefits of "Don't Repeat Yourself" without the unnecessary,
and misleading, extra level in the traceback.
Obviously this doesn't work now, since raise None is an error, but if it did
work, what do you think?
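For reference, this is roughly what "raise None" does at present. It is only a minimal sketch, assuming Python 3 (the exact error text may vary between versions); the name spam simply mirrors the post above.

```python
def spam():
    # The operand of raise must be an exception class or instance,
    # so raising None fails with a TypeError of its own.
    raise None

try:
    spam()
except TypeError as err:
    outcome = "TypeError: " + str(err)
else:
    outcome = "no exception"

print(outcome)
```

So the proposal would turn what is currently an error into a silent no-op.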
--
Steven
| From | Paul Rubin <no.email@nospam.invalid> |
|---|---|
| Date | 2015-12-30 16:19 -0800 |
| Message-ID | <87d1tnxydx.fsf@jester.gateway.pace.com> |
| In reply to | #101027 |
Steven D'Aprano <steve@pearwood.info> writes:
> def _validate(a, b):
>     if condition(a) or condition(b): return TypeError
>     ...
> Obviously this doesn't work now, since raise None is an error, but if it did
> work, what do you think?

Never occurred to me. But in some analogous situations I've caught the
exception inside _validate, then peeled away some layers of the
traceback from the exception output before throwing again.
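One possible reading of "peeling away layers" can be sketched as follows. This is not Paul's code: it assumes Python 3, catches the exception in the caller rather than inside _validate, and uses a hypothetical helper frame_names only to inspect the result.

```python
import traceback

def _validate(a):
    if a < 0:
        raise ValueError("argument must be non-negative")

def spam(a):
    try:
        _validate(a)
    except ValueError as exc:
        # Discard the frames recorded so far (including _validate's) and
        # re-raise; the new traceback starts again at this line in spam.
        raise exc.with_traceback(None)
    return a + 1

def frame_names(exc):
    """Return the function names in exc's traceback, outermost first."""
    return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]

try:
    spam(-1)
except ValueError as caught:
    names = frame_names(caught)

print(names)
```

The cost is a try/except in every caller, which is no shorter than the "if ex is not None: raise ex" line from the original post.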
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2015-12-31 11:26 +1100 |
| Subject | Validation in Python (was: raise None) |
| Message-ID | <mailman.94.1451521806.11925.python-list@python.org> |
| In reply to | #101027 |
Steven D'Aprano <steve@pearwood.info> writes:
> I have a lot of functions that perform the same argument checking each
> time:
Not an answer to the question you ask, but: Have you tried the data
validation library “voluptuous”?
Voluptuous, despite the name, is a Python data validation library.
It is primarily intended for validating data coming into Python as
JSON, YAML, etc.
It has three goals:
Simplicity.
Support for complex data structures.
Provide useful error messages.
<URL:https://pypi.python.org/pypi/voluptuous/>
Seems like a good way to follow Don't Repeat Yourself in code that needs
a lot of validation of inputs.
--
\ “It's my belief we developed language because of our deep inner |
`\ need to complain.” —Jane Wagner, via Lily Tomlin |
_o__) |
Ben Finney
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-12-31 11:38 +1100 |
| Message-ID | <mailman.96.1451522295.11925.python-list@python.org> |
| In reply to | #101027 |
On Thu, Dec 31, 2015 at 11:09 AM, Steven D'Aprano <steve@pearwood.info> wrote:
> I have a lot of functions that perform the same argument checking each time:
>
> def spam(a, b):
>     if condition(a) or condition(b): raise TypeError
>     if other_condition(a) or something_else(b): raise ValueError
>     if whatever(a): raise SomethingError
>     ...
>
> def eggs(a, b):
>     if condition(a) or condition(b): raise TypeError
>     if other_condition(a) or something_else(b): raise ValueError
>     if whatever(a): raise SomethingError
>     ...
>
>
> Since the code is repeated, I naturally pull it out into a function:
>
> def _validate(a, b):
>     if condition(a) or condition(b): raise TypeError
>     if other_condition(a) or something_else(b): raise ValueError
>     if whatever(a): raise SomethingError
>
> def spam(a, b):
>     _validate(a, b)
>     ...
>
> def eggs(a, b):
>     _validate(a, b)
>     ...
>
>
> But when the argument checking fails, the traceback shows the error
> occurring in _validate, not eggs or spam. (Naturally, since that is where
> the exception is raised.) That makes the traceback more confusing than it
> need be.

If the validation really is the same in all of them, then is it a
problem to see the validation function in the traceback? Its purpose
isn't simply "raise an exception", but "validate a specific set of
inputs". That sounds like a perfectly reasonable traceback line to me
(imagine if your validation function has a bug).

ChrisA
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2015-12-31 12:26 +1100 |
| Message-ID | <5684842a$0$1596$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #101032 |
On Thu, 31 Dec 2015 11:38 am, Chris Angelico wrote:

> On Thu, Dec 31, 2015 at 11:09 AM, Steven D'Aprano <steve@pearwood.info>
> wrote:
>> I have a lot of functions that perform the same argument checking each
>> time:
>>
>> def spam(a, b):
>>     if condition(a) or condition(b): raise TypeError
>>     if other_condition(a) or something_else(b): raise ValueError
>>     if whatever(a): raise SomethingError
>>     ...
>>
>> def eggs(a, b):
>>     if condition(a) or condition(b): raise TypeError
>>     if other_condition(a) or something_else(b): raise ValueError
>>     if whatever(a): raise SomethingError
>>     ...
>>
>>
>> Since the code is repeated, I naturally pull it out into a function:
>>
>> def _validate(a, b):
>>     if condition(a) or condition(b): raise TypeError
>>     if other_condition(a) or something_else(b): raise ValueError
>>     if whatever(a): raise SomethingError
>>
>> def spam(a, b):
>>     _validate(a, b)
>>     ...
>>
>> def eggs(a, b):
>>     _validate(a, b)
>>     ...
>>
>>
>> But when the argument checking fails, the traceback shows the error
>> occurring in _validate, not eggs or spam. (Naturally, since that is where
>> the exception is raised.) That makes the traceback more confusing than it
>> need be.
>
> If the validation really is the same in all of them, then is it a
> problem to see the validation function in the traceback? Its purpose
> isn't simply "raise an exception", but "validate a specific set of
> inputs". That sounds like a perfectly reasonable traceback line to me
> (imagine if your validation function has a bug).

Right -- that's *exactly* why it is harmful that the _validate function
shows up in the traceback.

If _validate itself has a bug, then it will raise, and you will see the
traceback:

Traceback (most recent call last):
  File "spam", line 19, in this
  File "spam", line 29, in that
  File "spam", line 39, in other
  File "spam", line 5, in _validate
ThingyError: ...

which tells you that _validate raised an exception and therefore has a bug.

Whereas if _validate does what it is supposed to do, and is working
correctly, you will see:

Traceback (most recent call last):
  File "spam", line 19, in this
  File "spam", line 29, in that
  File "spam", line 39, in other
  File "spam", line 5, in _validate
ThingyError: ...

and the reader has to understand the internal workings of _validate
sufficiently to infer that this exception is not a bug in _validate but an
expected failure mode of other when you pass a bad argument.

Now obviously one can do that. It's often not even very hard: most bugs are
obviously bugs, and the ThingyError will surely come with a descriptive
error message like "Argument out of range" in the second case.

In the case where _validate *returns* the exception instead of raising it,
and the calling function (in this case other) raises, you see this in the
case of a bug in _validate:

Traceback (most recent call last):
  File "spam", line 19, in this
  File "spam", line 29, in that
  File "spam", line 39, in other
  File "spam", line 5, in _validate
ThingyError: ...

and this is the case of a bad argument to other:

Traceback (most recent call last):
  File "spam", line 19, in this
  File "spam", line 29, in that
  File "spam", line 39, in other
ThingyError: ...

I think this is a win for debuggability. (Is that a word?) But it's a bit
annoying to do it today, since you have to save the return result and
explicitly compare it to None. If "raise None" was a no-op, it would feel
more natural to just say raise _validate() and trust that if _validate
falls out the end and returns None, the raise will be a no-op.

--
Steven
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2015-12-31 12:44 +1100 |
| Message-ID | <mailman.97.1451526250.11925.python-list@python.org> |
| In reply to | #101033 |
Steven D'Aprano <steve@pearwood.info> writes:

> Traceback (most recent call last):
>   File "spam", line 19, in this
>   File "spam", line 29, in that
>   File "spam", line 39, in other
>   File "spam", line 5, in _validate
> ThingyError: ...
>
> and the reader has to understand the internal workings of _validate
> sufficiently to infer that this exception is not a bug in _validate
> but an expected failure mode of other when you pass a bad argument.

This point seems to advocate for suppressing *any* code that
deliberately raises an exception. Is that your intent?

--
 \ “I don't want to live peacefully with difficult realities, and |
`\ I see no virtue in savoring excuses for avoiding a search for |
_o__) real answers.” —Paul Z. Myers, 2009-09-12 |
Ben Finney
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2015-12-31 15:07 +1100 |
| Message-ID | <5684aa1a$0$1602$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #101034 |
On Thu, 31 Dec 2015 12:44 pm, Ben Finney wrote:

> Steven D'Aprano <steve@pearwood.info> writes:
>
>> Traceback (most recent call last):
>>   File "spam", line 19, in this
>>   File "spam", line 29, in that
>>   File "spam", line 39, in other
>>   File "spam", line 5, in _validate
>> ThingyError: ...
>>
>> and the reader has to understand the internal workings of _validate
>> sufficiently to infer that this exception is not a bug in _validate
>> but an expected failure mode of other when you pass a bad argument.
>
> This point seems to advocate for suppressing *any* code that
> deliberately raises an exception. Is that your intent?

No. The issue isn't that an exception is deliberately raised. The issue is
that it is deliberately raised in a function separate from where the
exception conceptually belongs. The exception is conceptually part of
function "other", and was only refactored into a separate function
_validate to avoid repeating the same validation code in multiple places.
It is a mere implementation detail that the exception is actually raised
inside _validate rather than other.

As an implementation detail, exposing it to the user (in the form of a line
in the stacktrace) doesn't help debugging. At best it is neutral (the user
reads the error message and immediately realises that the problem lies with
bad arguments passed to other, and _validate has nothing to do with it). At
worst it actively misleads the user into thinking that there is a bug in
_validate.

--
Steven
| From | Oscar Benjamin <oscar.j.benjamin@gmail.com> |
|---|---|
| Date | 2015-12-31 12:19 +0000 |
| Message-ID | <mailman.107.1451564424.11925.python-list@python.org> |
| In reply to | #101039 |
On 31 December 2015 at 04:07, Steven D'Aprano <steve@pearwood.info> wrote:
> On Thu, 31 Dec 2015 12:44 pm, Ben Finney wrote:
>
>> Steven D'Aprano <steve@pearwood.info> writes:
>>
>>> Traceback (most recent call last):
>>> File "spam", line 19, in this
>>> File "spam", line 29, in that
>>> File "spam", line 39, in other
>>> File "spam", line 5, in _validate
>>> ThingyError: ...
>>>
>>> and the reader has to understand the internal workings of _validate
>>> sufficiently to infer that this exception is not a bug in _validate
>>> but an expected failure mode of other when you pass a bad argument.
>>
>> This point seems to advocate for suppressing *any* code that
>> deliberately raises an exception. Is that your intent?
>
> No. The issue isn't that an exception is deliberately raised. The issue is
> that it is deliberately raised in a function separate from where the
> exception conceptually belongs. The exception is conceptually part of
> function "other", and was only refactored into a separate function
> _validate to avoid repeating the same validation code in multiple places.
> It is a mere implementation detail that the exception is actually raised
> inside _validate rather than other.
>
> As an implementation detail, exposing it to the user (in the form of a line
> in the stacktrace) doesn't help debugging. At best it is neutral (the user
> reads the error message and immediately realises that the problem lies with
> bad arguments passed to other, and _validate has nothing to do with it). At
> worst it actively misleads the user into thinking that there is a bug in
> _validate.
You're overthinking this. It's fine for the error to come from
_validate. Conceptually the real error is not in _validate or the
function that calls _validate but in whatever function further up the
stack trace created the wrong type of object to pass in. If the user
can see the stack trace and work back to the point where they passed
something in to your function then how does the extra level hurt?
If it really bothers you then you can use a comment that will show up
in the traceback output
_validate(a, b) # Verify arguments to myfunc(a, b)
but really I don't think it's a big deal. The traceback gives you
useful information about where to look for an error/bug but it's still
the programmer's job to interpret that, look at the code, and try to
understand what they have done to cause the problem.
--
Oscar
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-01-01 02:35 +1100 |
| Message-ID | <56854b49$0$1615$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #101053 |
On Thu, 31 Dec 2015 11:19 pm, Oscar Benjamin wrote:

> On 31 December 2015 at 04:07, Steven D'Aprano <steve@pearwood.info> wrote:
[...]
>> As an implementation detail, exposing it to the user (in the form of a
>> line in the stacktrace) doesn't help debugging. At best it is neutral
>> (the user reads the error message and immediately realises that the
>> problem lies with bad arguments passed to other, and _validate has
>> nothing to do with it). At worst it actively misleads the user into
>> thinking that there is a bug in _validate.
>
> You're overthinking this.

Maybe. As I have suggested a number of times now, I'm aware that this is
just a marginal issue.

But I think it is a real issue. I believe in beautiful tracebacks that give
you just the right amount of information, neither too little nor too much.
Debugging is hard enough without being given more information than you need
and having to decide what bits to ignore and which are important.

(Aside: does anyone else hate the tracebacks given by PyCharm? We've had a
number of people posting tracebacks of errors from PyCharm recently, and in
my opinion they drown you in irrelevant detail.)

> It's fine for the error to come from
> _validate. Conceptually the real error is not in _validate or the
> function that calls _validate but in whatever function further up the
> stack trace created the wrong type of object to pass in.

That may be so, but that could be *anywhere* in the call chain. The ultimate
cause of the error may not even appear in the call chain.

The principle is that errors should be raised as close to their cause as
possible. If I call spam(a, b) and provide bad arguments, the earliest I can
possibly detect that is in spam. (Only spam knows what it accepts as
arguments.) Any additional levels beyond spam (like _validate) is moving
further away:

  File "spam", line 19, in this
  File "spam", line 29, in that      <--- where the error really lies
  File "spam", line 39, in other
  File "spam", line 89, in spam      <--- the first place we could detect it
  File "spam", line 5, in _validate  <--- where we actually detect it

> If the user
> can see the stack trace and work back to the point where they passed
> something in to your function then how does the extra level hurt?

It hurts precisely because it is one extra level. I acknowledge that it is
*only* one extra level. (I told you this was a marginal benefit.)

If one extra level is okay, might two extra be okay? How about three? What
about thirty? Where would you draw the line?

> If it really bothers you then you can use a comment that will show up
> in the traceback output
>
> _validate(a, b) # Verify arguments to myfunc(a, b)

No, that can't work. (Aside from the fact that in the most general case, the
source code may no longer be available to read.) The whole point of moving
the validation code into a function was to share it between a number of
functions.

> but really I don't think it's a big deal. The traceback gives you
> useful information about where to look for an error/bug but it's still
> the programmer's job to interpret that, look at the code, and try to
> understand what they have done to cause the problem.

Sure. And I believe that this technique will make the programmer's job just
a little bit easier.

--
Steven
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2016-01-01 02:53 +1100 |
| Message-ID | <mailman.108.1451577203.11925.python-list@python.org> |
| In reply to | #101058 |
On Fri, Jan 1, 2016 at 2:35 AM, Steven D'Aprano <steve@pearwood.info> wrote:
>> If the user
>> can see the stack trace and work back to the point where they passed
>> something in to your function then how does the extra level hurt?
>
> It hurts precisely because it is one extra level. I acknowledge that it is
> *only* one extra level. (I told you this was a marginal benefit.)
>
> If one extra level is okay, might two extra be okay? How about three? What
> about thirty? Where would you draw the line?
>

It becomes something to get used to when you work with a particular
library. Several of my students have run into this with matplotlib or
sklearn; you make a mistake with a parameter to function X, which just
takes that as-is and passes it to function Y, which does some
manipulation but doesn't trip the error, and then calls through to
function Z, which notices that one parameter doesn't match another,
and raises an exception. You get used to scrolling way up to find the
actual cause of the error.

Whether that supports or contradicts your point, I'm not sure.

ChrisA
| From | Oscar Benjamin <oscar.j.benjamin@gmail.com> |
|---|---|
| Date | 2015-12-31 16:46 +0000 |
| Message-ID | <mailman.111.1451580417.11925.python-list@python.org> |
| In reply to | #101058 |
On 31 Dec 2015 15:54, "Chris Angelico" <rosuav@gmail.com> wrote:
>
> On Fri, Jan 1, 2016 at 2:35 AM, Steven D'Aprano <steve@pearwood.info> wrote:
> >> If the user
> >> can see the stack trace and work back to the point where they passed
> >> something in to your function then how does the extra level hurt?
> >
> > It hurts precisely because it is one extra level. I acknowledge that it is
> > *only* one extra level. (I told you this was a marginal benefit.)
> >
> > If one extra level is okay, might two extra be okay? How about three? What
> > about thirty? Where would you draw the line?
> >
>
> It becomes something to get used to when you work with a particular
> library. Several of my students have run into this with matplotlib or
> sklearn; you make a mistake with a parameter to function X, which just
> takes that as-is and passes it to function Y, which does some
> manipulation but doesn't trip the error, and then calls through to
> function Z, which notices that one parameter doesn't match another,
> and raises an exception. You get used to scrolling way up to find the
> actual cause of the error

Exactly. The critical technique is looking at the traceback and splitting
it between what's your code and what's someone else's. Hopefully you don't
need to look at steves_library.py to figure out what you did wrong.
However if you do need to look at Steve's code you're now stumped because
he's hidden the actual line that raises. All you know now is that
somewhere in _validate the raise happened. Why hide that piece of
information and complicate the general interpretation of stack traces?

Actually matplotlib is a particularly tricky case as often the arguments
you pass are stored and not accessed until later. So the traceback shows
an error in the call to show() rather than e.g. legend(). Usually I can
glean pretty quickly from the traceback that e.g. the legend labels are
at fault, though.

--
Oscar
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-01-01 04:50 +1100 |
| Message-ID | <56856b01$0$1597$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #101065 |
On Fri, 1 Jan 2016 03:46 am, Oscar Benjamin wrote:
[...]
> Exactly. The critical technique is looking at the traceback and splitting
> it between what's your code and what's someone else's. Hopefully you don't
> need to look at steves_library.py to figure out what you did wrong.
> However if you do need to look at Steve's code you're now stumped because
> he's hidden the actual line that raises. All you know now is that
> somewhere in _validate the raise happened. Why hide that piece of
> information and complicate the general interpretation of stack traces?
No. I don't hide anything. Here's a simple example, minus any hypothetical
new syntax, showing the traditional way and the non-traditional way.
    # example.py
    def _validate(arg):
        if not isinstance(arg, int):
            # traditional error handling: raise in the validation function
            raise TypeError('expected an int')
        if arg < 0:
            # non-traditional: return and raise in the caller
            return ValueError('argument must be non-negative')

    def func(x):
        exc = _validate(x)
        if exc is not None:
            raise exc
        print(x+1)

    def main():
        value = None  # on the second run, edit this to be -1
        func(value)

    main()
And here's the traceback you get in each case. First, the traditional way,
raising directly inside _validate:
[steve@ando tmp]$ python example.py
Traceback (most recent call last):
  File "example.py", line 17, in <module>
    main()
  File "example.py", line 15, in main
    func(value)
  File "example.py", line 8, in func
    exc = _validate(x)
  File "example.py", line 3, in _validate
    raise TypeError('expected an int')
TypeError: expected an int
What do we see? Firstly, the emphasis is on the final call to _validate,
where the exception is actually raised. (As it should be, in the general
case where the exception is an error.) If you're like me, you're used to
skimming the traceback until you get to the last entry, which in this case
is:
File "example.py", line 3, in _validate
and starting to investigate there. But that's a red herring, because
although the exception is raised there, that's not where the error lies.
_validate is pretty much just boring boilerplate that validates the
arguments -- where we really want to start looking is the previous entry,
func, and work backwards from there.
The second thing we see is that the displayed source code for _validate is
entirely redundant:
raise TypeError('expected an int')
gives us *nothing* we don't see from the exception itself:
TypeError: expected an int
This is a pretty simple exception. In a more realistic example, with a
longer and more detailed message, you might see something like this as the
source extract:
raise TypeError(msg)
where the message is set up in the previous line or lines. This is even less
useful to read.
So it is my argument that the traditional way of refactoring parameter
checks, where exceptions are raised in the _validate function, is
sub-optimal. We can do better.
Here's the traceback we get from the non-traditional error handling. I edit
the file to change the value = None line to value = -1 and re-run it:
[steve@ando tmp]$ python example.py
Traceback (most recent call last):
  File "example.py", line 17, in <module>
    main()
  File "example.py", line 15, in main
    func(value)
  File "example.py", line 10, in func
    raise exc
ValueError: argument must be non-negative
Nothing is hidden. We still see the descriptive exception and error message,
and the line
raise exc
is no worse than "raise TypeError(msg)" -- all the detail we need is
immediately below it.
The emphasis here is on the call to func, since that's the last entry in the
call stack. The advantage is that we don't see the irrelevant call to
_validate *unless we go looking for it in the source code*. We start our
investigation where we need to start, namely in func itself.
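The difference between the two tracebacks can also be checked programmatically. The following is only a sketch under assumptions: the two validators are split into separate hypothetical functions (_validate_raising and _validate_returning) so both idioms can run side by side, and innermost_frame is a helper invented here to report the last traceback entry.

```python
import traceback

def _validate_raising(arg):
    # traditional: raise inside the validation function
    if arg < 0:
        raise ValueError('argument must be non-negative')

def _validate_returning(arg):
    # non-traditional: hand the exception back to the caller
    if arg < 0:
        return ValueError('argument must be non-negative')
    return None

def func_traditional(x):
    _validate_raising(x)
    return x + 1

def func_nontraditional(x):
    exc = _validate_returning(x)
    if exc is not None:
        raise exc
    return x + 1

def innermost_frame(func, arg):
    """Call func(arg) and return the name of the last traceback entry."""
    try:
        func(arg)
    except ValueError as err:
        return traceback.extract_tb(err.__traceback__)[-1].name
    return None

print(innermost_frame(func_traditional, -1))     # _validate_raising
print(innermost_frame(func_nontraditional, -1))  # func_nontraditional
```

An exception instance only acquires traceback entries when it is raised, which is why the validator that merely returns it never appears.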
Of course, none of this is mandatory, nor is it new. Although I haven't
tried it, I'm sure that this would work as far back as Python 1.5, since
exceptions are first-class values that can be passed around and raised when
required. It's entirely up to the developer to choose whether this
non-traditional idiom makes sense for their functions or not. Sometimes it
will, and sometimes it won't.
The only new part here is the idea that we could streamline the code in the
caller if "raise None" was a no-op. Instead of writing this:
    exc = _validate(x)
    if exc is not None:
        raise exc

we could write:

    raise _validate(x)
which would make this idiom more attractive.
--
Steven
| From | "Martin A. Brown" <martin@linux-ip.net> |
|---|---|
| Date | 2015-12-31 09:30 -0800 |
| Message-ID | <mailman.112.1451583026.11925.python-list@python.org> |
| In reply to | #101058 |
Hi there,
>>> At worst it actively misleads the user into thinking that there
>>> is a bug in _validate.
Is this "user" a software user or another programmer?
If a software user, then some hint about why the _validate found
unacceptable data might benefit the user's ability to adjust inputs
to the program.
If another programmer, then that person should be able to figure it
out with the full trace. Probably it's not a bug in _validate, but
....it could be. So, it could be a disservice to the diagnostician
to exempt the _validate function from suspicion. Thus, I'd want to
see _validate in the stack trace.
>Maybe. As I have suggested a number of times now, I'm aware that
>this is just a marginal issue.
>
>But I think it is a real issue. I believe in beautiful tracebacks
>that give you just the right amount of information, neither too
>little nor too much. Debugging is hard enough without being given more
>information than you need and having to decide what bits to ignore
>and which are important.
I agree about tracebacks that provide the right amount of
information. If I were a programmer working with the code you are
describing, I would like to know in any traceback that the failed
comparisons (which implement some sort of business logic or sanity
checking) occurred in the _validate function.
In any software system beyond the simplest, code/data tracing would
be required to figure out where the bad data originated.
Since Python allows us to provide ancillary text to any exception,
you could always provide a fuller explanation of the validation
failure. And, while you are at it, you could add the calling
function name to the text to point the programmer faster toward the
probable issue.
Adding one optional parameter to _validate (defaulting to the
caller's function name) would allow you to point the way to a
diagnostician. Here's a _validate function I made up with two silly
comparison tests--where a must be less than b and both a and b
must not be convertible to integers.
    import sys

    def _validate(a, b, func=None):
        if not func:
            # default to the name of the calling function
            func = sys._getframe(1).f_code.co_name
        if a >= b:
            raise ValueError("a cannot be larger than b in " + func)
        if a == int(a) or b == int(b):
            raise TypeError("a, b must not be convertible to int in " + func)
My main point is not so much to identify the calling function or its
caller as to observe that arbitrary text can be used in the exception
message. This should help the poor sap (who is, invariably, diagnosing
the problem at 03:00) realize that the function _validate is not the
problem.
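As a rough illustration of the point above, the sketch below shows what a caller would see in the exception text. The caller name `spam` is made up for the example, only the first of the two checks is kept, and `sys._getframe` is a CPython implementation detail that is not guaranteed to exist in other Python implementations.

```python
import sys

def _validate(a, b, func=None):
    if not func:
        # CPython-specific: look up the name of the calling function.
        func = sys._getframe(1).f_code.co_name
    if a >= b:
        raise ValueError("a cannot be larger than b in " + func)

def spam(a, b):
    # Hypothetical caller; it does not pass its own name explicitly.
    _validate(a, b)
    return b - a

try:
    spam(5.5, 2.5)
except ValueError as err:
    # The message names the caller, not the helper.
    message = str(err)
```

Here `message` ends in "in spam", so the text points at the public function even though the traceback still shows `_validate` as the raising frame.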
>The principle is that errors should be raised as close to their
>cause as possible. If I call spam(a, b) and provide bad arguments,
>the earliest I can possibly detect that is in spam. (Only spam
>knows what it accepts as arguments.) Any additional levels beyond
>spam (like _validate) is moving further away:
>
> File "spam", line 19, in this
> File "spam", line 29, in that <--- where the error really lies
> File "spam", line 39, in other
> File "spam", line 89, in spam <--- the first place we could detect it
> File "spam", line 5, in _validate <--- where we actually detect it
Yes, indeed! Our stock in trade. I never liked function 'that'. I
much prefer function 'this'.
-Martin
Q: Who is Snow White's brother?
A: Egg white. Get the yolk?
--
Martin A. Brown
http://linux-ip.net/
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2016-01-01 07:18 +1100 |
| Message-ID | <mailman.122.1451593111.11925.python-list@python.org> |
| In reply to | #101058 |
Oscar Benjamin <oscar.j.benjamin@gmail.com> writes:

> Exactly. The critical technique is looking at the traceback and
> splitting it between what's your code and what's someone else's.
> Hopefully you don't need to look at steves_library.py to figure out
> what you did wrong. However if you do need to look at Steve's code
> you're now stumped because he's hidden the actual line that raises.

+1.

As best I can tell, Steven is advocating a way to obscure information
from the traceback, on the assumption the writer of a library knows that
I don't want to see it.

Given how very often such decisions make my debugging tasks needlessly
difficult, I'm not seeing how that's a desirable feature.

--
 \     “Firmness in decision is often merely a form of stupidity. It |
  `\     indicates an inability to think the same thing out twice.” |
_o__)                                          -- Henry L. Mencken |
Ben Finney
| From | Johannes Bauer <dfnsonfsduifb@gmx.de> |
|---|---|
| Date | 2016-01-02 12:47 +0100 |
| Message-ID | <n68dd1$noh$1@news.albasani.net> |
| In reply to | #101080 |
On 31.12.2015 21:18, Ben Finney wrote:

> As best I can tell, Steven is advocating a way to obscure information
> from the traceback, on the assumption the writer of a library knows that
> I don't want to see it.

How do you arrive at that conclusion? The line that raises the exception
is exactly the line where you would expect the exception to be raised,
i.e., the one containing the "raise" statement.

What you seem to advocate against is a feature that is ALREADY part of
the language, i.e. raising exceptions by reference to a variable, not
constructing them on-the-go.

Your argument therefore makes no sense in this context.

Cheers,
Johannes

--
>> Where EXACTLY did you say you had predicted the quake?
> At least not publicly!
Ah, the latest and, to this day, most brilliant stunt of our great
cosmologists: the secret prediction.
 - Karl Kaos on Rüdiger Thomas in dsa <hidbv3$om2$1@speranza.aioe.org>
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2016-01-01 09:48 +1100 |
| Message-ID | <mailman.125.1451602117.11925.python-list@python.org> |
| In reply to | #101058 |
On Fri, Jan 1, 2016 at 7:18 AM, Ben Finney <ben+python@benfinney.id.au> wrote:
> Oscar Benjamin <oscar.j.benjamin@gmail.com> writes:
>
>> Exactly. The critical technique is looking at the traceback and
>> splitting it between what's your code and what's someone else's.
>> Hopefully you don't need to look at steves_library.py to figure out
>> what you did wrong. However if you do need to look at Steve's code
>> you're now stumped because he's hidden the actual line that raises.
>
> +1.
>
> As best I can tell, Steven is advocating a way to obscure information
> from the traceback, on the assumption the writer of a library knows that
> I don't want to see it.
>
> Given how very often such decisions make my debugging tasks needlessly
> difficult, I'm not seeing how that's a desirable feature.
What Steven's actually advocating is removing a difference between
Python code and native code. Compare:
>>> class Integer:
...     def __add__(self, other):
...         if isinstance(other, list):
...             raise TypeError("unsupported operand type(s) for +: 'Integer' and 'list'")
...         return 5
...
>>> 7 + []
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'list'
>>> Integer() + []
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 4, in __add__
TypeError: unsupported operand type(s) for +: 'Integer' and 'list'
The default int type is implemented in native code (C in CPython, Java
in Jython, etc). If the addition of an int and something else triggers
TypeError, the last line in the traceback is the last line of Python,
which is the caller. But since Integer is implemented in Python, it
adds another line to the traceback.
Would you advocate adding lines to the first traceback saying:
File "longobject.c", line 3008, in long_add
File "longobject.c", line 1425, in CHECK_BINOP
etc? It might be useful to someone trying to debug an extension
library (or the interpreter itself). Or if it's acceptable to omit
the "uninteresting internals" from tracebacks, then why can't we
declare that some bits of Python code are uninteresting, too?
We already have the means of throwing exceptions into generators,
which "pretends" that the exception happened at that point. Why can't
we throw an exception out to the caller?
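For readers unfamiliar with the generator feature mentioned here, a minimal sketch follows. The generator `worker` and its messages are invented for illustration; the only real API used is the generator's `throw()` method, which raises the exception at the point where the generator is paused.

```python
def worker():
    try:
        yield "ready"
    except ValueError as exc:
        # The exception appears to have been raised at the yield above.
        yield "caught: {}".format(exc)

gen = worker()
first = next(gen)                            # run up to the first yield
second = gen.throw(ValueError("bad input"))  # inject the exception there
```

The caller creates the exception, but the generator handles it as if the `yield` expression itself had raised it.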
I think it's a perfectly reasonable idea, albeit only a small benefit
(and thus not worth heaps of new syntax or anything).
ChrisA
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2016-01-04 16:19 +1100 |
| Message-ID | <568a00f9$0$1617$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #101083 |
On Fri, 1 Jan 2016 09:48 am, Chris Angelico wrote:
> On Fri, Jan 1, 2016 at 7:18 AM, Ben Finney <ben+python@benfinney.id.au>
> wrote:
[...]
>> As best I can tell, Steven is advocating a way to obscure information
>> from the traceback, on the assumption the writer of a library knows that
>> I don't want to see it.
>>
>> Given how very often such decisions make my debugging tasks needlessly
>> difficult, I'm not seeing how that's a desirable feature.
>
> What Steven's actually advocating is removing a difference between
> Python code and native code. Compare:
Well, not quite. What I'm really doing is two-fold:
(1) reminding people that the part of the code which determines the
existence of an error need not be the part of the code which actually calls
raise; and
(2) suggesting a tiny change to the semantics of raise which would make this
idiom easier to use. (Namely, have "raise None" be a no-op.)
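For context, in Python 3 `raise None` is currently an error in its own right: it raises TypeError because None is not an exception. The idiom in point (2) can be approximated today with a small helper, sketched below; the name `raise_if` is an assumption made for this example and is not something proposed in the thread.

```python
def raise_if(exc):
    """Raise exc unless it is None (a stand-in for a no-op 'raise None')."""
    if exc is not None:
        raise exc

# What the language does today with a bare 'raise None'.
try:
    raise None
except TypeError as err:
    current_behaviour = type(err).__name__

# The helper does nothing for None and raises for a real exception.
result = raise_if(None)
try:
    raise_if(ValueError("bad"))
except ValueError as err:
    helper_message = str(err)
```

The drawback of the helper is that it adds its own frame to the traceback, which is what the proposed change to `raise` would avoid.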
I'm saddened but not astonished at just how much opposition there is to
point (1), even though it is something which Python has been capable of
since the earliest 1.x days. Exceptions are first-class objects, and just
because raising an exception immediately after a test is a common idiom:
    if condition:
        raise SomeError('whatever')
doesn't mean it is *always* the best idiom. I have identified a common
situation in my own code where I believe that there is a better idiom. From
the reaction of others, one might think I've suggested getting rid of
exceptions altogether and replacing them with GOTO :-)
Let's step back a bit and consider what we might do if Python were a less
capable language. We might be *forced* to perform error handling via status
codes, passed from function to function as needed, until we reach the top
level of code and either print the error code or the program's intended
output. None of us want that, but maybe there are cases where a less
extreme version of the same thing is useful. Just because I detect an error
condition in one function doesn't necessarily mean I want to trigger an
exception at that point. Sometimes it is useful to delay raising the
exception.
Suppose I write a validation function that returns a status code, perhaps an
int, or an Enum:
    def _validate(arg):
        if condition(arg):
            return CONDITION_ERROR
        elif other_condition(arg):
            return OTHER_ERROR
        return SUCCESS

    def func(x):
        status = _validate(x)
        if status == CONDITION_ERROR:
            raise ConditionError("condition failed")
        elif status == OTHER_ERROR:
            raise OtherError("other condition failed")
        assert status == SUCCESS
        ...
Don't worry about *why* I want to do this, I have my reasons. Maybe I want
to queue up a whole bunch of exceptions before doing something with them,
or conditionally decide whether or not to raise, like unittest. Perhaps I can
do different sorts of processing of different status codes, including
recovery from some:
    def func(x):
        status = _validate(x)
        if status == CONDITION_ERROR:
            warnings.warn(msg)
            x = massage(x)
            status = SUCCESS
        elif status == OTHER_ERROR:
            raise SomeError("an error occurred")
        assert status == SUCCESS
        do_the_real_work(x)
There's no reason to say that I *must* raise an exception the instant I see
a problem.
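A self-contained version of the status-code pattern might look like the sketch below. The `Status` enum, the validation rules (negative numbers are recoverable, non-numbers are not) and the doubling "real work" are all assumptions made so the example runs; they are not taken from the post.

```python
import enum
import warnings

class Status(enum.Enum):
    SUCCESS = 0
    CONDITION_ERROR = 1
    OTHER_ERROR = 2

def _validate(x):
    # Hypothetical rules: report a problem, but never raise here.
    if not isinstance(x, (int, float)):
        return Status.OTHER_ERROR
    if x < 0:
        return Status.CONDITION_ERROR
    return Status.SUCCESS

def func(x):
    status = _validate(x)
    if status is Status.CONDITION_ERROR:
        # Recoverable: warn, repair the input, and carry on.
        warnings.warn("negative input, using its absolute value")
        x = abs(x)
        status = Status.SUCCESS
    elif status is Status.OTHER_ERROR:
        # Unrecoverable: the caller, not the validator, raises.
        raise TypeError("expected a number, got {!r}".format(x))
    assert status is Status.SUCCESS
    return x * 2  # stand-in for do_the_real_work(x)
```

The validator only classifies the input; each caller remains free to recover, warn, or raise as it sees fit.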
But why am I returning a status code? This is Python, not C or bash, and I
have a much more powerful and richer set of values to work with. I can
return an error type, and an error message:
    def _validate(arg):
        if condition(arg):
            return (ConditionError, "condition failed")
        elif other_condition(arg):
            return (OtherError, "something broke")
        return None
But if we've come this far, why mess about with tuples when we have an
object oriented language with first class exception objects?
    def _validate(arg):
        if condition(arg):
            return ConditionError("condition failed")
        elif other_condition(arg):
            return OtherError("something broke")
        return None
The caller func still decides what to do with the status code, and can
process it when needed. If the error is unrecoverable, it can raise. In
that case, the source of the exception is func, not _validate. Just look at
the source code, it tells you right there where the exception comes from:
    def func(x):
        exc = _validate(x)
        if exc is not None:
            raise exc  # <<<< this is the line you see in the traceback
        do_the_real_work(x)
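The claim about the traceback can be checked directly. The sketch below is one way to do it, assuming a made-up rule (negative numbers are invalid) and a hypothetical helper, `frames_in_traceback`, which lists the function names recorded in the traceback of whatever exception the call raises.

```python
import traceback

class ConditionError(ValueError):
    """Hypothetical error type for the example."""

def _validate(arg):
    # Build the exception but return it instead of raising it.
    if arg < 0:
        return ConditionError("condition failed")
    return None

def func(x):
    exc = _validate(x)
    if exc is not None:
        raise exc  # the traceback starts being recorded here
    return x

def frames_in_traceback(callable_, *args):
    """Return the function names recorded in the resulting traceback."""
    try:
        callable_(*args)
    except Exception as err:
        return [frame.name for frame in traceback.extract_tb(err.__traceback__)]
    return []

names = frames_in_traceback(func, -1)
```

Because the exception object only acquires a traceback when it is raised, `names` contains `func` but not `_validate`.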
This isn't "hiding information", but it might be *information hiding*, and
it is certainly no worse than this:
    def spam():
        if condition:
            some_long_message = "something ...".format(
                many, extra, arguments)
            exc = SomeError(some_long_message, data)
            raise exc  # <<<< this is the line you see in the traceback
If I have a lot of functions that use the same exception, I can refactor
them so that building the exception object occurs elsewhere:
    def spam():
        if condition:
            exc = _build_exception()
            raise exc  # <<<< this is STILL the line you see in the traceback
and likewise for actually checking the condition:
    def spam():
        exc = _validate_and_build_exception()
        if exc is not None:
            raise exc  # <<<<<<<<<<<<
Fundamentally, _validate is an implementation detail. The semantics of func
will remain unchanged whether it does error checking inside itself, or
passes it off to another helper function. The very existence of the helper
function is *irrelevant*. We have true separation of concerns:
(1) _validate decides whether some condition (nominally an error
condition) applies or not;
(2) while the caller func decides whether it can recover from
that error or needs to raise.
(Aside: remember in my use-case I'm not talking about a single caller func.
There might be dozens of them.)
If func decides it needs to raise, the fact that _validate made the decision
that the condition applies is irrelevant. The only time it is useful to see
_validate in the traceback is if _validate fails and raises an exception
itself.
--
Steven
| From | Dan Sommers <dan@tombstonezero.net> |
|---|---|
| Date | 2016-01-04 06:09 +0000 |
| Message-ID | <n6d2bm$mgj$1@dont-email.me> |
| In reply to | #101233 |
On Mon, 04 Jan 2016 16:19:51 +1100, Steven D'Aprano wrote:

> (1) reminding people that the part of the code which determines the
> existence of an error need not be the part of the code which actually
> calls raise [...]

Do chained exceptions scratch your itch? I don't have experience with
Python's version of chained exceptions, but I have used them in Java,
and it seems to match your use case rather well. Essentially, each
conceptual layer in the code effectively abstracts the details of the
lower level layer(s), but preserves the details if you're really
interested.

(I've only been following this discussion partially; if this has been
raised (pun intended) before, then just say so. Thanks.)

> I'm saddened but not astonished at just how much opposition there is
> to point (1) ...

I'll echo the sentiment that we're all adults here, and my opinion that
if you're reading tracebacks, then you want as much information as
possible, even if it seemed irrelevant to the library author at the
time.
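For comparison, Python 3's version of chained exceptions uses `raise ... from ...`. The sketch below is only an illustration: `ConfigError` and `parse_port` are hypothetical names, and the example shows a higher layer wrapping a low-level error while keeping the original available on `__cause__`.

```python
class ConfigError(Exception):
    """Hypothetical high-level error raised by the library layer."""

def parse_port(text):
    try:
        return int(text)
    except ValueError as low_level:
        # Abstract the detail, but keep it attached as the cause.
        raise ConfigError("invalid port: {!r}".format(text)) from low_level

try:
    parse_port("eighty")
except ConfigError as err:
    error = err            # keep a reference; 'err' is unbound after the block
    cause = err.__cause__  # the original ValueError
```

An uncaught `ConfigError` would print both tracebacks, joined by a line saying the first exception was the direct cause of the second, so this approach adds information to the traceback instead of trimming it.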
| From | Rustom Mody <rustompmody@gmail.com> |
|---|---|
| Date | 2016-01-03 22:39 -0800 |
| Message-ID | <d524115e-7f18-4b6d-b51a-947f32f099ff@googlegroups.com> |
| In reply to | #101237 |
On Monday, January 4, 2016 at 11:42:51 AM UTC+5:30, Dan Sommers wrote:

> > I'm saddened but not astonished at just how much opposition there is
> > to point (1) ...
>
> I'll echo the sentiment that we're all adults here, and my opinion that
> if you're reading tracebacks, then you want as much information as
> possible, even if it seemed irrelevant to the library author at the
> time.

And I am saddened at how often mediocre Linux system software can throw
a traceback at me -- sometimes for things as basic as a missing
command-line parameter. Increasingly often a Python traceback.

I guess most people here are programmers (and adults) but sometimes we
want to wear a different hat (I think). Being a vanilla user of your OS
is often one such time.
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2016-01-01 10:27 +1100 |
| Message-ID | <mailman.127.1451604459.11925.python-list@python.org> |
| In reply to | #101058 |
Chris Angelico <rosuav@gmail.com> writes:

> On Fri, Jan 1, 2016 at 7:18 AM, Ben Finney <ben+python@benfinney.id.au> wrote:
> > Given how very often such decisions make my debugging tasks
> > needlessly difficult, I'm not seeing how that's a desirable feature.
>
> What Steven's actually advocating is removing a difference between
> Python code and native code.

Sure, but his proposal is to move in the direction of *less* debugging
information.

If I could have the traceback continue into the C code and tell me the
line of C code that raised the exception, *that's* what I'd choose.

The debugging information barrier of the C–Python boundary is a
practical limitation, not a desirable one. I think those barriers should
be as few as possible, and don't agree with enabling more of them.

--
 \    “Welchen Teil von ‘Gestalt’ verstehen Sie nicht? [What part of |
  `\          ‘gestalt’ don't you understand?]” -- Karsten M. Self |
_o__)                                                               |
Ben Finney