


None versus MISSING sentinel -- request for design feedback

Started by: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
First post: 2011-07-15 15:28 +1000
Last post: 2011-07-15 11:02 -0600
Articles: 3 on this page of 23; 12 participants



Contents

  None versus MISSING sentinel -- request for design feedback Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-07-15 15:28 +1000
    Re: None versus MISSING sentinel -- request for design feedback Chris Angelico <rosuav@gmail.com> - 2011-07-15 16:08 +1000
      Re: None versus MISSING sentinel -- request for design feedback "bruno.desthuilliers@gmail.com" <bruno.desthuilliers@gmail.com> - 2011-07-15 00:53 -0700
      Re: None versus MISSING sentinel -- request for design feedback Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-07-15 20:46 +1000
        Re: None versus MISSING sentinel -- request for design feedback Chris Angelico <rosuav@gmail.com> - 2011-07-15 21:04 +1000
          Re: None versus MISSING sentinel -- request for design feedback Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-07-15 22:48 +1000
            Re: None versus MISSING sentinel -- request for design feedback Mel <mwilson@the-wire.com> - 2011-07-15 09:16 -0400
              Re: None versus MISSING sentinel -- request for design feedback Ethan Furman <ethan@stoneleaf.us> - 2011-07-15 10:18 -0700
                Re: None versus MISSING sentinel -- request for design feedback Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2011-07-16 14:24 +1200
                  Re: None versus MISSING sentinel -- request for design feedback Ethan Furman <ethan@stoneleaf.us> - 2011-07-16 17:31 -0700
    Re: None versus MISSING sentinel -- request for design feedback Rob Williscroft <rtw@rtw.me.uk> - 2011-07-15 07:43 +0000
      Re: None versus MISSING sentinel -- request for design feedback Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-07-15 20:19 +1000
        Re: None versus MISSING sentinel -- request for design feedback "OKB (not okblacke)" <brenNOSPAMbarn@NObrenSPAMbarn.net> - 2011-07-15 17:40 +0000
        Re: None versus MISSING sentinel -- request for design feedback Terry Reedy <tjreedy@udel.edu> - 2011-07-15 17:35 -0400
    Re: None versus MISSING sentinel -- request for design feedback Cameron Simpson <cs@zip.com.au> - 2011-07-15 17:44 +1000
      Re: None versus MISSING sentinel -- request for design feedback "bruno.desthuilliers@gmail.com" <bruno.desthuilliers@gmail.com> - 2011-07-15 02:58 -0700
      Re: None versus MISSING sentinel -- request for design feedback Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-07-15 20:17 +1000
        Re: None versus MISSING sentinel -- request for design feedback Cameron Simpson <cs@zip.com.au> - 2011-07-15 20:38 +1000
    Re: None versus MISSING sentinel -- request for design feedback "bruno.desthuilliers@gmail.com" <bruno.desthuilliers@gmail.com> - 2011-07-15 00:59 -0700
    Re: None versus MISSING sentinel -- request for design feedback Teemu Likonen <tlikonen@iki.fi> - 2011-07-15 11:28 +0300
      Re: None versus MISSING sentinel -- request for design feedback "bruno.desthuilliers@gmail.com" <bruno.desthuilliers@gmail.com> - 2011-07-15 03:02 -0700
        Re: None versus MISSING sentinel -- request for design feedback Teemu Likonen <tlikonen@iki.fi> - 2011-07-15 13:56 +0300
    Re: None versus MISSING sentinel -- request for design feedback Eric Snow <ericsnowcurrently@gmail.com> - 2011-07-15 11:02 -0600



#9531

From: "bruno.desthuilliers@gmail.com" <bruno.desthuilliers@gmail.com>
Date: 2011-07-15 03:02 -0700
Message-ID: <7f96b52b-cb4f-4c13-9cff-aad7e0db00a0@e7g2000vbj.googlegroups.com>
In reply to: #9527

On Jul 15, 10:28 am, Teemu Likonen <tliko...@iki.fi> wrote:
>
> How about accepting anything but ignoring all non-numbers?

Totally unpythonic. Better to be explicit about what you expect and
crash as loudly as possible when you get anything unexpected.



#9536

From: Teemu Likonen <tlikonen@iki.fi>
Date: 2011-07-15 13:56 +0300
Message-ID: <87ipr34xsx.fsf@mithlond.arda>
In reply to: #9531

* 2011-07-15T03:02:11-07:00 * bruno wrote:

> On Jul 15, 10:28 am, Teemu Likonen <tliko...@iki.fi> wrote:
>> How about accepting anything but ignoring all non-numbers?
>
> Totally unpythonic. Better to be explicit about what you expect and
> crash as loudly as possible when you get anything unexpected.

Sure, but sometimes an API can reasonably "accept anything" if any kind of
trash is expected in the input. That doesn't seem to be the case here, so
you're right.
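
To make the two positions in this exchange concrete, here is a minimal sketch of both policies, assuming modern Python 3 rather than the 2.x of the thread. The names mean_lenient and mean_strict are hypothetical and do not come from any post; the sketch only illustrates how the lenient version can hide a data error that the strict version reports.

```python
from numbers import Real


def mean_lenient(data):
    """Average the real numbers in data, silently skipping everything else."""
    values = [x for x in data if isinstance(x, Real)]
    return sum(values) / len(values)


def mean_strict(data):
    """Average data, raising TypeError on the first value that is not a number."""
    values = list(data)
    for x in values:
        if not isinstance(x, Real):
            raise TypeError(f"expected a real number, got {x!r}")
    return sum(values) / len(values)


readings = [1, 2, "3"]            # "3" is a string, e.g. an unparsed input field

print(mean_lenient(readings))     # 1.5: the bad value is dropped without warning
try:
    mean_strict(readings)
except TypeError as exc:
    print("rejected:", exc)       # the mistake surfaces immediately
```

Neither function handles an empty data set; both would raise ZeroDivisionError there, which a real implementation would probably want to report more clearly.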



#9556

From: Eric Snow <ericsnowcurrently@gmail.com>
Date: 2011-07-15 11:02 -0600
Message-ID: <mailman.1070.1310749330.1164.python-list@python.org>
In reply to: #9505

On Thu, Jul 14, 2011 at 11:28 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> Hello folks,
>
> I'm designing an API for some lightweight calculator-like statistics
> functions, such as mean, standard deviation, etc., and I want to support
> missing values. Missing values should be just ignored. E.g.:
>
> mean([1, 2, MISSING, 3]) => 6/3 = 2 rather than 6/4 or raising an error.
>
> My question is, should I accept None as the missing value, or a dedicated
> singleton?
>
> In favour of None: it's already there, no extra code required. People may
> expect it to work.
>
> Against None: it's too easy to add None to a data set by mistake,
> because functions return None by default.

Good point.

>
> In favour of a dedicated MISSING singleton: it's obvious from context. It's
> not a lot of work to implement compared to using None. It's hard to include
> it by mistake. If None does creep into the data by accident, you
> get a nice explicit exception.

Also good points.

>
> Against MISSING: users may expect to be able to choose their own sentinel by
> assigning to MISSING. I don't want to support that.
>
>
> I've considered what other packages do:-
>
> R uses a special value, NA, to stand in for missing values. This is more or
> less the model I wish to follow.
>
> I believe that MATLAB treats float NANs as missing values. I consider this
> an abuse of NANs and I won't be supporting that :-P

I was just thinking of this.  :)

>
> Spreadsheets such as Excel, OpenOffice and Gnumeric generally ignore blank
> cells, and give you a choice between ignoring text and treating it as zero.
> E.g. with cells set to [1, 2, "spam", 3] the AVERAGE function returns 2 and
> the AVERAGEA function returns 1.5.
>
> numpy uses masked arrays, which is probably over-kill for my purposes; I am
> gratified to see it doesn't abuse NANs:
>
> >>> import numpy as np
> >>> a = np.array([1, 2, float('nan'), 3])
> >>> np.mean(a)
> nan
>
> numpy also treats None as an error:
>
> >>> a = np.array([1, 2, None, 3])
> >>> np.mean(a)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python2.5/site-packages/numpy/core/fromnumeric.py", line 860, in mean
>     return mean(axis, dtype, out)
> TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
>
>
> I would appreciate any comments, advice or suggestions.
>

Too bad there isn't a good way to "freeze" a name, i.e. indicate that
any attempt to rebind it is an exception.  Trying to rebind None is a
SyntaxError, but a NameError or something would be fine.  Then the
downside of using your own sentinel here goes away.
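
For what it's worth, something close to a "frozen" name can be approximated for attribute access on a module. The following is only a sketch under stated assumptions: it uses Python 3, the module name "stats" and the classes _MissingType and _FrozenModule are hypothetical, and it guards only against `stats.MISSING = ...`. It does nothing about `from stats import MISSING` followed by a rebinding of the local name, and it does not block `del`.

```python
import types


class _MissingType:
    """Type of the MISSING sentinel; only one instance is meant to exist."""
    __slots__ = ()

    def __repr__(self):
        return "MISSING"


class _FrozenModule(types.ModuleType):
    """Module type that refuses to rebind the names listed in _frozen."""
    _frozen = frozenset({"MISSING"})

    def __setattr__(self, name, value):
        # Allow the first binding, reject any later attempt to rebind.
        if name in self._frozen and name in self.__dict__:
            raise AttributeError(f"cannot rebind {name!r}")
        super().__setattr__(name, value)


# Hypothetical module object built by hand so the sketch is self-contained.
# Inside a real module (Python 3.5+) one could instead assign
# sys.modules[__name__].__class__ = _FrozenModule at the end of the file.
stats = _FrozenModule("stats")
stats.MISSING = _MissingType()    # first binding is allowed

try:
    stats.MISSING = None          # a user trying to pick their own sentinel
except AttributeError as exc:
    error = str(exc)
    print(error)
```

This raises AttributeError rather than the SyntaxError that None gets, which fits the "a NameError or something would be fine" spirit of the paragraph above.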

In reality, using Missing may be your best bet anyway.  If there were
a convention for indicating a name should not be re-bound (like a
single leading underscore indicates "private"), you could use that
(all caps?).  Since "we're all consenting adults" it would probably be
good enough to make sure others know that Missing should not be
re-bound...

I might have said to use NotImplemented instead of None, but it can be
re-bound and the name isn't as helpful for your use case.

Another solution, perhaps ugly or confusing, is to use something like
two underscores as the name for your sentinel:

mean([1, 2, __, 3])

Still it seems like using Missing (or whatever) would be better than None.

-eric


