

Groups > comp.lang.python > #73038 > unrolled thread

None in string => TypeError?

Started by: Roy Smith <roy@panix.com>
First post: 2014-06-09 08:34 -0700
Last post: 2014-06-10 04:02 +1000
Articles: 15, 8 participants



Contents

  None in string => TypeError? Roy Smith <roy@panix.com> - 2014-06-09 08:34 -0700
    Re: None in string => TypeError? Ryan Hiebert <ryan@ryanhiebert.com> - 2014-06-09 10:42 -0500
    Re: None in string => TypeError? Ian Kelly <ian.g.kelly@gmail.com> - 2014-06-09 09:50 -0600
    Re: None in string => TypeError? Paul Sokolovsky <pmiscml@gmail.com> - 2014-06-09 18:57 +0300
      Re: None in string => TypeError? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-09 16:14 +0000
        Re: None in string => TypeError? Chris Angelico <rosuav@gmail.com> - 2014-06-10 02:31 +1000
    Re: None in string => TypeError? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-09 16:07 +0000
    Re: None in string => TypeError? MRAB <python@mrabarnett.plus.com> - 2014-06-09 17:06 +0100
    Re: None in string => TypeError? Shiyao Ma <i@introo.me> - 2014-06-10 00:13 +0800
    Re: None in string => TypeError? Roy Smith <roy@panix.com> - 2014-06-09 12:53 -0400
    Re: None in string => TypeError? Chris Angelico <rosuav@gmail.com> - 2014-06-10 02:59 +1000
    Re: None in string => TypeError? Ian Kelly <ian.g.kelly@gmail.com> - 2014-06-09 11:22 -0600
    Re: None in string => TypeError? Chris Angelico <rosuav@gmail.com> - 2014-06-10 03:40 +1000
    Re: None in string => TypeError? Ian Kelly <ian.g.kelly@gmail.com> - 2014-06-09 11:58 -0600
    Re: None in string => TypeError? Chris Angelico <rosuav@gmail.com> - 2014-06-10 04:02 +1000

#73038 — None in string => TypeError?

From: Roy Smith <roy@panix.com>
Date: 2014-06-09 08:34 -0700
Subject: None in string => TypeError?
Message-ID: <048960da-c132-407f-b1b3-4612a3dd7697@googlegroups.com>

We noticed recently that:

>>> None in 'foo'

raises (at least in Python 2.7)

TypeError: 'in <string>' requires string as left operand, not NoneType

This is surprising.  The description of the 'in' operator is, 'True if an item of s is equal to x, else False'.  From that, I would assume it behaves as if it were written:

for item in iterable:
    if item == x:
        return True
else:
    return False

Why the extra type check for str.__contains__()?  That seems very unpythonic.  Duck typing, and all that.
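
A minimal sketch of the gap between the documented wording and what str actually does, assuming Python 3. The helper names `naive_contains` and `in_raises_type_error` are made up for illustration; the first is just the loop above wrapped in a function so that `return` is legal.

```python
def naive_contains(iterable, x):
    """Element-by-element membership test, as the docs' wording suggests."""
    for item in iterable:
        if item == x:
            return True
    return False


def in_raises_type_error(x, container):
    """Report whether evaluating `x in container` raises TypeError."""
    try:
        x in container
    except TypeError:
        return True
    return False


# The loop never finds None among the characters, so it quietly says False.
print(naive_contains('foo', None))        # False
# The real operator refuses the comparison outright.
print(in_raises_type_error(None, 'foo'))  # True
```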



#73040

From: Ryan Hiebert <ryan@ryanhiebert.com>
Date: 2014-06-09 10:42 -0500
Message-ID: <mailman.10920.1402328536.18130.python-list@python.org>
In reply to: #73038


On Mon, Jun 9, 2014 at 10:34 AM, Roy Smith <roy@panix.com> wrote:

> We noticed recently that:
>
> >>> None in 'foo'
>
> raises (at least in Python 2.7)
>
> TypeError: 'in <string>' requires string as left operand, not NoneType
>
> This is surprising.

It's the same in 3.4, and I agree that it's surprising, at least to me.
I don't know the story or implementation behind it, so I'll leave that
to others.



#73042

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2014-06-09 09:50 -0600
Message-ID: <mailman.10921.1402329049.18130.python-list@python.org>
In reply to: #73038

On Mon, Jun 9, 2014 at 9:34 AM, Roy Smith <roy@panix.com> wrote:
> We noticed recently that:
>
> >>> None in 'foo'
>
> raises (at least in Python 2.7)
>
> TypeError: 'in <string>' requires string as left operand, not NoneType
>
> This is surprising.  The description of the 'in' operator is, 'True if an item of s is equal to x, else False'.  From that, I would assume it behaves as if it were written:
>
> for item in iterable:
>     if item == x:
>         return True
> else:
>     return False
>
> why the extra type check for str.__contains__()?  That seems very unpythonic.  Duck typing, and all that.

I guess for the same reason that you get a TypeError if you test
whether the number 4 is in a string: it can't ever be, so it's a
nonsensical comparison.  It could return False, but the comparison is
more likely to be symptomatic of a bug in the code than intentional,
so it makes some noise instead.



#73043

From: Paul Sokolovsky <pmiscml@gmail.com>
Date: 2014-06-09 18:57 +0300
Message-ID: <mailman.10922.1402329457.18130.python-list@python.org>
In reply to: #73038

Hello,

On Mon, 9 Jun 2014 08:34:42 -0700 (PDT)
Roy Smith <roy@panix.com> wrote:

> We noticed recently that:
> 
> >>> None in 'foo'
> 
> raises (at least in Python 2.7)
> 
> TypeError: 'in <string>' requires string as left operand, not NoneType
> 
> This is surprising.  The description of the 'in' operator is, 'True
> if an item of s is equal to x, else False'.  From that, I
> would assume it behaves as if it were written:
> 
> for item in iterable:
>     if item == x:
>         return True
> else:
>     return False
> 
> why the extra type check for str.__contains__()?  That seems very
> unpythonic.  Duck typing, and all that.

This is very Pythonic: Python is a strictly typed language. There's no
way None could possibly be "inside" a string, so if you're trying to
look for it there, you're doing something wrong, and you're told so.

Also, it's not an "extra check"; it's fewer checks. The "in" operator
checks the types of its arguments for sanity once at the start, and
then just looks for a substring within the string. You suggest that it
should check each element's type in a loop, which is a great waste since,
once again, nothing but a string can be inside another string.


-- 
Best regards,
 Paul                          mailto:pmiscml@gmail.com



#73047

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-06-09 16:14 +0000
Message-ID: <5395dd57$0$29988$c3e8da3$5496439d@news.astraweb.com>
In reply to: #73043

On Mon, 09 Jun 2014 18:57:28 +0300, Paul Sokolovsky wrote:

> Hello,
> 
> On Mon, 9 Jun 2014 08:34:42 -0700 (PDT) Roy Smith <roy@panix.com> wrote:
> 
>> We noticed recently that:
>> 
>> >>> None in 'foo'
>> 
>> raises (at least in Python 2.7)
>> 
>> TypeError: 'in <string>' requires string as left operand, not NoneType
>> 
>> This is surprising.  The description of the 'in' operator is, 'True
>> if an item of s is equal to x, else False'.  From that, I would assume
>> it behaves as if it were written:
>> 
>> for item in iterable:
>>     if item == x:
>>         return True
>> else:
>>     return False
>> 
>> why the extra type check for str.__contains__()?  That seems very
>> unpythonic.  Duck typing, and all that.
> 
> This is very Pythonic: Python is a strictly typed language. There's no way
> None could possibly be "inside" a string, 

Then `None in some_string` could immediately return False, instead of 
raising an exception.


> so if you're trying to look
> for it there, you're doing something wrong, and told so.

This, I think, is the important factor. `x in somestring` is almost 
always an error if x is not a string. If you want to accept None as well:

x is not None and x in somestring 

does the job nicely.
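
As a rough sketch of that guard, assuming Python 3 (the function name `contains_or_false` is invented here, not something from the standard library): None short-circuits to False, while any other non-string still raises, so genuine type mistakes stay visible.

```python
def contains_or_false(x, somestring):
    """Treat None as 'not present'; defer to the normal substring test otherwise."""
    return x is not None and x in somestring


print(contains_or_false(None, 'CSRP'))  # False, no exception
print(contains_or_false('C', 'CSRP'))   # True
print(contains_or_false('X', 'CSRP'))   # False

try:
    contains_or_false(4, 'CSRP')
except TypeError as exc:
    # Non-string, non-None operands are still reported as errors.
    print('TypeError:', exc)
```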


-- 
Steven D'Aprano
http://import-that.dreamwidth.org/



#73049

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-06-10 02:31 +1000
Message-ID: <mailman.10926.1402331523.18130.python-list@python.org>
In reply to: #73047

On Tue, Jun 10, 2014 at 2:14 AM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
>> This is very Pythonic: Python is a strictly typed language. There's no way
>> None could possibly be "inside" a string,
>
> Then `None in some_string` could immediately return False, instead of
> raising an exception.

Note, by the way, that CPython does have some optimizations that
immediately return False. If you ask whether a 16-bit string is in an
8-bit string, e.g. "\u1234" in "asdf", it knows instantly that it cannot
possibly be, and it just returns False. The "None in string" check is
different, and deliberately so.

I do prefer the thrown error. Some things make absolutely no sense,
and even if it's technically valid to say "No, the integer 61 is not
in the string 'asdf'", the error is likely to be more helpful to someone
who thinks that characters and integers are equivalent. They'll get an
exception immediately, instead of trying to figure out why it's
returning False.

ChrisA



#73044

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-06-09 16:07 +0000
Message-ID: <5395dbba$0$29988$c3e8da3$5496439d@news.astraweb.com>
In reply to: #73038

On Mon, 09 Jun 2014 08:34:42 -0700, Roy Smith wrote:

> We noticed recently that:
> 
> >>> None in 'foo'
> 
> raises (at least in Python 2.7)

That goes back to at least Python 1.5, when member tests only accepted a 
single character, not a substring:


>>> None in "abc"
Traceback (innermost last):
  File "<stdin>", line 1, in ?
TypeError: string member test needs char left operand


It's a matter of taste whether predicate functions should always return a 
bool, or sometimes raise an exception. Would you be surprised that this 
raises TypeError?

"my string".startswith(None)


A predicate function could swallow any exception, e.g. be the logical 
equivalent of:

try:
    return True if the condition holds, else return False
except:
    return False  # or True as needed


but that is, I think, an anti-pattern, as it tends to hide errors rather 
than be useful. Most of the time, doing `[] in "xyz"` is an error, so 
returning False is not a useful thing to do.

I think that Python has been moving away from the "swallow exceptions" 
model in favour of letting errors propagate. E.g. hasattr used to swallow 
a lot more exceptions than it does now, and order comparisons (less than, 
greater than etc.) of dissimilar types used to return a version-dependent 
arbitrary but consistent result (e.g. all ints compared less than all 
strings), but in Python 3 that is now an error.
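
The cases mentioned above can be checked with a short sketch, assuming Python 3. Both helper names are invented for illustration; `swallowing_in` is a deliberately bad example of the "swallow exceptions" model, not a recommendation.

```python
def raises_type_error(func):
    """Call func() and report whether it raised TypeError."""
    try:
        func()
    except TypeError:
        return True
    return False


def swallowing_in(x, s):
    """The anti-pattern: hide every failure behind a False."""
    try:
        return x in s
    except Exception:
        return False


print(raises_type_error(lambda: "my string".startswith(None)))  # True
print(raises_type_error(lambda: [] in "xyz"))                   # True
print(raises_type_error(lambda: 1 < "a"))                       # True on Python 3
print(swallowing_in([], "xyz"))                                 # False, and the bug stays hidden
```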



-- 
Steven D'Aprano
http://import-that.dreamwidth.org/



#73045

From: MRAB <python@mrabarnett.plus.com>
Date: 2014-06-09 17:06 +0100
Message-ID: <mailman.10923.1402330151.18130.python-list@python.org>
In reply to: #73038

On 2014-06-09 16:34, Roy Smith wrote:
> We noticed recently that:
>
> >>> None in 'foo'
>
> raises (at least in Python 2.7)
>
> TypeError: 'in <string>' requires string as left operand, not NoneType
>
> This is surprising.  The description of the 'in' operator is, 'True if an item of s is equal to x, else False'.  From that, I would assume it behaves as if it were written:
>
> for item in iterable:
>      if item == x:
>          return True
> else:
>      return False
>
> why the extra type check for str.__contains__()?  That seems very unpythonic.  Duck typing, and all that.
>
When working with strings, it's not entirely the same. For example:

 >>> 'oo' in 'foo'
True

If you iterated over the string, it would return False.
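
To make the difference concrete, here is a small sketch (the name `itemwise_contains` is hypothetical): iterating over a string only ever yields single characters, so a two-character needle can never equal any of them.

```python
def itemwise_contains(iterable, x):
    """Membership by comparing x against each item in turn."""
    return any(item == x for item in iterable)


print(list('foo'))                     # ['f', 'o', 'o']
print('oo' in 'foo')                   # True: substring search
print(itemwise_contains('foo', 'oo'))  # False: no single item equals 'oo'
```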



#73046

From: Shiyao Ma <i@introo.me>
Date: 2014-06-10 00:13 +0800
Message-ID: <mailman.10924.1402330438.18130.python-list@python.org>
In reply to: #73038


2014-06-09 23:34 GMT+08:00 Roy Smith <roy@panix.com>:

> We noticed recently that:
>
> >>> None in 'foo'
>
> raises (at least in Python 2.7)
>
> TypeError: 'in <string>' requires string as left operand, not NoneType
>
> This is surprising.  The description of the 'in' operator is, 'True if
> an item of s is equal to x, else False'.  From that, I would assume it
> behaves as if it were written:
>
> for item in iterable:
>     if item == x:
>         return True
> else:
>     return False
>
> why the extra type check for str.__contains__()?  That seems very
> unpythonic.  Duck typing, and all that.
>

It's a little bit inconsistent, but it's clearly documented here:
https://docs.python.org/3/reference/expressions.html#in

By the documentation's own logic, a string is not a *container* type.
It's just a sequence of characters, so it makes sense to restrict the
type of x in `x in "str"` to str. Containers like list and tuple, on the
other hand, are heterogeneous by default in Python, so an item-by-item
comparison is needed.
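
A quick sketch of that contrast, assuming Python 3; the sample values are arbitrary. Heterogeneous containers answer membership questions for any operand, while str rejects a non-string one.

```python
mixed = ['f', 0, None, ('a', 'b')]   # arbitrary heterogeneous list

print(None in mixed)             # True: compared item by item
print(4 in mixed)                # False
print(None in ('f', 'o', 'o'))   # False, no exception
print(None in {'f', 'o'})        # False, no exception

try:
    None in 'foo'
except TypeError as exc:
    print('TypeError:', exc)     # str insists on a str left operand
```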



-- 

I am a cat. My homepage is http://introo.me.



#73050

From: Roy Smith <roy@panix.com>
Date: 2014-06-09 12:53 -0400
Message-ID: <mailman.10927.1402332848.18130.python-list@python.org>
In reply to: #73038


On Jun 9, 2014, at 11:57 AM, Paul Sokolovsky wrote:

> This is very Pythonic: Python is a strictly typed language. There's no
> way None could possibly be "inside" a string, so if you're trying to
> look for it there, you're doing something wrong, and told so.

Well, the code we've got is:

           hourly_data = [(t if status in 'CSRP' else None) for (t, status) in hours]

where status can be None.  I don't think I'm doing anything wrong.  I wrote exactly what I mean :-)  We've changed it to:

          hourly_data = [(t if (status and status in 'CSRP') else None) for (t, status) in hours]

but that's pretty ugly.  In retrospect, I suspect:

          hourly_data = [(t if status in set('CSRP') else None) for (t, status) in hours]

is a little cleaner.
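
The three variants can be compared side by side in a short sketch. The contents of `hours` are made-up sample data and the function names are invented for illustration; only the comprehension bodies come from the code above.

```python
# Made-up sample data: (hour, status) pairs, where status may be None.
hours = [(0, 'C'), (1, None), (2, 'X'), (3, 'P')]


def unguarded(hours):
    return [(t if status in 'CSRP' else None) for (t, status) in hours]


def with_guard(hours):
    return [(t if (status and status in 'CSRP') else None) for (t, status) in hours]


def with_set(hours):
    return [(t if status in set('CSRP') else None) for (t, status) in hours]


print(with_guard(hours))  # [0, None, None, 3]
print(with_set(hours))    # [0, None, None, 3]

try:
    unguarded(hours)
except TypeError as exc:
    print('TypeError:', exc)  # raised as soon as status is None
```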


---
Roy Smith
roy@panix.com



#73052

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-06-10 02:59 +1000
Message-ID: <mailman.10929.1402333173.18130.python-list@python.org>
In reply to: #73038

On Tue, Jun 10, 2014 at 2:53 AM, Roy Smith <roy@panix.com> wrote:
> In retrospect, I suspect:
>
>           hourly_data = [(t if status in set('CSRP') else None) for (t, status) in hours]
>
> is a little cleaner.

I'd go with this. It's clearer that a status of 'SR' should result in
False, not True. (Presumably that can never happen, but it's easier to
read.) I'd be inclined to use set literal syntax, even though it's a
bit longer - again to make it clear that these are four separate
strings that you're checking against.

Alternatively, you could go "if (status or '0') in 'CSRP'", which would
work, but be quite cryptic. (It would also mean that '' is not deemed
to be in the string, same as with the set() transformation.)
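
A sketch of the edge cases discussed here, assuming Python 3. The helper name `status_ok` is invented; it writes the `or` trick with explicit parentheses, because `in` binds more tightly than `or`.

```python
def status_ok(status):
    """The cryptic alternative, with the grouping made explicit."""
    return (status or '0') in 'CSRP'


print('SR' in 'CSRP')       # True: substring match
print('SR' in set('CSRP'))  # False: not one of the four one-character strings
print('' in 'CSRP')         # True: the empty string is a substring of every string
print('' in set('CSRP'))    # False
print(status_ok(None), status_ok(''), status_ok('C'))  # False False True

# Without the parentheses this parses as `'X' or ('0' in 'CSRP')`,
# so any non-empty status is truthy regardless of its value.
print(bool('X' or '0' in 'CSRP'))  # True
```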

ChrisA



#73053

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2014-06-09 11:22 -0600
Message-ID: <mailman.10930.1402334970.18130.python-list@python.org>
In reply to: #73038

On Mon, Jun 9, 2014 at 10:59 AM, Chris Angelico <rosuav@gmail.com> wrote:
> On Tue, Jun 10, 2014 at 2:53 AM, Roy Smith <roy@panix.com> wrote:
>> In retrospect, I suspect:
>>
>>           hourly_data = [(t if status in set('CSRP') else None) for (t, status) in hours]
>>
>> is a little cleaner.
>
> I'd go with this. It's clearer that a status of 'SR' should result in
> False, not True. (Presumably that can never happen, but it's easier to
> read.) I'd be inclined to use set literal syntax, even though it's a
> bit longer - again to make it clear that these are four separate
> strings that you're checking against.

Depending on how much work this has to do, I might also consider
moving the set construction outside the list comprehension since it
doesn't need to be repeated on every iteration.
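
One way that hoisting might look, as a sketch only: `VALID_STATUSES` and `hourly_data_for` are made-up names, and a frozenset is used here simply because the set never changes.

```python
# Built once at import time instead of once per element.
VALID_STATUSES = frozenset('CSRP')


def hourly_data_for(hours):
    """Keep t where the status is one of C, S, R, P; otherwise None."""
    return [(t if status in VALID_STATUSES else None) for (t, status) in hours]


print(hourly_data_for([(0, 'C'), (1, None), (2, 'SR'), (3, 'P')]))  # [0, None, None, 3]
```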



#73054

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-06-10 03:40 +1000
Message-ID: <mailman.10931.1402335635.18130.python-list@python.org>
In reply to: #73038

On Tue, Jun 10, 2014 at 3:22 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> On Mon, Jun 9, 2014 at 10:59 AM, Chris Angelico <rosuav@gmail.com> wrote:
>> On Tue, Jun 10, 2014 at 2:53 AM, Roy Smith <roy@panix.com> wrote:
>>> In retrospect, I suspect:
>>>
>>>           hourly_data = [(t if status in set('CSRP') else None) for (t, status) in hours]
>>>
>>> is a little cleaner.
>>
>> I'd go with this. It's clearer that a status of 'SR' should result in
>> False, not True. (Presumably that can never happen, but it's easier to
>> read.) I'd be inclined to use set literal syntax, even though it's a
>> bit longer - again to make it clear that these are four separate
>> strings that you're checking against.
>
> Depending on how much work this has to do, I might also consider
> moving the set construction outside the list comprehension since it
> doesn't need to be repeated on every iteration.

Set literal notation will accomplish that, too, for what it's worth.

>>> import dis
>>> def x():
...     hourly_data = [(t if status in {'C','S','R','P'} else None) for (t, status) in hours]
...
>>> dis.dis(x)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0x012BE660, file "<pyshell#10>", line 2>)
              3 LOAD_CONST               2 ('x.<locals>.<listcomp>')
              6 MAKE_FUNCTION            0
              9 LOAD_GLOBAL              0 (hours)
             12 GET_ITER
             13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             16 STORE_FAST               0 (hourly_data)
             19 LOAD_CONST               0 (None)
             22 RETURN_VALUE
>>> dis.dis(x.__code__.co_consts[1])
  2           0 BUILD_LIST               0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                36 (to 45)
              9 UNPACK_SEQUENCE          2
             12 STORE_FAST               1 (t)
             15 STORE_FAST               2 (status)
             18 LOAD_FAST                2 (status)
             21 LOAD_CONST               5 (frozenset({'R', 'S', 'C', 'P'}))
             24 COMPARE_OP               6 (in)
             27 POP_JUMP_IF_FALSE       36
             30 LOAD_FAST                1 (t)
             33 JUMP_FORWARD             3 (to 39)
        >>   36 LOAD_CONST               4 (None)
        >>   39 LIST_APPEND              2
             42 JUMP_ABSOLUTE            6
        >>   45 RETURN_VALUE
>>> isinstance(x.__code__.co_consts[1].co_consts[5],set)
False

Interestingly, the literal appears to be a frozenset rather than a
regular set. The compiler must have figured out that it can never be
changed, and optimized.

Also, this is the first time I've seen None as a constant other than
the first. Usually co_consts[0] is None, but this time co_consts[4] is
None.
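
The exact bytecode and constant indices above are specific to one CPython build, so a sketch that avoids hard-coded indices may be easier to reproduce. It assumes CPython; the helper `frozenset_constants` is invented for illustration, and `x` is rewritten to take `hours` as a parameter so that it can actually be called. It walks the code object and any nested code objects, since newer interpreters may not compile the comprehension to a separate function.

```python
import types


def x(hours=()):
    return [(t if status in {'C', 'S', 'R', 'P'} else None) for (t, status) in hours]


def frozenset_constants(code):
    """Collect frozenset constants from a code object and any nested ones."""
    found = [c for c in code.co_consts if isinstance(c, frozenset)]
    for c in code.co_consts:
        if isinstance(c, types.CodeType):
            found.extend(frozenset_constants(c))
    return found


# If the compiler folded the set literal, it shows up here as a frozenset.
print(frozenset_constants(x.__code__))
print(x([(1, 'C'), (2, None)]))  # [1, None]
```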

ChrisA



#73055

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2014-06-09 11:58 -0600
Message-ID: <mailman.10932.1402336745.18130.python-list@python.org>
In reply to: #73038

On Mon, Jun 9, 2014 at 11:40 AM, Chris Angelico <rosuav@gmail.com> wrote:
> Also, this is the first time I've seen None as a constant other than
> the first. Usually co_consts[0] is None, but this time co_consts[4] is
> None.

Functions always seem to have None as the first constant, but modules
and classes are other examples that don't.

>>> co = compile("class MyClass: pass", '', 'exec')
>>> co.co_consts
(<code object MyClass at 0x7f32aa0a3c00, file "", line 1>, 'MyClass', None)
>>> co.co_consts[0].co_consts
('MyClass', None)



#73056

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-06-10 04:02 +1000
Message-ID: <mailman.10933.1402336975.18130.python-list@python.org>
In reply to: #73038

On Tue, Jun 10, 2014 at 3:58 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> On Mon, Jun 9, 2014 at 11:40 AM, Chris Angelico <rosuav@gmail.com> wrote:
>> Also, this is the first time I've seen None as a constant other than
>> the first. Usually co_consts[0] is None, but this time co_consts[4] is
>> None.
>
> Functions always seem to have None as the first constant, but modules
> and classes are other examples that don't.
>
> >>> co = compile("class MyClass: pass", '', 'exec')
> >>> co.co_consts
> (<code object MyClass at 0x7f32aa0a3c00, file "", line 1>, 'MyClass', None)
> >>> co.co_consts[0].co_consts
> ('MyClass', None)

Huh. Learn something every day!

ChrisA


