Groups > comp.lang.python > #84454 > unrolled thread

__bases__ misleading error message

Started by	Mario Figueiredo <marfig@gmail.com>
First post	2015-01-24 10:16 +0000
Last post	2015-01-24 22:51 +0100
Articles	20 on this page of 28 — 7 participants


  __bases__ misleading error message Mario Figueiredo <marfig@gmail.com> - 2015-01-24 10:16 +0000
    Re: __bases__ misleading error message Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-01-24 23:43 +1100
      Re: __bases__ misleading error message Mario Figueiredo <marfig@gmail.com> - 2015-01-24 22:14 +0100
        Re: __bases__ misleading error message Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-24 14:45 -0700
          Re: __bases__ misleading error message Mario Figueiredo <marfig@gmail.com> - 2015-01-24 23:09 +0100
            Re: __bases__ misleading error message Chris Angelico <rosuav@gmail.com> - 2015-01-25 09:25 +1100
              Re: __bases__ misleading error message Mario Figueiredo <marfig@gmail.com> - 2015-01-24 23:33 +0100
                Re: __bases__ misleading error message Chris Angelico <rosuav@gmail.com> - 2015-01-25 09:37 +1100
                  Re: __bases__ misleading error message Mario Figueiredo <marfig@gmail.com> - 2015-01-24 23:59 +0100
        Re: __bases__ misleading error message Terry Reedy <tjreedy@udel.edu> - 2015-01-24 16:58 -0500
          Re: __bases__ misleading error message Mario Figueiredo <marfig@gmail.com> - 2015-01-24 23:02 +0100
            Re: __bases__ misleading error message Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-24 15:16 -0700
              Re: __bases__ misleading error message Mario Figueiredo <marfig@gmail.com> - 2015-01-24 23:36 +0100
        Re: __bases__ misleading error message Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-01-25 14:18 +1100
          Re: __bases__ misleading error message Mario Figueiredo <marfig@gmail.com> - 2015-01-25 12:07 +0100
            Re: __bases__ misleading error message Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-01-25 23:00 +1100
              Re: __bases__ misleading error message Mario Figueiredo <marfig@gmail.com> - 2015-01-25 13:49 +0100
                Re: __bases__ misleading error message Marko Rauhamaa <marko@pacujo.net> - 2015-01-25 14:53 +0200
              Re: __bases__ misleading error message Terry Reedy <tjreedy@udel.edu> - 2015-01-25 16:35 -0500
              Re: __bases__ misleading error message Ian Kelly <ian.g.kelly@gmail.com> - 2015-01-25 19:21 -0700
      Re: __bases__ misleading error message Marco Buttu <marco.buttu@gmail.com> - 2015-01-24 23:09 +0100
      Re: __bases__ misleading error message Marco Buttu <marco.buttu@gmail.com> - 2015-01-24 15:12 +0100
    Re: __bases__ misleading error message Terry Reedy <tjreedy@udel.edu> - 2015-01-24 14:24 -0500
      Re: __bases__ misleading error message Mario Figueiredo <marfig@gmail.com> - 2015-01-24 22:03 +0100
      Re: __bases__ misleading error message Marco Buttu <marco.buttu@gmail.com> - 2015-01-24 22:51 +0100
        Re: __bases__ misleading error message Terry Reedy <tjreedy@udel.edu> - 2015-01-24 19:55 -0500
          Re: __bases__ misleading error message Marco Buttu <marco.buttu@gmail.com> - 2015-01-25 11:30 +0100
      Re: __bases__ misleading error message Marco Buttu <marco.buttu@gmail.com> - 2015-01-24 22:51 +0100


#84454 — __bases__ misleading error message

From	Mario Figueiredo <marfig@gmail.com>
Date	2015-01-24 10:16 +0000
Subject	__bases__ misleading error message
Message-ID	<1a194e0a0b738d205de54180fa7@nntp.aioe.org>

Consider the following code at your REPL of choice

        class Super:
            pass

        class Sub:
            pass

        foo = Sub()

        Sub.__bases__
        foo.__bases__

The last statement raises the following error:

        AttributeError: 'Sub' object has no attribute '__bases__'

Naturally the 'Sub' object has an attribute __bases__. It's the instance
that does not. So shouldn't the error read like:

        AttributeError: 'Sub' instance has no attribute '__bases__', or
        AttributeError: 'foo' object has no attribute '__bases__'


#84461

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2015-01-24 23:43 +1100
Message-ID	<54c39366$0$13006$c3e8da3$5496439d@news.astraweb.com>
In reply to	#84454

Mario Figueiredo wrote:

> 
> Consider the following code at your REPL of choice
> 
>         class Super:
>             pass

Super is irrelevant here, since it isn't used.

>         class Sub:
>             pass
> 
>         foo = Sub()
> 
>         Sub.__bases__
>         foo.__bases__
> 
> The last statement raises the following error:
> 
>         AttributeError: 'Sub' object has no attribute '__bases__'

It's a bit ambiguous, but the way to read it is to think of object as a
synonym for instance. This is, in my opinion, a Java-ism which is
inappropriate for Python where classes are objects too, but we seem to be
stuck with it.

So we have a Sub instance (object) which has no attribute '__bases__'. This
is no different from:

py> (23).spam
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'spam'

> Naturally the 'Sub' object has an attribute __bases__. 

Correct, in the sense that classes are objects too. But in the sense of
object=instance, no. Isn't ambiguous terminology wonderful?

> It's the instance
> that does not. So shouldn't the error read like:
> 
>         AttributeError: 'Sub' instance has no attribute '__bases__', or
>         AttributeError: 'foo' object has no attribute '__bases__'

The first would be nice. The second is impossible: objects may have no name,
one name, or many names, and they do not know what names they are bound to.
So the Sub instance bound to the name 'foo' doesn't know that its name
is 'foo', so it cannot display it in the error message.
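
A minimal sketch of the point above, assuming Python 3: the message can only name the object's type, because the type is the one label every object reliably carries. The names `foo` and `alias` are purely illustrative.

```python
class Sub:
    pass

foo = Sub()
alias = foo   # a second name bound to the same instance

# The class object has __bases__; on Python 3 a class declared without
# an explicit base inherits from object.
print(Sub.__bases__)

try:
    foo.__bases__
except AttributeError as exc:
    message = str(exc)

# The message mentions the type name 'Sub', never 'foo' or 'alias'.
print(message)
```

The instance is reachable through two names here, so there would be no single "correct" variable name to report even if the object could see its bindings.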

-- 
Steven


#84493

From	Mario Figueiredo <marfig@gmail.com>
Date	2015-01-24 22:14 +0100
Message-ID	<MPG.2f2e298c14dbed69989692@nntp.aioe.org>
In reply to	#84461

In article <54c39366$0$13006$c3e8da3$5496439d@news.astraweb.com>, 
steve+comp.lang.python@pearwood.info says...
> >         AttributeError: 'Sub' instance has no attribute '__bases__', 
> >         AttributeError: 'foo' object has no attribute '__bases__'
> 
> The first would be nice. The second is impossible: objects may have no name,
> one name, or many names, and they do not know what names they are bound to.
> So the Sub instance bound to the name 'foo' doesn't know that its name
> is 'foo', so it cannot display it in the error message.

Thanks for the information! :)

But that begs the OT question: How does Python map names to memory
addresses in the interpreter?

    "__main__"
    from module import a_name
    y = a_name + 1

How does the Python interpreter know how to map 'name' to the correct memory
location, if this __main__ code is only run after 'module' code?


#84496

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2015-01-24 14:45 -0700
Message-ID	<mailman.18099.1422135976.18130.python-list@python.org>
In reply to	#84493

On Sat, Jan 24, 2015 at 2:14 PM, Mario Figueiredo <marfig@gmail.com> wrote:
> In article <54c39366$0$13006$c3e8da3$5496439d@news.astraweb.com>,
> steve+comp.lang.python@pearwood.info says...
>> >         AttributeError: 'Sub' instance has no attribute '__bases__',
>> >         AttributeError: 'foo' object has no attribute '__bases__'
>>
>> The first would be nice. The second is impossible: objects may have no name,
>> one name, or many names, and they do not know what names they are bound to.
>> So the Sub instance bound to the name 'foo' doesn't know that its name
>> is 'foo', so it cannot display it in the error message.
>
> Thanks for the information! :)
>
> But that begs the OT question:

No, it doesn't. http://en.wikipedia.org/wiki/Begging_the_question

> How does Python map names to memory
> addresses in the interpreter?

Global variables are looked up in the current stack frame's globals dict.

>>> a = 1
>>> b = 2
>>> globals()['a']
1
>>> globals()['b']
2

Local variables of functions could be handled the same way, but for
efficiency the compiler instead maps the names to indices of a local
variable array associated with the stack frame. Either way, at the C
level the value stored in the dict or array is a pointer to the memory
location of the object.
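
The split between dict lookups and indexed local slots can be glimpsed from a function's code object. This is a CPython-oriented sketch: `co_varnames` and `co_names` are documented code-object attributes, while the function and variable names are made up for illustration.

```python
GLOBAL_VALUE = 10

def uses_local(a):
    b = a + 1        # 'a' and 'b' are locals, kept in indexed slots
    return b

def uses_global():
    return GLOBAL_VALUE + 1   # looked up by name at run time

# Local variable names, in slot order.
print(uses_local.__code__.co_varnames)

# Names that are resolved through a namespace lookup instead.
print(uses_global.__code__.co_names)
```

`uses_global` has no local slots at all; its only name is resolved in the globals dict each time the function runs.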

>     "__main__"
>     from module import a_name
>     y = a_name + 1
>
> How does the Python interpreter know how to map 'name' to the correct memory
> location, if this __main__ code is only run after 'module' code?

I'm not sure I'm understanding what you're asking, but the import
statement imports the module, looks up "a_name" in that module's
globals dict, and binds the same object to a_name in the current
module's globals dict.
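
A small sketch of that binding step, assuming a throwaway module built in memory so no extra file is needed. `demo_module` is a hypothetical name; registering it in `sys.modules` is only a trick to keep the example self-contained.

```python
import sys
import types

# Build a module object by hand and register it so 'import' can find it.
demo_module = types.ModuleType("demo_module")
demo_module.a_name = [1, 2, 3]
sys.modules["demo_module"] = demo_module

from demo_module import a_name

# The imported name refers to the very same object that sits in the
# module's namespace dict; nothing was copied.
print(a_name is vars(demo_module)["a_name"])

y = len(a_name) + 1
print(y)
```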


#84502

From	Mario Figueiredo <marfig@gmail.com>
Date	2015-01-24 23:09 +0100
Message-ID	<MPG.2f2e3651befa4c9989694@nntp.aioe.org>
In reply to	#84496

In article <mailman.18099.1422135976.18130.python-list@python.org>, 
ian.g.kelly@gmail.com says...
> 
> On Sat, Jan 24, 2015 at 2:14 PM, Mario Figueiredo <marfig@gmail.com> wrote:
> > But that begs the OT question:
> 
> No, it doesn't. http://en.wikipedia.org/wiki/Begging_the_question

Cute.

> I'm not sure I'm understanding what you're asking, but the import
> statement imports the module, looks up "a_name" in that module's
> globals dict, and binds the same object to a_name in the current
> module's globals dict.

Meaning the interpreter knows a variable's name. Which would allow it to 
produce an error message such as:

    AttributeError: 'foo' object has no attribute '__bases__'

For the following code:

    class Sub:
        pass
 
    foo = Sub()
    foo.__bases__


#84505

From	Chris Angelico <rosuav@gmail.com>
Date	2015-01-25 09:25 +1100
Message-ID	<mailman.18104.1422138354.18130.python-list@python.org>
In reply to	#84502

On Sun, Jan 25, 2015 at 9:09 AM, Mario Figueiredo <marfig@gmail.com> wrote:
> Meaning the interpreter knows a variable's name. Which would allow it to
> produce an error message such as:
>
>     AttributeError: 'foo' object has no attribute '__bases__'
>
> For the following code:
>
>     class Sub:
>         pass
>
>     foo = Sub()
>     foo.__bases__

Let me explain by way of analogy. You have ten shoeboxes to store your
stuff in. I hand you a thing and say "Here, put this into shoebox #4".
Then someone else comes along and says, "I need the thing from shoebox
#4", so you give him that thing. Now, he hands that thing to someone
else and asks him which shoebox it came out of, just by looking at the
thing itself. How can he say? The thing doesn't have any way of
knowing what shoebox it came out of.

Python names reference objects. But once you get an object, there's no
way of knowing which name was used to get to it. There might be one
such name; there might be more than one; and there might not be any.
You can't identify an object by the name it's bound to, but you can
identify it by something that's always true of the object itself, like
its type.

There are a few cases where names are so useful that they get attached
to the objects themselves. The 'def' and 'class' statements create
objects and also record the names used. But you still can't identify
what name was used to reference something:

>>> def func(x): print("x = %s"%x)
...
>>> func(123)
x = 123
>>> show = func
>>> show(234)
x = 234
>>> func = "not a function"
>>> func(123)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object is not callable
>>> show(123)
x = 123
>>> show
<function func at 0x7f36744a30d0>

No matter what name I use to reference the function, it's still called
"func". Nobody can ever know that I'm identifying it by the name
'show' now.
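
The same session can be condensed into a script. This is only a sketch of the behaviour shown above; `__name__` is the attribute in which `def` records the name.

```python
def func(x):
    return "x = %s" % x

show = func               # a second name for the same function object
func = "not a function"   # rebind the original name to something else

print(show(123))          # the function still works through 'show'
print(show.__name__)      # the name recorded when 'def' ran
```

The recorded name is fixed at definition time; it says nothing about which name was used to reach the function later.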

ChrisA


#84507

From	Mario Figueiredo <marfig@gmail.com>
Date	2015-01-24 23:33 +0100
Message-ID	<MPG.2f2e3c017bcaae94989696@nntp.aioe.org>
In reply to	#84505

In article <mailman.18104.1422138354.18130.python-list@python.org>, 
rosuav@gmail.com says...
> 
> Let me explain by way of analogy. 
[snipped]

Gotcha! Thanks for the explanation :)


#84509

From	Chris Angelico <rosuav@gmail.com>
Date	2015-01-25 09:37 +1100
Message-ID	<mailman.18105.1422139063.18130.python-list@python.org>
In reply to	#84507

On Sun, Jan 25, 2015 at 9:33 AM, Mario Figueiredo <marfig@gmail.com> wrote:
> In article <mailman.18104.1422138354.18130.python-list@python.org>,
> rosuav@gmail.com says...
>>
>> Let me explain by way of analogy.
> [snipped]
>
> Gotcha! Thanks for the explanation :)

Awesome! I'm always a bit wary of analogies... sometimes they're
really helpful, other times they're unhelpful and confusing. Glad this
was one of the better cases.

ChrisA


#84511

From	Mario Figueiredo <marfig@gmail.com>
Date	2015-01-24 23:59 +0100
Message-ID	<MPG.2f2e42298df5a9c2989698@nntp.aioe.org>
In reply to	#84509

In article <mailman.18105.1422139063.18130.python-list@python.org>, 
rosuav@gmail.com says...
> Awesome! I'm always a bit wary of analogies... sometimes they're
> really helpful, other times they're unhelpful and confusing.

Yeah. Yours was all it took :)

The thing with analogies is to never take them literally. They are 
analogies, after all. But there is this old funny thing we humans seem 
to share that an analogy should be dissected like it was a scientific 
paper.

- You say shoes in a box? Why, but memory addresses aren't boxes. 
Besides a box can only take shoes this big. Memory addresses can take 
any size object.

- No I meant.. Look, just imagine shoes in a box.

- Alright...

- Now the other person will be handed the shoe you asked. They don't 
know what box it came from. What this mea...

- How come?

- How come what?

- Why don't they know? They could just agree to know what box the shoe 
came from. Problem solved.

- No, but I am trying to illustrate how it works. Not how it could work.

- I still don't get it. Why does it work like that. Seems stupid...

- It's not. There are specific reasons to not know. It's got to do with 
the process stack and efficiency and...

- Right.

And there's also the most annoying of all, the smartasses that like to 
stay hidden in the shadows and as soon as they see an analogy they jump 
in and tada!

"It's not true that memory spaces can hold any object size. It is 
limited by the computer available memory" -- well, duh!

"Is that a float you are using to compute a salary raise in your code 
snippet meant as an example to illustrate code syntax? Hahaha" -- Sigh!


#84500

From	Terry Reedy <tjreedy@udel.edu>
Date	2015-01-24 16:58 -0500
Message-ID	<mailman.18102.1422136703.18130.python-list@python.org>
In reply to	#84493

On 1/24/2015 4:14 PM, Mario Figueiredo wrote:
> In article <54c39366$0$13006$c3e8da3$5496439d@news.astraweb.com>,
> steve+comp.lang.python@pearwood.info says...
>>>          AttributeError: 'Sub' instance has no attribute '__bases__',
>>>          AttributeError: 'foo' object has no attribute '__bases__'
>>
>> The first would be nice. The second is impossible: objects may have no name,
>> one name, or many names, and they do not know what names they are bound to.
>> So the Sub instance bound to the name 'foo' doesn't know that its name
>> is 'foo', so it cannot display it in the error message.
>
> Thanks for the information! :)
>
> But that begs the OT question: How does Python map names to memory
> addresses in the interpreter?

Python the language maps names to objects that have identity, type, and 
value.  The CPython implementation does the mapping with a hash table 
and C pointers (to computer memory addresses), but addresses are not 
part of the language definition.  Neuroscientists still puzzle over how 
we do such mapping.

>      "__main__"
>      from module import a_name

A module is a namespace associating names with objects.  This statement
says to import the a_name to object association from module and add it
to __main__.

>      y = a_name + 1

This statement uses the imported association in __main__ to access the 
object and add 1, and bind 'y' to the resulting object.

-- 
Terry Jan Reedy


#84501

From	Mario Figueiredo <marfig@gmail.com>
Date	2015-01-24 23:02 +0100
Message-ID	<MPG.2f2e34c1b650760e989693@nntp.aioe.org>
In reply to	#84500

In article <mailman.18102.1422136703.18130.python-list@python.org>, 
tjreedy@udel.edu says...
> 
> >      "__main__"
> >      from module import a_name
> 
> A module is a namespace associating names with objects.  This statement
> says to import the a_name to object association from module and add it 
> to __main__
> 
> >      y = a_name + 1
> 
> This statement uses the imported association in __main__ to access the 
> object and add 1, and bind 'y' to the resulting object.


But I'm being told the interpreter has no knowledge of a variable name. 
So, how does the interpreter know, once it reaches the assignment line
above, how to map a_name to the correct object in memory?


#84504

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2015-01-24 15:16 -0700
Message-ID	<mailman.18103.1422137849.18130.python-list@python.org>
In reply to	#84501

On Sat, Jan 24, 2015 at 3:02 PM, Mario Figueiredo <marfig@gmail.com> wrote:
> In article <mailman.18102.1422136703.18130.python-list@python.org>,
> tjreedy@udel.edu says...
>>
>> >      "__main__"
>> >      from module import a_name
>>
>> A module is a namespace associating names with objects.  This statement
>> says to import the a_name to object association from module and add it
>> to __main__
>>
>> >      y = a_name + 1
>>
>> This statement uses the imported association in __main__ to access the
>> object and add 1, and bind 'y' to the resulting object.
>
>
> But I'm being told the interpreter has no knowledge of a variable name.
> So, how does the interpreter know, once it reaches the assignment line
> above, how to map a_name to the correct object in memory?

No, you're being told that the *object* doesn't know the names of the
variables that it's bound to. In the context above, the variable is
right there under that name in the globals dict, as can be seen in the
disassembly:

>>> import dis
>>> dis.dis("y = a_name + 1")
  1           0 LOAD_NAME                0 (a_name)
              3 LOAD_CONST               0 (1)
              6 BINARY_ADD
              7 STORE_NAME               1 (y)
             10 LOAD_CONST               1 (None)
             13 RETURN_VALUE

Now what happens in the byte code if we try to access an attribute on
that object?

>>> dis.dis("a_name.__bases__")
  1           0 LOAD_NAME                0 (a_name)
              3 LOAD_ATTR                1 (__bases__)
              6 RETURN_VALUE

1) The value of a_name is looked up and pushed onto the stack.

2) The interpreter attempts to load the attribute __bases__ of
whatever object is on the top of the stack. There is no name
associated with that object at this point; it's just an object.

Now imagine if the Python code in question were instead this:

    def get_an_object(): return "foo"
    get_an_object().__bases__

Would you really expect the interpreter to come up with a message like
"Return value of get_an_object() has no attribute '__bases__'"?
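
The hypothetical case can be run directly. In this sketch the object never has a variable name at all, so the type is the only thing left for the message to mention.

```python
def get_an_object():
    return "foo"

try:
    # The attribute lookup is applied to an unnamed return value.
    get_an_object().__bases__
except AttributeError as exc:
    message = str(exc)

print(message)
```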


#84508

From	Mario Figueiredo <marfig@gmail.com>
Date	2015-01-24 23:36 +0100
Message-ID	<MPG.2f2e3cae35dc9068989697@nntp.aioe.org>
In reply to	#84504

In article <mailman.18103.1422137849.18130.python-list@python.org>, 
ian.g.kelly@gmail.com says...
> 
> No, you're being told that the *object* doesn't know the names of the
> variables that it's bound to. In the context above, the variable is
> right there under that name in the globals dict, as can be seen in the
> disassembly:
[snipped]

Yes. I got it now. I misinterpreted Steven's words.


#84548

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2015-01-25 14:18 +1100
Message-ID	<54c4606a$0$13002$c3e8da3$5496439d@news.astraweb.com>
In reply to	#84493

Mario Figueiredo wrote:

> But that begs the OT question: How does Python map names to memory
> addresses in the interpreter?

It doesn't.

You are thinking of an execution model like C or Pascal, where variables are
best thought of as fixed memory addresses. But Python, like many modern
languages (Java, Ruby, Lua, ...) uses a name-binding model.

Semantically, the fixed memory address model means that each variable is
like a fixed-width bucket, where the size depends on the type. That's why
the compiler needs to associate a fixed type with each variable, so it
knows how much space to allocate and how to initialise the bytes:

m = [0000]
n = [0000]
x = [8FFFFFFF]

Assigning a value to a variable ("m = 42", hex 2A) results in the compiler
storing that value in the bucket; assignment makes a copy: "n = m" leads to
two independent buckets with the same value:

m = [002A]
n = [002A]

Values and variables are dependent on each other. You can't have a variable
with no value, and you can't have a value with no variable. (This is an
over-simplification, but mostly true.)

The name-binding model is different. Values (objects) and names are
independent. Values can exist even if they have no name (although the
garbage collector will delete them as soon as they are unused). The
compiler associates a name to a value. One good mental model is to think of
the compiler attaching a piece of string from the name to its associated
object. Assigning n = m means that both names end up tied to the same
object, and there can be objects with no associated name. (So long as
*something* refers to them, the garbage collector will leave them be.)

m -----------------+----------- 0x2A
n ----------------/
x ----------------------------- 1.2345
s -----\                        "Hello world"
        +---------------------- "Goodbye now"
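
The string-between-name-and-object picture can be tried out with a mutable object, where sharing is easy to observe. A minimal sketch; the variable names follow the diagram.

```python
m = [1, 2, 3]
n = m                 # a second piece of string tied to the same list
shared = n is m       # True: one object, two names

n.append(4)           # change the object through one name...
print(m)              # ...and the change is visible through the other

del m                 # cuts one string; the object itself survives
print(n)

print(len("Hello world"))   # an object used without ever being named
```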

Under the hood, this is usually implemented using pointers. If you are
familiar with pointer semantics, you might think of these pieces of string
as pointers, except that you cannot do pointer arithmetic on them. But that
is merely the *implementation* of the language's variable model.

In Python, global variables use a dict, and there is a function to retrieve
that dict:

py> d = globals()
py> d['x'] = 23  # don't do this!
py> x
23

It's not *wrong* or forbidden to write to globals this way. It's just
unnecessary.

>     "__main__"
>     from module import a_name
>     y = a_name + 1
> 
> How does the Python interpreter know how to map 'name' to the correct memory
> location, if this __main__ code is only run after 'module' code?

When the statement `from module import a_name` executes, Python:

(1) imports module;
(2) looks up "a_name" in module's namespace (a dict);
(3) creates an entry "a_name" in the current namespace (assuming 
    one doesn't already exist);
(4) and binds it to the object found in Step 2.

When it executes `y = a_name + 1`, Python:

(1) looks up the name "a_name" in the current namespace;
(2) creates the anonymous object 1, unbound to any name;
(3) calls + with those two objects;
(4) which (assuming it succeeds) creates a new object;
(5) creates an entry "y" in the current namespace;
(6) and binds it to the object returned in Step 4.
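
The two sequences of steps can be mimicked with plain dicts standing in for the namespaces. This is only a model of the semantics, not what the interpreter literally executes; `module_namespace` and `current_namespace` are hypothetical names.

```python
import operator

module_namespace = {"a_name": 41}   # stands in for the imported module's dict
current_namespace = {}              # stands in for the current module's dict

# from module import a_name
found = module_namespace["a_name"]      # look the name up in the module
current_namespace["a_name"] = found     # bind the same object locally

# y = a_name + 1
left = current_namespace["a_name"]      # look up a_name
result = operator.add(left, 1)          # call + with the two objects
current_namespace["y"] = result         # bind y to the new object

print(current_namespace)
```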

-- 
Steven


#84563

From	Mario Figueiredo <marfig@gmail.com>
Date	2015-01-25 12:07 +0100
Message-ID	<MPG.2f2eecca37bc450998969a@nntp.aioe.org>
In reply to	#84548

In article <54c4606a$0$13002$c3e8da3$5496439d@news.astraweb.com>, 
steve+comp.lang.python@pearwood.info says...
> 
> It doesn't.

Your explanation was great, Steven. Thank you. But it raises one question...

> 
> Assigning a value to a variable ("m = 42", hex 2A) results in the compiler
> storing that value in the bucket; assignment makes a copy: "n = m" leads to
> two independent buckets with the same value:
> 
> m = [002A]
> n = [002A]

I'm still in the process of learning Python. So, that's probably why 
this is unexpected to me.

I was under the impression that what python did was keep a lookup table 
pointing to memory. Every variable gets an entry as type descriptor and 
a pointer to a memory address, where the variable data resides.

(UDTs may be special in that they would have more than one entry, one 
for each enclosing def and declared attribute)

In the example above, the n and m buckets would hold pointers, not 
binary values. And because they are immutable objects, n and m pointers 
would be different. Not equal. But in the case of mutable objects, n = m 
would result in m having the same pointer address as n.


#84564

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2015-01-25 23:00 +1100
Message-ID	<54c4dae1$0$13005$c3e8da3$5496439d@news.astraweb.com>
In reply to	#84563

Mario Figueiredo wrote:

> In article <54c4606a$0$13002$c3e8da3$5496439d@news.astraweb.com>,
> steve+comp.lang.python@pearwood.info says...
>> 
>> It doesn't.
> 
> Your explanation was great, Steven. Thank you. But it raises one question...
> 
>> 
>> Assigning a value to a variable ("m = 42", hex 2A) results in the
>> compiler storing that value in the bucket; assignment makes a copy: "n =
>> m" leads to two independent buckets with the same value:
>> 
>> m = [002A]
>> n = [002A]

Maybe I wasn't clear enough. The above is used by languages like C or
Pascal, which use fixed memory locations for variables. If I gave the
impression this was Python, I am sorry.

> I'm still in the process of learning Python. So, that's probably why
> this is unexpected to me.
> 
> I was under the impression that what python did was keep a lookup table
> pointing to memory. Every variable gets an entry as type descriptor and
> a pointer to a memory address, where the variable data resides.

This sounds more or less correct, at least for CPython. CPython is
the "reference implementation", and probably the version you use when you
run Python. But it is not the only one, and they can be different.

(E.g. in Jython, the Python interpreter is built using Java, not C. You
can't work with pointers to memory addresses in Java, and the Java garbage
collector is free to move objects around when needed.)

In CPython, objects live in the heap, and Python tracks them using a
pointer. So when you bind a name to a value:

    x = 23  # what you type

what happens is that Python sets a key + value in the global namespace (a
dictionary):

    globals()['x'] = 23  # what Python runs

and the globals() dict will then look something like this:

    {'x': 23, 'colour': 'red', 'y': 42}

(Note: *local* variables are similar but not quite the same. They're also
more complicated, so let's skip them for now.)

What happens inside the dictionary? Dictionaries are "hash tables", so they
are basically a big array of cells, and each cell is a pair of pointers,
one for the key and one for the value:

    [dictionary header]
    [blank] 
    [blank] 
    [ptr to the string 'y', ptr to the int 42]
    [blank] 
    [ptr to 'x', ptr to 23]
    [blank]
    [blank]
    [blank]
    [ptr to 'colour', ptr to 'red']
    [blank]
    ...

Notice that the order is unpredictable. Also, don't take this picture too
literally. Dicts are highly optimized, highly tuned and in active
development, so the *actual* design of Python dicts may vary. But this is a
reasonable simplified view of how they could be designed.
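
To make the picture concrete, here is a deliberately naive table of cells with linear probing. It is a teaching sketch under simplifying assumptions (fixed size, no resizing, no deletion) and not how CPython implements dicts; `ToyTable` is a made-up name.

```python
class ToyTable:
    """A fixed-size array of cells; each cell is None or a (key, value) pair."""

    def __init__(self, size=8):
        self.cells = [None] * size   # None plays the role of [blank]

    def _probe(self, key):
        # Start at the slot chosen by the hash, then step forward until
        # we find either the key or a blank cell.
        start = hash(key) % len(self.cells)
        for step in range(len(self.cells)):
            index = (start + step) % len(self.cells)
            cell = self.cells[index]
            if cell is None or cell[0] == key:
                return index
        raise RuntimeError("table is full")

    def __setitem__(self, key, value):
        self.cells[self._probe(key)] = (key, value)

    def __getitem__(self, key):
        cell = self.cells[self._probe(key)]
        if cell is None:
            raise KeyError(key)
        return cell[1]


table = ToyTable()
table["x"] = 23
table["colour"] = "red"
table["y"] = 42

# Which cells end up occupied depends on the hashes, not on insertion order.
print(table.cells)
print(table["y"])
```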

The important thing to remember is that while CPython uses pointers under
the hood to make the interpreter work, pointers are not part of the Python
language. There is no way in Python to get a pointer to an object, or
increment a pointer, or dereference a pointer. You just use objects, and
the interpreter handles all the pointer stuff behind the scenes.

> (UDTs may be special in that they would have more than one entry, one
> for each enclosing def and declared attribute)
> 
> In the example above, the n and m buckets would hold pointers, not
> binary values. And because they are immutable objects, n and m pointers
> would be different. Not equal. But in the case of mutable objects, n = m
> would result in m having the same pointer address as n.

No, this is certainly not the case! Python uses *exactly* the same rules for
mutable and immutable objects. In fact, Python can't tell whether a value is
mutable or immutable until it tries to modify it.

Remember I said that name-binding languages operate using a model of pieces
of string between the name and the object? Here are two names bound to the
same object:

m -----------+--------------- 0x2a
n ----------/ 

Obviously Python doesn't *literally* use a piece of string :-) so what
happens under the hood? Pointers again, at least in CPython.

In this case, if we look deep inside our globals dictionary again, we will
see two cells:

     [ptr to the string "m", ptr to the int 0x2a]
     [ptr to the string "n", ptr to the int 0x2a]

The two int pointers point to the same object. This is guaranteed by the
language:

m = 42
n = m
assert id(m) == id(n)

Both names refer to the same object, with the same ID, at the same memory
location. Assignment in Python NEVER makes a copy of the value being
assigned.
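
A short sketch contrasting assignment with an explicit copy, for both an immutable and a mutable object. The variable names are illustrative; `copy.copy` is the standard-library shallow copy.

```python
import copy

m = 42
n = m
same_int = n is m          # assignment shares the object, immutable or not

values = [1, 2, 3]
alias = values             # no copy is made
duplicate = copy.copy(values)   # a copy has to be requested explicitly

alias.append(4)
print(values)              # the append is visible through both names
print(duplicate)           # the explicit copy is unaffected

n = n + 1                  # rebinds n to a new object; m is untouched
print(m, n)
```

The only difference between the two cases is that an int offers no operation that changes it in place, so sharing can never be observed through mutation.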

-- 
Steven


#84566

From	Mario Figueiredo <marfig@gmail.com>
Date	2015-01-25 13:49 +0100
Message-ID	<MPG.2f2f0499338c0a6598969b@nntp.aioe.org>
In reply to	#84564

In article <54c4dae1$0$13005$c3e8da3$5496439d@news.astraweb.com>, 
steve+comp.lang.python@pearwood.info says...
> [...]

Most excellent. Thanks for the hard work, explaining this to me. :)

Knowing Python internals is something that will end up benefiting me in the
long run. There's much to be gained by knowing the inner workings of your
programming language...

Python is missing an under-the-hood book, I suppose. Tracing through 
Python source code to learn this stuff isn't easy unless we know what we 
are looking for.


#84567

From	Marko Rauhamaa <marko@pacujo.net>
Date	2015-01-25 14:53 +0200
Message-ID	<87lhkrc97t.fsf@elektro.pacujo.net>
In reply to	#84566

Mario Figueiredo <marfig@gmail.com>:

> Knowing Python internals is something that will end up benefiting me in
> the long run. There's much to be gained by knowing the inner workings
> of your programming language...
>
> Python is missing an under-the-hood book, I suppose. Tracing through
> Python source code to learn this stuff isn't easy unless we know what
> we are looking for.

One must only be careful to distinguish implementation choices from the
abstract definitions. The Python Language Reference is the official
standard:

   <URL: https://docs.python.org/3/reference/index.html>


Marko


#84581

From	Terry Reedy <tjreedy@udel.edu>
Date	2015-01-25 16:35 -0500
Message-ID	<mailman.18135.1422221769.18130.python-list@python.org>
In reply to	#84564

On 1/25/2015 7:00 AM, Steven D'Aprano wrote:

> What happens inside the dictionary? Dictionaries are "hash tables", so they
> are basically a big array of cells, and each cell is a pair of pointers,
> one for the key and one for the value:
>
>      [dictionary header]
>      [blank]
>      [blank]
>      [ptr to the string 'y', ptr to the int 42]

At the moment, for CPython, each entry has 3 items, with the first being 
the cached hash of the key.  Hash comparison is first used to test 
whether keys are equal.

   [hash('y'), ptr('y'), ptr(42)]

>      [blank]
>      [ptr to 'x', ptr to 23]
>      [blank]
>      [blank]
>      [blank]
>      [ptr to 'colour', ptr to 'red']
>      [blank]

As you say, these are implementation details.  CPython dicts for the 
instances of at least some classes have a different, specialized 
structure, with two arrays.

In the above, [blank] entries, which are about 1/2 to 2/3 of the 
entries, take the same space as real entries (12 to 24 bytes).  Raymond 
H. has proposed that the standard dict have two arrays like so:

1. the first array is a sparse array of indexes into the second array: 
[b, b, 2, b, 0, b, b, b, 1, b] (where b might be -1 interpreted as 
<blank>), using only as many bytes as needed for the maximum index.

2. the second array is a compact array of entries in insertion order, 
such as

     [hash, ptr to 'x', ptr to 23]
     [hash, ptr to 'colour', ptr to 'red']
     [hash, ptr to the string 'y', ptr to the int 42]

Iteration would use the compact array, making all dicts OrderedDicts. 
PyPy has already switched to this.  It seems that on modern processors
with multilevel on-chip caches, the space reduction leads to cache-miss 
reductions that compensate for the indirection cost.
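
The two-array layout described above can be sketched in a few lines. This is a toy model under simplifying assumptions (fixed size, no resizing, no deletion), not the actual implementation; `CompactDict` is a made-up name. CPython 3.6 later adopted a layout along these lines, and insertion order has been a language guarantee since Python 3.7.

```python
class CompactDict:
    """Sparse array of indexes plus a dense, insertion-ordered entry array."""

    def __init__(self, size=8):
        self.indices = [-1] * size   # sparse; -1 means <blank>
        self.entries = []            # dense; each item is (hash, key, value)

    def _slot(self, key, key_hash):
        # Linear probing over the sparse index array.
        start = key_hash % len(self.indices)
        for step in range(len(self.indices)):
            slot = (start + step) % len(self.indices)
            entry_index = self.indices[slot]
            if entry_index == -1:
                return slot
            stored_hash, stored_key, _ = self.entries[entry_index]
            # Compare the cached hashes first, then the keys themselves.
            if stored_hash == key_hash and stored_key == key:
                return slot
        raise RuntimeError("table is full")

    def __setitem__(self, key, value):
        key_hash = hash(key)
        slot = self._slot(key, key_hash)
        if self.indices[slot] == -1:
            self.indices[slot] = len(self.entries)
            self.entries.append((key_hash, key, value))
        else:
            self.entries[self.indices[slot]] = (key_hash, key, value)

    def __getitem__(self, key):
        slot = self._slot(key, hash(key))
        if self.indices[slot] == -1:
            raise KeyError(key)
        return self.entries[self.indices[slot]][2]

    def __iter__(self):
        # Iteration walks the dense array, so it follows insertion order.
        return (key for _, key, _ in self.entries)


d = CompactDict()
d["x"] = 23
d["colour"] = "red"
d["y"] = 42

print(list(d))       # keys in insertion order
print(d.indices)     # mostly blanks, but each blank is only one small integer
```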

-- 
Terry Jan Reedy


#84587

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2015-01-25 19:21 -0700
Message-ID	<mailman.18139.1422238882.18130.python-list@python.org>
In reply to	#84564


On Jan 25, 2015 2:37 PM, "Terry Reedy" <tjreedy@udel.edu> wrote:
> 2. the second array is a compact array of entries in insertion order,
> such as
>
>     [hash, ptr to 'x', ptr to 23]
>     [hash, ptr to 'colour', ptr to 'red']
>     [hash, ptr to the string 'y', ptr to the int 42]
>
> Iteration would use the compact array, making all dicts OrderedDicts.
> PyPy has already switched to this.  It seems that on modern processors with
> multilevel on-chip caches, the space reduction leads to cache-miss
> reductions that compensate for the indirection cost.

Deletion becomes O(n) though. Has there been any investigation into how
commonly deletion of keys is done?
