

Groups > comp.lang.python > #65137 > unrolled thread

Re: __init__ is the initialiser

Started by: Ned Batchelder <ned@nedbatchelder.com>
First post: 2014-01-31 14:52 -0500
Last post: 2014-02-03 11:02 +1100
Articles 17 — 9 participants


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: __init__ is the initialiser Ned Batchelder <ned@nedbatchelder.com> - 2014-01-31 14:52 -0500
    Re: __init__ is the initialiser Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-02-01 03:42 +0000
      Re: __init__ is the initialiser Chris Angelico <rosuav@gmail.com> - 2014-02-01 15:35 +1100
        Re: __init__ is the initialiser Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-02-01 05:51 +0000
          Re: __init__ is the initialiser Ethan Furman <ethan@stoneleaf.us> - 2014-02-01 00:28 -0800
      Re: __init__ is the initialiser Ethan Furman <ethan@stoneleaf.us> - 2014-01-31 20:55 -0800
      Re: __init__ is the initialiser Ned Batchelder <ned@nedbatchelder.com> - 2014-02-01 07:28 -0500
        Re: __init__ is the initialiser Roy Smith <roy@panix.com> - 2014-02-01 09:40 -0500
          Re: __init__ is the initialiser Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-02-01 15:07 +0000
            Re: __init__ is the initialiser Roy Smith <roy@panix.com> - 2014-02-01 11:17 -0500
      Re: __init__ is the initialiser Tim Delaney <timothy.c.delaney@gmail.com> - 2014-02-02 07:09 +1100
        Re: __init__ is the initialiser Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-02-02 01:28 +0000
      Re: __init__ is the initialiser Ben Finney <ben+python@benfinney.id.au> - 2014-02-02 15:27 +1100
      Re: __init__ is the initialiser Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-02-03 12:38 +1300
        Re: __init__ is the initialiser Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-02-03 00:33 +0000
          Re: __init__ is the initialiser Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-02-04 12:47 +1300
      Re: __init__ is the initialiser Tim Delaney <timothy.c.delaney@gmail.com> - 2014-02-03 11:02 +1100

#65137 — Re: __init__ is the initialiser

From: Ned Batchelder <ned@nedbatchelder.com>
Date: 2014-01-31 14:52 -0500
Subject: Re: __init__ is the initialiser
Message-ID: <mailman.6217.1391197950.18130.python-list@python.org>
On 1/31/14 2:33 PM, Mark Lawrence wrote:
>  From http://docs.python.org/3/reference/datamodel.html#object.__init__
> which states:-
>
> "
> Called when the instance is created. The arguments are those passed to
> the class constructor expression. If a base class has an __init__()
> method, the derived class’s __init__() method, if any, must explicitly
> call it to ensure proper initialization of the base class part of the
> instance; for example: BaseClass.__init__(self, [args...]). As a special
> constraint on constructors, no value may be returned; doing so will
> cause a TypeError to be raised at runtime.
> "
>
> Should the wording of the above be changed to clearly reflect that we
> have an initialiser here and that __new__ is the constructor?
>

I'm torn about that.  The fact is, that for 95% of the reasons you want 
to say "constructor", the thing you're describing is __init__.  Most 
classes have __init__, only very very few have __new__.

The sense that __new__ is the constructor is the one borrowed from C++ 
and Java: you don't have an instance of your type until the constructor 
has returned.  This is why __init__ is not a constructor: the self 
passed into __init__ is already an object of your class.
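
A minimal sketch of that last point, assuming a made-up class named Widget (not from the thread): by the time __init__ is entered, self already passes an isinstance check against the class.

```python
class Widget:
    def __init__(self, name):
        # self was created before __init__ was called, so it is
        # already a Widget here; __init__ only fills it in.
        self.was_instance_on_entry = isinstance(self, Widget)
        self.name = name


w = Widget("spam")
```

Whether that makes __init__ "not a constructor" is exactly what the rest of the thread argues about; the sketch only shows the mechanics.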

But that distinction isn't useful in most programs.  The thing most 
people mean by "constructor" is "the method that gets invoked right at 
the beginning of the object's lifetime, where you can add code to 
initialize it properly."  That describes __init__.

Insisting that __init__ is not a constructor makes about as much sense 
as insisting that "Python has no variables" just because they work 
differently than in C.  Python has variables, and it has constructors. 
We don't have to be tied to C++ semantics of the word "constructor" any 
more than we have to be tied to its semantics of the word "variable" or "for".

Why can't we call __init__ the constructor and __new__ the allocator?

-- 
Ned Batchelder, http://nedbatchelder.com



#65177

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-02-01 03:42 +0000
Message-ID: <52ec6d1f$0$29972$c3e8da3$5496439d@news.astraweb.com>
In reply to: #65137
On Fri, 31 Jan 2014 14:52:15 -0500, Ned Batchelder wrote:

> Why can't we call __init__ the constructor and __new__ the allocator?

__new__ constructs the object, and __init__ initialises it. What's wrong 
with calling them the constructor and initialiser? Is this such a 
difficult concept that the average programmer can't learn it?

I've met people who have difficulty with OOP principles, at least at 
first. But once you understand the idea of objects, it isn't that hard to 
understand the idea that:

- first, the object has to be created, or constructed, or allocated 
  if you will;

- only then can it be initialised.

Thus, two methods. __new__ constructs (creates, allocates) a new object; 
__init__ initialises it after the event.

(In hindsight, it was probably a mistake for Python to define two create-
an-object methods, although I expect it was deemed necessary for 
historical reasons. Most other languages make do with a single method, 
Objective-C being an exception with "alloc" and "init" methods.)



Earlier in this post, you wrote:

> But that distinction [between __new__ and __init__] isn't useful in
> most programs.

Well, I don't know about that. I guess it depends on what sort of objects 
you're creating. If you're creating immutable objects, then the 
distinction is vital. If you're subclassing from immutable built-ins, of 
which there are a few, the distinction may be important. If you're using 
the object-pool design pattern, the distinction is also vital. It's not 
*rare* to care about these things.
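
Two of those cases can be sketched as follows. The class names Celsius and Colour are invented for illustration; this is one possible way to write each pattern, not the only one.

```python
class Celsius(float):
    """Subclass of an immutable built-in: the value must be chosen in __new__."""

    def __new__(cls, degrees):
        # A float's value is fixed when the object is created, so any
        # adjustment has to happen here; __init__ would be too late.
        return super().__new__(cls, round(degrees, 1))


class Colour:
    """Tiny object pool: at most one shared instance per name."""

    _pool = {}

    def __new__(cls, name):
        try:
            return cls._pool[name]          # reuse an existing instance
        except KeyError:
            obj = super().__new__(cls)      # create and remember a new one
            obj.name = name
            cls._pool[name] = obj
            return obj


t = Celsius(21.46)
a = Colour("red")
b = Colour("red")
```

Here a and b are the same object, which cannot be arranged from __init__ because __init__ only ever receives an instance that already exists.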


> The thing most people mean by "constructor" is "the method that gets
> invoked right at the beginning of the object's lifetime, where you can
> add code to initialize it properly."  That describes __init__.

"Most people". I presume you've done a statistically valid survey then 
*wink*

It *better* describes __new__, because it is *not true* that __init__ 
gets invoked "right at the beginning of the object's lifetime". Before 
__init__ is invoked, the object's lifetime has already begun, inside the 
call to __new__. Excluding metaclass shenanigans, the object lifetime 
goes:


Prior to the object existing:
- static method __new__ called on the class[1]
- __new__ creates the object[2]  <=== start of object lifetime

Within the object's lifetime:
- the rest of the __new__ method runs, which may perform arbitrarily
  complex manipulations of the object;
- __new__ exits, returning the object
- __init__ runs
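
The sequence above can be traced with a small sketch. The class Traced and the events list are hypothetical names used only to record the order of the steps.

```python
events = []


class Traced:
    def __new__(cls, *args):
        events.append("__new__ entered, no instance yet")
        self = super().__new__(cls)       # object.__new__ creates the instance
        events.append("instance exists")
        self.stage = "set in __new__"     # the rest of __new__ may work on it
        return self

    def __init__(self, *args):
        events.append("__init__ runs")
        # Anything __new__ stored on the instance is visible here.
        self.stage_seen_by_init = self.stage


obj = Traced(1, 2)
```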


So __init__ does not occur *right at the beginning*, and it is completely 
legitimate to write your classes using only __new__. You must use __new__ 
for immutable objects, and you may use __new__ for mutable ones. __init__ 
may be used by convention, but it is entirely redundant.

I do not buy the argument made by some people that Python ought to follow 
whatever (possibly inaccurate or misleading) terminology other languages 
use. Java and Ruby have the exact same argument passing conventions as 
Python, but one calls it "call by value" and the other "call by 
reference", and neither is the same meaning of "call by value/reference" 
as used by Pascal, C, Visual Basic, or other languages. So which 
terminology should Python use? Both C++ and Haskell have "functors", but 
they are completely different things. What Python calls a class method, 
Java calls a static method. We could go on for days, just listing 
differences in terminology.

In Python circles, using "constructor" for __new__ and "initialiser" for 
__init__ are well-established. In the context of Python, they make good 
sense: __new__ creates ("constructs") the object, and __init__ 
_init_ialises it. Missing the opportunity to link the method name 
__init__ to *initialise* would be a mistake.

We can decry the fact that computer science has not standardised on a 
sensible set of names for concepts, but on the other hand since the 
semantics of languages differ slightly, it would be more confusing to try 
to force all languages to use the same words for slightly different 
concepts.

The reality is, if you're coming to Python from another language, you're 
going to have to learn a whole lot of new stuff anyway, so having to 
learn a few language-specific terms is just a small incremental cost. And 
if you have no idea about other languages, then it is no harder to learn 
that __new__ / __init__ are the constructor/initialiser than it would be 
to learn that they are the allocator/constructor or preformulator/
postformulator.

I care about using the right terminology that will cause the least amount 
of cognitive dissonance to users' understanding of Python, not whether 
they have to learn new terminology, and in the context of Python's object 
model, "constructor" and "initialiser" best describe what __new__ and 
__init__ do.



[1] Yes, despite being declared with a "cls" parameter, __new__ is 
actually hard-coded as a static method.

[2] By explicitly or implicitly calling object.__new__.
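
Both footnotes can be checked with a short sketch, assuming CPython 3 and a throwaway class named Plain: the function stored in the class namespace is wrapped as a staticmethod even though no decorator was written, so an explicit call has to pass the class by hand.

```python
class Plain:
    def __new__(cls):
        # Footnote [2]: the object is actually created by object.__new__.
        return super().__new__(cls)


# Footnote [1]: look at what is stored in the class namespace.
raw = Plain.__dict__["__new__"]
is_static = isinstance(raw, staticmethod)

# Because it is static, the class is passed explicitly when calling it directly.
inst = Plain.__new__(Plain)
```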

-- 
Steven



#65181

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-02-01 15:35 +1100
Message-ID: <mailman.6255.1391229320.18130.python-list@python.org>
In reply to: #65177
On Sat, Feb 1, 2014 at 2:42 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> I've met people who have difficulty with OOP principles, at least at
> first. But once you understand the idea of objects, it isn't that hard to
> understand the idea that:
>
> - first, the object has to be created, or constructed, or allocated
>   if you will;
>
> - only then can it be initialised.
>
> Thus, two methods. __new__ constructs (creates, allocates) a new object;
> __init__ initialises it after the event.

Yes, but if you think in terms of abstractions, they're both just
steps in the conceptual process of "creating the object". If I ask GTK
to create me a Button, I don't care how many steps it has to go
through of allocating memory, allocating other resources, etc, etc,
etc. All I want is to be able to write:

foobar = Button("Foo Bar")

and, at the end of it, to have a Button that I can toss onto a window.
That's the job of a constructor - to give me an object in a state that
I can depend on. (And this is completely independent of language.
Modulo trivialities like semicolons, adorned names, etc, etc, I would
expect that this be valid in any object oriented GUI toolkit in any
object oriented language.)

The difference between __new__ and __init__ is important when you
write either method, but not when you use the class. It's like writing
other dunder methods. You care about the distinction between __add__
and __radd__ when you write the methods, but in all other code, all
that matters is that it does what you want:

three = Three()
four = three + 1
four == 1 + three
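
The snippet above leaves Three undefined. One possible definition that makes it runnable (an assumption, since the post does not give one) is:

```python
class Three:
    def __add__(self, other):
        # Handles three + 1: self is the left operand.
        return 3 + other

    def __radd__(self, other):
        # Handles 1 + three: int.__add__ returns NotImplemented for a
        # Three, so Python falls back to the right operand's __radd__.
        return other + 3


three = Three()
four = three + 1
same = (four == 1 + three)
```

The caller never sees which of the two methods ran, which is the point being made about __new__ and __init__.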

The two methods could have been done as a single method,
__construct__, in which you get passed a cls instead of a self, and
you call self=super().__construct__() and then initialize stuff. It
would then be obvious that this is "the constructor". So maybe it's
best to talk about the two methods collectively as "the constructor",
and then let people call the two parts whatever they will.

I do like the idea of calling __init__ the initializer. The downside
of calling __new__ the constructor is that it'll encourage C++ and
Java programmers to override it and get themselves confused, when
really they should have been writing __init__ and having an easy time
of it. So, current best suggestion is "allocator" for that? Not
entirely sure that's right, but it's not terrible. I'm +1 on __init__
being documented as the initializer, and +0 on "allocator" for
__new__.

ChrisA



#65199

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-02-01 05:51 +0000
Message-ID: <52ec8b51$0$29972$c3e8da3$5496439d@news.astraweb.com>
In reply to: #65181
On Sat, 01 Feb 2014 15:35:17 +1100, Chris Angelico wrote:

> On Sat, Feb 1, 2014 at 2:42 PM, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:
>> I've met people who have difficulty with OOP principles, at least at
>> first. But once you understand the idea of objects, it isn't that hard
>> to understand the idea that:
>>
>> - first, the object has to be created, or constructed, or allocated
>>   if you will;
>>
>> - only then can it be initialised.
>>
>> Thus, two methods. __new__ constructs (creates, allocates) a new
>> object; __init__ initialises it after the event.
> 
> Yes, but if you think in terms of abstractions, they're both just steps
> in the conceptual process of "creating the object". If I ask GTK to
> create me a Button, I don't care how many steps it has to go through of
> allocating memory, allocating other resources, etc, etc, etc.

You deleted the part of my post where I suggested that it's only a 
historical accident that Python has two methods for constructing an 
object when most OOP languages get by with one.


> All I want is to be able to write:
> 
> foobar = Button("Foo Bar")
> 
> and, at the end of it, to have a Button that I can toss onto a window.
> That's the job of a constructor - to give me an object in a state that I
> can depend on.

You're assuming your conclusion. Why is it the job of the constructor, 
rather than the initialiser? When I buy a house, it is fully constructed, 
but it's not yet usable -- there's no bed, no fridge, no furniture of any 
sort, just a bare shell with a roof and some built-in cupboards and 
perhaps an oven. I have to initialise the house myself with whatever 
furniture I need.

Even if it's a "fully furnished house", I still have to initialise it 
before I can say it's truly usable, even if that is merely emptying my 
suitcase into the built-in robes.

The curse of the discontinuous mind: we look for hard dividing lines 
between black and white when what we really have is shades of grey. It's 
easy to tell when an int is constructed and ready to use, and sure enough 
it uses __new__ and has a do-nothing __init__. But with mutable objects, 
say a list, when is it constructed and ready to use? Suppose you want to 
create a list of items and extract the third smallest item. When is the 
list constructed and ready to use?

- Is it when the memory is allocated for the object? Obviously not, since 
we can't do anything with it yet.

- How about when the object fields are set up and made consistent? (Array 
is blanked, length set to the correct value, pointer to the class set, 
etc.) This makes a good candidate, since this is the earliest that the 
object is in a consistent state.

- Is it when the list items are placed into the array? This is also a 
good candidate, since this is the earliest that the list has the items we 
expect. Assuming we expect any -- since many lists are created as simply 
[], then populated later, this isn't exactly black and white either.

- Or when it is sorted? Probably not here, although this is the earliest 
that the user can *actually* use the list for what they wanted it for, 
namely to extract the third smallest value.
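
The int and list cases can be poked at directly. This is a sketch of CPython behaviour, calling the two methods by hand only to separate the stages; normal code would just write list([30, 10, 20]).

```python
# int: the value is fixed by __new__, and int defines no __init__ of its own
# (it inherits the do-nothing one from object).
int_defines_init = "__init__" in vars(int)

# list: __new__ yields a consistent but empty list ...
items = list.__new__(list)
empty_after_new = (items == [])

# ... __init__ places the items into it ...
items.__init__([30, 10, 20])

# ... and only after sorting is it usable for the stated purpose.
items.sort()
third_smallest = items[2]
```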

When does a pile of computer parts become a computer? When the CPU is 
plugged in? When the mouse is attached? Somewhere in between? We have 
difficulty drawing dividing lines because our minds are discontinuous but 
reality is continuous.

[...]
> The difference between __new__ and __init__ is important when you write
> either method, but not when you use the class. It's like writing other
> dunder methods. You care about the distinction between __add__ and
> __radd__ when you write the methods, but in all other code, all that
> matters is that it does what you want:

Yes. And your point is? Speaking as an end user, "call the constructor" 
to refer to:

instance = MyClass(arg)

is exactly right, because the constructor is called regardless of whether 
__new__ is called alone, or __new__ and __init__.
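
A sketch of the "__new__ alone" case, using a hypothetical class Maybe: when __new__ returns something that is not an instance of the class, the call machinery skips __init__ entirely, yet the caller still writes the same expression.

```python
calls = []


class Maybe:
    def __new__(cls, value):
        if value is None:
            return None                 # not a Maybe, so __init__ is skipped
        return super().__new__(cls)

    def __init__(self, value):
        calls.append(value)             # records every time __init__ runs
        self.value = value


nothing = Maybe(None)    # only __new__ runs
something = Maybe(1)     # __new__, then __init__
```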


[...]
> The two methods could have been done as a single method, __construct__,
> in which you get passed a cls instead of a self, and you call
> self=super().__construct__() and then initialize stuff. 

That would be called __new__ in Python. There's no *need* to use __init__ 
for anything (except old-style classic classes in Python 2).
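
As a sketch of that claim, here is a mutable class that does all of its setup in __new__ and defines no __init__. Point is an invented example, and most Python code would conventionally use __init__ for this instead.

```python
class Point:
    """Mutable class whose setup lives entirely in __new__."""

    def __new__(cls, x, y):
        self = super().__new__(cls)   # create the instance
        self.x = x                    # then initialise it in the same method
        self.y = y
        return self


p = Point(1, 2)
p.x = 10   # still an ordinary mutable object
```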


-- 
Steven



#65205

From: Ethan Furman <ethan@stoneleaf.us>
Date: 2014-02-01 00:28 -0800
Message-ID: <mailman.6271.1391244676.18130.python-list@python.org>
In reply to: #65199
On 01/31/2014 09:51 PM, Steven D'Aprano wrote:
> On Sat, 01 Feb 2014 15:35:17 +1100, Chris Angelico wrote:
>>
>> The two methods could have been done as a single method, __construct__,
>> in which you get passed a cls instead of a self, and you call
>> self=super().__construct__() and then initialize stuff.
>
> That would be called __new__ in Python. There's no *need* to use __init__
> for anything (except old-style classic classes in Python 2).

While there may not be a /need/ for two, having two is quite handy.  Having __new__ take care of the nuts and bolts (or 
foundation, as Terry put it), and being able to further customize with __init__ (where the kitchen goes, how many 
bedrooms, to follow along with Terry) is quite useful.  One of my favorite Enum recipes uses that pattern to have some 
basic behavior, with some other behavior that is easily overridable/extendable [1].
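
A rough sketch of that division of labour, using plain classes rather than the Enum recipe in [1] (Record, Person and Order are made-up names): the base class does the bookkeeping in __new__, and each subclass customises freely in __init__ without having to call up to the base.

```python
import itertools


class Record:
    _ids = itertools.count(1)

    def __new__(cls, *args, **kwargs):
        self = super().__new__(cls)
        self.record_id = next(cls._ids)   # the "nuts and bolts"
        return self


class Person(Record):
    def __init__(self, name):             # customisation only
        self.name = name


class Order(Record):
    def __init__(self, item, qty=1):
        self.item = item
        self.qty = qty


alice = Person("Alice")
order = Order("tea", qty=2)
```

Neither subclass mentions record_id, yet both instances have one.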

--
~Ethan~

[1] http://stackoverflow.com/q/19330460/208880



#65195

From: Ethan Furman <ethan@stoneleaf.us>
Date: 2014-01-31 20:55 -0800
Message-ID: <mailman.6266.1391233325.18130.python-list@python.org>
In reply to: #65177
On 01/31/2014 08:35 PM, Chris Angelico wrote:
> On Sat, Feb 1, 2014 at 2:42 PM, Steven D'Aprano wrote:
>>
>> Thus, two methods. __new__ constructs (creates, allocates) a new object;
>> __init__ initialises it after the event.
>
> Yes, but if you think in terms of abstractions, they're both just
> steps in the conceptual process of "creating the object".

> So maybe it's best to talk about the two methods collectively as
> "the constructor", and then let people call the two parts whatever
> they will.

> I do like the idea of calling __init__ the initializer. The downside
> of calling __new__ the constructor is that it'll encourage C++ and
> Java programmers to override it and get themselves confused

Why do we worry so about other languages?  If and when I go to learn C++ or Lisp, I do not expect their devs to be 
worrying about making their terminology match Python's.  Part of the effort is in learning what the terms mean, what the 
ideology is, the differences, the similarities, etc., etc.

--
~Ethan~



#65211

From: Ned Batchelder <ned@nedbatchelder.com>
Date: 2014-02-01 07:28 -0500
Message-ID: <mailman.6275.1391257695.18130.python-list@python.org>
In reply to: #65177
On 1/31/14 10:42 PM, Steven D'Aprano wrote:
> On Fri, 31 Jan 2014 14:52:15 -0500, Ned Batchelder wrote:
>
>> Why can't we call __init__ the constructor and __new__ the allocator?
>
> __new__ constructs the object, and __init__ initialises it. What's wrong
> with calling them the constructor and initialiser? Is this such a
> difficult concept that the average programmer can't learn it?
>
> I've met people who have difficulty with OOP principles, at least at
> first. But once you understand the idea of objects, it isn't that hard to
> understand the idea that:
>
> - first, the object has to be created, or constructed, or allocated
>    if you will;
>
> - only then can it be initialised.
>
> Thus, two methods. __new__ constructs (creates, allocates) a new object;
> __init__ initialises it after the event.
>
> (In hindsight, it was probably a mistake for Python to define two create-
> an-object methods, although I expect it was deemed necessary for
> historical reasons. Most other languages make do with a single method,
> Objective-C being an exception with "alloc" and "init" methods.)
>
>
>
> Earlier in this post, you wrote:
>
>> But that distinction [between __new__ and __init__] isn't useful in
>> most programs.
>
> Well, I don't know about that. I guess it depends on what sort of objects
> you're creating. If you're creating immutable objects, then the
> distinction is vital. If you're subclassing from immutable built-ins, of
> which there are a few, the distinction may be important. If you're using
> the object-pool design pattern, the distinction is also vital. It's not
> *rare* to care about these things.
>
>
>> The thing most people mean by "constructor" is "the method that gets
>> invoked right at the beginning of the object's lifetime, where you can
>> add code to initialize it properly."  That describes __init__.
>
> "Most people". I presume you've done a statistically valid survey then
> *wink*
>
> It *better* describes __new__, because it is *not true* that __init__
> gets invoked "right at the beginning of the object's lifetime". Before
> __init__ is invoked, the object's lifetime has already begun, inside the
> call to __new__. Excluding metaclass shenanigans, the object lifetime
> goes:
>
>
> Prior to the object existing:
> - static method __new__ called on the class[1]
> - __new__ creates the object[2]  <=== start of object lifetime
>
> Within the object's lifetime:
> - the rest of the __new__ method runs, which may perform arbitrarily
>    complex manipulations of the object;
> - __new__ exits, returning the object
> - __init__ runs
>
>
> So __init__ does not occur *right at the beginning*, and it is completely
> legitimate to write your classes using only __new__. You must use __new__
> for immutable objects, and you may use __new__ for mutable ones. __init__
> may be used by convention, but it is entirely redundant.
>
> I do not buy the argument made by some people that Python ought to follow
> whatever (possibly inaccurate or misleading) terminology other languages
> use. Java and Ruby have the exact same argument passing conventions as
> Python, but one calls it "call by value" and the other "call by
> reference", and neither is the same meaning of "call by value/reference"
> as used by Pascal, C, Visual Basic, or other languages. So which
> terminology should Python use? Both C++ and Haskell have "functors", but
> they are completely different things. What Python calls a class method,
> Java calls a static method. We could go on for days, just listing
> differences in terminology.
>
> In Python circles, using "constructor" for __new__ and "initialiser" for
> __init__ are well-established. In the context of Python, they make good
> sense: __new__ creates ("constructs") the object, and __init__
> _init_ialises it. Missing the opportunity to link the method name
> __init__ to *initialise* would be a mistake.
>
> We can decry the fact that computer science has not standardised on a
> sensible set of names for concepts, but on the other hand since the
> semantics of languages differ slightly, it would be more confusing to try
> to force all languages to use the same words for slightly different
> concepts.
>
> The reality is, if you're coming to Python from another language, you're
> going to have to learn a whole lot of new stuff anyway, so having to
> learn a few language-specific terms is just a small incremental cost. And
> if you have no idea about other languages, then it is no harder to learn
> that __new__ / __init__ are the constructor/initialiser than it would be
> to learn that they are the allocator/constructor or preformulator/
> postformulator.
>
> I care about using the right terminology that will cause the least amount
> of cognitive dissonance to users' understanding of Python, not whether
> they have to learn new terminology, and in the context of Python's object
> model, "constructor" and "initialiser" best describe what __new__ and
> __init__ do.
>

My summary of our two views is this:  I am trying to look at things from 
a typical programmer's point of view.  The existence of __new__ is an 
advanced topic that many programmers never encounter.  Taking a quick 
scan through some large projects (Django, edX, SQLAlchemy, mako), the 
ratio of __new__ implementations to __init__ implementations ranges from 
0% to 1.5%, which falls into "rare" territory for me.  Among programs 
less than 5000 lines long, I'm sure the number is indistinguishable from 
0, though I'm sure someone will question my methodology here as well! :)

You are looking at things from an accurate-down-to-the-last-footnote 
detailed point of view (and have provided some footnotes!).  That's a 
very valuable and important point of view.  It's just not how most 
programmers approach the language.

We are also both trying to reduce cognitive dissonance, but again, you 
are addressing language mavens who understand the footnotes, and I am 
trying to help the in-the-trenches people who have never encountered 
__new__ and are wondering why people are using funny words for the code 
they are writing.

Another difference in our approach: do you name things based on how they 
work under the hood, or how they are used?  I hope we can all agree that 
when writing a user-defined class, the code that in C++ or Java would go 
into the constructor, in Python typically goes in __init__.  When I say 
that __init__ plays the role of constructor, again, I mean from the 
typical programmer's point of view when writing typical user-defined 
classes.

Finding names for things is hard, and it's impossible to please both 
ends of this spectrum.

-- 
Ned Batchelder, http://nedbatchelder.com



#65218

From: Roy Smith <roy@panix.com>
Date: 2014-02-01 09:40 -0500
Message-ID: <roy-644DCB.09404301022014@news.panix.com>
In reply to: #65211
In article <mailman.6275.1391257695.18130.python-list@python.org>,
 Ned Batchelder <ned@nedbatchelder.com> wrote:

> The existence of __new__ is an 
> advanced topic that many programmers never encounter.  Taking a quick 
> scan through some large projects (Django, edX, SQLAlchemy, mako), the 
> ratio of __new__ implementations to __init__ implementations ranges from 
> 0% to 1.5%, which falls into "rare" territory for me. 

From our own codebase:

$ find . -name '*.py' | xargs grep 'def.*__new__' | wc -l
1
$ find . -name '*.py' | xargs grep 'def.*__init__' | wc -l
228

Doing the same searches over all the .py files in our virtualenv, I get 
2830 (__init__) vs. 50 (__new__).
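
The same kind of count can be done in Python with the standard ast module, which only counts real function definitions, whereas the grep pattern would also match a comment or string containing "def ... __new__". The helper name count_defs and the sample source are invented for this sketch.

```python
import ast


def count_defs(source, name):
    """Count function definitions called `name` in a string of Python source."""
    tree = ast.parse(source)
    return sum(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.name == name
        for node in ast.walk(tree)
    )


sample = '''
class A:
    def __init__(self):
        pass

class B:
    def __new__(cls):
        return super().__new__(cls)

    def __init__(self):
        pass
'''

init_count = count_defs(sample, "__init__")
new_count = count_defs(sample, "__new__")
```

To scan a real tree, one could read each .py file found with pathlib.Path.rglob and add up the per-file counts.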



#65219

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2014-02-01 15:07 +0000
Message-ID: <mailman.6280.1391267257.18130.python-list@python.org>
In reply to: #65218
On 01/02/2014 14:40, Roy Smith wrote:
> In article <mailman.6275.1391257695.18130.python-list@python.org>,
>   Ned Batchelder <ned@nedbatchelder.com> wrote:
>
>> The existence of __new__ is an
>> advanced topic that many programmers never encounter.  Taking a quick
>> scan through some large projects (Django, edX, SQLAlchemy, mako), the
>> ratio of __new__ implementations to __init__ implementations ranges from
>> 0% to 1.5%, which falls into "rare" territory for me.
>
>  From our own codebase:
>
> $ find . -name '*.py' | xargs grep 'def.*__new__' | wc -l
> 1
> $ find . -name '*.py' | xargs grep 'def.*__init__' | wc -l
> 228
>
> Doing the same searches over all the .py files in our virtualenv, I get
> 2830 (__init__) vs. 50 (__new__).
>

You could remove all 228 __init__ and still get your code to work by 
scattering object attributes anywhere you like, something I believe you 
can't do in C++/Java.  I doubt that you could remove the single __new__ 
and get your code to work.

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence



#65226

From: Roy Smith <roy@panix.com>
Date: 2014-02-01 11:17 -0500
Message-ID: <roy-0EAC31.11174001022014@news.panix.com>
In reply to: #65219
> On 01/02/2014 14:40, Roy Smith wrote:
> > In article <mailman.6275.1391257695.18130.python-list@python.org>,
> >   Ned Batchelder <ned@nedbatchelder.com> wrote:
> >
> >> The existence of __new__ is an
> >> advanced topic that many programmers never encounter.  Taking a quick
> >> scan through some large projects (Django, edX, SQLAlchemy, mako), the
> >> ratio of __new__ implementations to __init__ implementations ranges from
> >> 0% to 1.5%, which falls into "rare" territory for me.
> >
> >  From our own codebase:
> >
> > $ find . -name '*.py' | xargs grep 'def.*__new__' | wc -l
> > 1
> > $ find . -name '*.py' | xargs grep 'def.*__init__' | wc -l
> > 228
> >
> > Doing the same searches over all the .py files in our virtualenv, I get
> > 2830 (__init__) vs. 50 (__new__).

In article <mailman.6280.1391267257.18130.python-list@python.org>,
 Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:

> You could remove all 228 __init__ and still get your code to work by 
> scattering object attributes anywhere you like, something I believe you 
> can't do in C++/Java.

Why not?  Here's a simple C++ program which uses a constructor:

#include <stdio.h>

class Foo {
public:
    int i;

    Foo() : i(42) {}
};

int main(int, char**) {
    Foo foo;
    printf("foo.i = %d\n", foo.i);
}

If I wanted to, I could remove the constructor and still "get my code to 
work by scattering object attributes anywhere I like":

#include <stdio.h>

class Foo {
public:
    int i;
};

int main(int, char**) {
    Foo foo;
    foo.i = 42;
    printf("foo.i = %d\n", foo.i);
}
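
For comparison, a Python counterpart of the second program might look like the sketch below (the names simply mirror the C++ example). One difference worth noting: in Python the attribute does not exist at all until it is assigned, whereas the C++ member is declared in the class either way.

```python
class Foo:
    pass   # no __init__, and no attribute declarations either


foo = Foo()
had_i_before = hasattr(foo, "i")   # the attribute is not there yet
foo.i = 42                         # "scattered" assignment outside the class
message = "foo.i = %d" % foo.i
```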

> I doubt that you could remove the single __new__ and get your code to work.

Perhaps.  Looking at our own code, the one place we use __new__ is in a 
metaclass, which I think pretty well reinforces Ned's assertion that 
__new__ is an advanced topic.



#65236

From: Tim Delaney <timothy.c.delaney@gmail.com>
Date: 2014-02-02 07:09 +1100
Message-ID: <mailman.6288.1391285364.18130.python-list@python.org>
In reply to: #65177


On 1 February 2014 23:28, Ned Batchelder <ned@nedbatchelder.com> wrote:
>
> You are looking at things from an accurate-down-to-the-last-footnote
> detailed point of view (and have provided some footnotes!).  That's a very
> valuable and important point of view.  It's just not how most programmers
> approach the language.
>

This is the *language reference* that is being discussed. It documents the
intended semantics of the language. We most certainly should strive to
ensure that it is accurate-down-to-the-last-footnote - any difference
between the reference documentation and the implementation is a bug in
either the documentation or the implementation.

Tim Delaney



#65248

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-02-02 01:28 +0000
Message-ID: <52ed9f46$0$29972$c3e8da3$5496439d@news.astraweb.com>
In reply to: #65236
On Sun, 02 Feb 2014 07:09:14 +1100, Tim Delaney wrote:

> On 1 February 2014 23:28, Ned Batchelder <ned@nedbatchelder.com> wrote:
>>
>> You are looking at things from an accurate-down-to-the-last-footnote
>> detailed point of view (and have provided some footnotes!).  That's a
>> very valuable and important point of view.  It's just not how most
>> programmers approach the language.
>>
>>
> This is the *language reference* that is being discussed. It documents
> the intended semantics of the language. We most certainly should strive
> to ensure that it is accurate-down-to-the-last-footnote - any difference
> between the reference documentation and the implementation is a bug in
> either the documentation or the implementation.

+1


-- 
Steven



#65251

From: Ben Finney <ben+python@benfinney.id.au>
Date: 2014-02-02 15:27 +1100
Message-ID: <mailman.6298.1391315271.18130.python-list@python.org>
In reply to: #65177

Ned Batchelder <ned@nedbatchelder.com> writes:

> My summary of our two views is this: I am trying to look at things
> from a typical programmer's point of view.

Do you think the typical programmer will be looking in the language
reference? I don't.

> The existence of __new__ is an advanced topic that many programmers
> never encounter.

But when they do, the language reference had better be very clear on the
purpose of “__init__” and “__new__”.

> You are looking at things from an accurate-down-to-the-last-footnote
> detailed point of view (and have provided some footnotes!). That's a
> very valuable and important point of view. It's just not how most
> programmers approach the language.

Won't most programmers approach the language through (a) some example
code, (b) the tutorial, (c) the library reference? Those are appropriate
places for helpful simplifications and elisions.

> We are also both trying to reduce cognitive dissonance, but again, you
> are addressing language mavens who understand the footnotes, and I am
> trying to help the in-the-trenches people who have never encountered
> __new__ and are wondering why people are using funny words for the
> code they are writing.

Then I think your attempt to sacrifice precise terminology in the
language reference is misplaced. The in-the-trenches people won't see
it; and, when they go looking for it, they're likely to be wanting exact
language-maven-directed specifications.

> Finding names for things is hard, and it's impossible to please both
> ends of this spectrum.

Very true. That's why we have different documents for different
audiences. But yes, the terminology needs to hold up for both ends of
the spectrum, and naming is difficult.

-- 
 \         “If nature has made any one thing less susceptible than all |
  `\    others of exclusive property, it is the action of the thinking |
_o__)                          power called an idea” —Thomas Jefferson |
Ben Finney



#65296

From: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Date: 2014-02-03 12:38 +1300
Message-ID: <bl836qFh9baU1@mid.individual.net>
In reply to: #65177

Steven D'Aprano wrote:
> (In hindsight, it was probably a mistake for Python to define two create-
> an-object methods, although I expect it was deemed necessary for 
> historical reasons.

I'm not sure that all of the reasons are historical. Languages
that have a single creation/initialisation method also usually
have a mechanism for automatically calling a base version
of the method if you don't do that explicitly, and they
typically do it by statically analysing the source. That's
not so easy in a dynamic language.

If Python only had __new__, everyone who overrode it would
have to start with an explicit call to the base class's
__new__, adding a lot of boilerplate and forcing people
to learn how to make base method calls much sooner than
they would otherwise need to.
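
To make the boilerplate concrete, here is a minimal sketch of the same class written both ways in Python 3. The class names are made up for illustration, and the first version only approximates what a __new__-only world might look like.

```python
class PointViaNew(object):
    # "Only __new__" style: the override has to call the base
    # __new__ itself and remember to return the instance.
    def __new__(cls, x, y):
        self = super().__new__(cls)
        self.x = x
        self.y = y
        return self


class PointViaInit(object):
    # Usual style: the instance already exists when __init__ runs,
    # so no base call is needed just to obtain it.
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = PointViaNew(1, 2)
q = PointViaInit(1, 2)
print(p.x, p.y, q.x, q.y)
```

The two extra lines in the first version (the base call and the return) are what every override would have to repeat.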

-- 
Greg



#65302

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-02-03 00:33 +0000
Message-ID: <52eee3c5$0$29972$c3e8da3$5496439d@news.astraweb.com>
In reply to: #65296

On Mon, 03 Feb 2014 12:38:00 +1300, Gregory Ewing wrote:

> Steven D'Aprano wrote:
>> (In hindsight, it was probably a mistake for Python to define two
>> create-an-object methods, although I expect it was deemed necessary
>> for historical reasons.
> 
> I'm not sure that all of the reasons are historical. Languages that have
> a single creation/initialisation method also usually have a mechanism
> for automatically calling a base version of the method if you don't do
> that explicitly, and they typically do it by statically analysing the
> source. That's not so easy in a dynamic language.

Just because statically-typed languages do it at compile-time doesn't 
mean Python couldn't do it at run-time. All the information is readily 
available, so Python could do something like this:

# --- Untested ---
# Automatically call each __new__ constructor method, starting from
# the most fundamental (object) and ending with the current class.
stack = []
for c in cls.__mro__:
    if hasattr(c, '__new__'):
        stack.append(c.__new__)
while stack:
    stack.pop()(*args)


Note that this design is sub-optimal: the constructor methods don't 
receive the newly-created instance as an argument, which makes it hard to 
do initialisation, and makes the whole exercise rather pointless. But 
with a slight change of semantics, we can make this work rather sensibly. 
Change the signature of __new__ to:

def __new__(cls, self=None, *args, **kwargs)

and the last two lines to:

instance = None
while stack:
    instance = stack.pop()(instance, *args)


Is this a good design? Possibly not. But it's possible, and not terribly 
hard. Dynamism is no barrier to automatically calling constructors.
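
A runnable approximation of that idea, as a sketch only: because the real __new__ cannot be given the changed signature without clashing with the interpreter, the hook is called _construct here and is driven by a helper function construct(). Both names are hypothetical. The sketch also looks in each class's own namespace instead of using hasattr(), since hasattr() reports inherited methods too and would call the same one several times.

```python
def construct(cls, *args, **kwargs):
    """Call each class's own _construct hook, most fundamental first."""
    # Only hooks defined directly on a class are collected.
    stack = [vars(c)['_construct'] for c in cls.__mro__
             if '_construct' in vars(c)]
    # Assumption: allocation is done up front by object.__new__.
    instance = object.__new__(cls)
    while stack:
        instance = stack.pop()(instance, *args, **kwargs)
    return instance


class Base(object):
    def _construct(self, *args):
        self.log = ['Base']
        return self


class Child(Base):
    def _construct(self, *args):
        self.log.append('Child')
        return self


obj = construct(Child, 1, 2)
print(obj.log)
```

Every hook receives the same arguments, which is a limitation of this scheme.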

What I meant by backwards compatibility is that prior to the introduction 
of new-style classes, you couldn't override __new__, only __init__. So if 
you had a classic class, you'd have to receive the instance:

class Classic:
    def __init__(self, *args):
        ...


but for new-style classes, you'd receive the class:

class Newstyle(object):
    def __init__(cls, *args):
        ...


which is confusing and awkward, and would make it annoying to migrate 
from classic classes to new-style classes. So from the backwards-
compatibility perspective, __init__ has to receive the instance (self) as 
first argument. So the simplest way to satisfy that requirement, and 
still allow the class to define a constructor method that receives the 
class and constructs the instance, is to define a second special method. 
Which is what was done.
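
A short sketch of the resulting division of labour in Python 3, with a made-up class name and a list used only to record what each method was handed:

```python
calls = []


class Demo(object):
    def __new__(cls, *args):
        # The constructor receives the class; no instance exists yet.
        calls.append(('__new__', cls))
        return super().__new__(cls)

    def __init__(self, *args):
        # The initialiser receives the instance that __new__ returned.
        calls.append(('__init__', type(self)))
        self.args = args


d = Demo(1, 2)
print([name for name, _ in calls])
```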


> If Python only had __new__, everyone who overrode it would have to start
> with an explicit call to the base class's __new__, adding a lot of
> boilerplate and forcing people to learn how to make base method calls
> much sooner than they would otherwise need to.

There is that as well.



-- 
Steven



#65391

From: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Date: 2014-02-04 12:47 +1300
Message-ID: <blao42F4510U1@mid.individual.net>
In reply to: #65302

Steven D'Aprano wrote:
> # --- Untested ---
> # Automatically call each __new__ constructor method, starting from
> # the most fundamental (object) and ending with the current class.
> stack = []
> for c in cls.__mro__:
>     if hasattr(c, '__new__'):
>         stack.append(c.__new__)
> while stack:
>     stack.pop()(*args)

That sort of thing doesn't allow for passing different
arguments to the base __new__.

Also remember that in Python, __new__ isn't actually
required to call a matching base version at all -- it's
legal to do something quite different, such as returning
a cached instance or instantiating a different type of
object altogether.
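
Both cases can be sketched in a few lines of Python 3. The class names are invented for the example, and the cache is deliberately naive.

```python
class Cached(object):
    _instances = {}

    def __new__(cls, key):
        # Hand back an existing instance when there is one, instead of
        # always calling the base __new__.
        try:
            return cls._instances[key]
        except KeyError:
            obj = super().__new__(cls)
            cls._instances[key] = obj
            return obj


class Factory(object):
    def __new__(cls, value):
        # Return a different type altogether. Because the result is not
        # an instance of Factory, Factory.__init__ is not called on it.
        return str(value)


a = Cached('spam')
b = Cached('spam')
print(a is b)
print(type(Factory(42)).__name__)
```

Neither class could be written under a scheme that always chains to the base constructor.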

> What I meant by backwards compatibility is that prior to the introduction 
> of new-style classes, you couldn't override __new__, only __init__. So if 
> you had a classic class, you'd have to receive the instance:
> 
> class Classic:
>     def __init__(self, *args):
>         ...
> 
> but for new-style classes, you'd receive the class:
> 
> class Newstyle(object):
>     def __init__(cls, *args):
>         ...

What I'm saying is that even if all classes had been
new-style from the beginning, it's by no means certain that
Guido wouldn't have come up with the __new__/__init__
system anyway, because of the flexibility it provides.

Python isn't the only language to do something like that.
In Objective-C, the basic pattern for instantiating an
object goes like

    [[SomeClass alloc] init: args]

where 'alloc' allocates memory for the object and
'init:' or some variation thereof initialises it.
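
For comparison, the same two steps can be spelled out by hand in Python. This is only a rough sketch of what calling the class does: the real call also passes the arguments to __new__, and skips __init__ if __new__ returns something that is not an instance of the class.

```python
class SomeClass(object):
    def __init__(self, value):
        self.value = value


# Roughly what SomeClass(42) does behind the scenes:
obj = SomeClass.__new__(SomeClass)   # like 'alloc': create a bare instance
obj.__init__(42)                     # like 'init:': fill it in

print(obj.value)
```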

-- 
Greg



#65299

From: Tim Delaney <timothy.c.delaney@gmail.com>
Date: 2014-02-03 11:02 +1100
Message-ID: <mailman.6314.1391385741.18130.python-list@python.org>
In reply to: #65177


On 1 February 2014 14:42, Steven D'Aprano <
steve+comp.lang.python@pearwood.info> wrote:

> On Fri, 31 Jan 2014 14:52:15 -0500, Ned Batchelder wrote:
>
> (In hindsight, it was probably a mistake for Python to define two create-
> an-object methods, although I expect it was deemed necessary for
> historical reasons. Most other languages make do with a single method,
> Objective-C being an exception with "alloc" and "init" methods.)
>

I disagree. In nearly every language I've used which only has single-phase
construction, I've wished for two-phase construction. By the time you get
to __init__ you know the following things about the instance:

1. It is a complete instance of the subclass - there's no part of the
structure that is invalid to access (of course, many attributes might not
yet exist).

2. Calling a method from __init__ will call the subclass' method. This
allows subclasses to hook into the initialisation process by overriding
methods (of course, the subclass will need to ensure it has initialised all
the state it needs). This is generally not allowed in languages with
single-phase construction because the object is in an intermediate state.

For example, in C++ the vtable in effect while a constructor runs is that
of the class currently being constructed, not the subclass, so a virtual
call always reaches the current class' implementation of the method.

In Java you can actually call the subclass' implementation, but in that
case it will call the subclass method before the subclass constructor is
actually run, meaning that instance variables will have their default
values (null for objects). When the subclass constructor is eventually
run the instance variables will be assigned the values in the class
definition (replacing anything set by the subclass method call).
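
By contrast, the Python behaviour described in point 2 can be sketched as follows. The class and method names are made up for the example.

```python
class Base(object):
    def __init__(self):
        # The instance already has its final class by the time
        # __init__ runs, so this dispatches to the subclass's override.
        self.setup()

    def setup(self):
        self.kind = 'base'


class Child(Base):
    def setup(self):
        self.kind = 'child'


print(Child().kind)
```

The override runs on a complete instance, although any state that Child itself needs still has to be set up before it is used.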

Tim Delaney
