| Started by | Chris Angelico <rosuav@gmail.com> |
|---|---|
| First post | 2014-03-07 09:28 +1100 |
| Last post | 2014-03-07 02:31 +0200 |
| Articles | 8 — 2 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: script uses up all memory Chris Angelico <rosuav@gmail.com> - 2014-03-07 09:28 +1100
Re: script uses up all memory Marko Rauhamaa <marko@pacujo.net> - 2014-03-07 00:34 +0200
Re: script uses up all memory Chris Angelico <rosuav@gmail.com> - 2014-03-07 09:43 +1100
Re: script uses up all memory Marko Rauhamaa <marko@pacujo.net> - 2014-03-07 01:12 +0200
Re: script uses up all memory Chris Angelico <rosuav@gmail.com> - 2014-03-07 10:31 +1100
Re: script uses up all memory Marko Rauhamaa <marko@pacujo.net> - 2014-03-07 01:53 +0200
Re: script uses up all memory Chris Angelico <rosuav@gmail.com> - 2014-03-07 11:11 +1100
Re: script uses up all memory Marko Rauhamaa <marko@pacujo.net> - 2014-03-07 02:31 +0200
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-07 09:28 +1100 |
| Subject | Re: script uses up all memory |
| Message-ID | <mailman.7879.1394144925.18130.python-list@python.org> |
On Fri, Mar 7, 2014 at 9:21 AM, Larry Martell <larry.martell@gmail.com> wrote:
> First I added del(self.tools) before the Django call. That did not
> stop the memory consumption. Then I added a call to gc.collect() after
> the del and that did solve it. gc.collect() returns 0 each time, so
> I'm going to declare victory and move on. No time to dig into the
> Django code. Thanks.
Not all problems need to be solved perfectly :) But at very least, I
would put a comment against your collect() call explaining what
happens: that self.tools is involved in a refloop. Most Python code
shouldn't have to call gc.collect(), so it's worth explaining why you
are here.
ChrisA
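A minimal sketch of what such a comment might look like. The original poster's code is not shown in this thread, so the `Poller` class, its `run_check` method, and the self-referencing dict used to fake a refloop are all hypothetical stand-ins; only `del`, `self.tools`, and `gc.collect()` come from the discussion.

```python
import gc


class Poller:
    """Hypothetical stand-in for the original poster's class."""

    def run_check(self):
        # Fake some per-run state that sits in a reference loop
        # (a dict that refers to itself, purely for illustration).
        cycle = {}
        cycle["self"] = cycle
        self.tools = [cycle]
        del cycle

        del self.tools
        # self.tools is involved in a reference loop, so dropping the
        # attribute does not free the memory straight away. Force a
        # collection here so it does not pile up between runs. This can
        # go away once the loop itself has been found and broken.
        found = gc.collect()

        # ... the real work (the Django call in the original) goes here ...
        return found


poller = Poller()
print(poller.run_check() >= 1)
```

`gc.collect()` returns the number of unreachable objects it found, which is why the sketch hands that value back to the caller.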
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-03-07 00:34 +0200 |
| Message-ID | <87ha7bq7ls.fsf@elektro.pacujo.net> |
| In reply to | #67959 |
Chris Angelico <rosuav@gmail.com>:
> Not all problems need to be solved perfectly :) But at very least, I
> would put a comment against your collect() call explaining what
> happens: that self.tools is involved in a refloop. Most Python code
> shouldn't have to call gc.collect(), so it's worth explaining why you
> are here.
Refloops also are nothing to be avoided. Let GC do its job and forget
about it.
Marko
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-07 09:43 +1100 |
| Message-ID | <mailman.7880.1394145824.18130.python-list@python.org> |
| In reply to | #67960 |
On Fri, Mar 7, 2014 at 9:34 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Chris Angelico <rosuav@gmail.com>:
>
>> Not all problems need to be solved perfectly :) But at very least, I
>> would put a comment against your collect() call explaining what
>> happens: that self.tools is involved in a refloop. Most Python code
>> shouldn't have to call gc.collect(), so it's worth explaining why you
>> are here.
>
> Refloops also are nothing to be avoided. Let GC do its job and forget
> about it.
I think this thread is proof that they are to be avoided. The GC
wasn't doing its job unless explicitly called on. The true solution is
to break the refloop; the quick fix is to call gc.collect(). I stand
by the recommendation to put an explanatory comment against the
collect call. [1]
ChrisA
[1] Here in Australia, that should be gc.reverse_charges().
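The difference between the quick fix and the true solution can be sketched as follows. The `Node` class and `make_loop` helper are made up for illustration, and the behaviour shown relies on CPython's reference counting; other implementations may free objects at different times.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at a partner object."""

    def __init__(self):
        self.partner = None


def make_loop():
    a, b = Node(), Node()
    a.partner = b
    b.partner = a  # a -> b -> a: a reference loop
    return a, b


gc.disable()  # keep automatic collection out of the demonstration

# Quick fix: drop the names, then ask the collector to find the loop.
a, b = make_loop()
probe = weakref.ref(a)
del a, b
alive_before_collect = probe() is not None  # refcounts never hit zero
gc.collect()
alive_after_collect = probe() is not None

# True solution: break the loop first, so refcounting frees both objects.
a, b = make_loop()
probe2 = weakref.ref(a)
a.partner = None
del a, b
alive_after_break = probe2() is not None

gc.enable()
print(alive_before_collect, alive_after_collect, alive_after_break)
```

On CPython this prints `True False False`: the looped pair lingers until `gc.collect()` runs, while the pair whose loop was broken is gone as soon as the last name is deleted.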
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-03-07 01:12 +0200 |
| Message-ID | <87d2hyrkf4.fsf@elektro.pacujo.net> |
| In reply to | #67961 |
Chris Angelico <rosuav@gmail.com>:
> I think this thread is proof that they are to be avoided. The GC
> wasn't doing its job unless explicitly called on. The true solution is
> to break the refloop; the quick fix is to call gc.collect(). I stand
> by the recommendation to put an explanatory comment against the
> collect call.
What I'm saying is that under most circumstances you shouldn't care if
the memory consumption goes up and down. The true solution is to not do
anything about temporary memory consumption. Also, you shouldn't worry
about breaking circular references. That is also often almost impossible
to accomplish as so much modern code builds on closures, which generate
all kinds of circular references under the hood -- for your benefit, of
course.
Marko
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-07 10:31 +1100 |
| Message-ID | <mailman.7881.1394148716.18130.python-list@python.org> |
| In reply to | #67962 |
On Fri, Mar 7, 2014 at 10:12 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Chris Angelico <rosuav@gmail.com>:
>
>> I think this thread is proof that they are to be avoided. The GC
>> wasn't doing its job unless explicitly called on. The true solution is
>> to break the refloop; the quick fix is to call gc.collect(). I stand
>> by the recommendation to put an explanatory comment against the
>> collect call.
>
> What I'm saying is that under most circumstances you shouldn't care if
> the memory consumption goes up and down. The true solution is to not do
> anything about temporary memory consumption. Also, you shouldn't worry
> about breaking circular references. That is also often almost impossible
> to accomplish as so much modern code builds on closures, which generate
> all kinds of circular references under the hood -- for your benefit, of
> course.
This isn't a temporary issue, though - see the initial post. After two
hours of five-minutely checks, the computer was wedged. That's a
problem to be solved.
Most of what I do with closures can't create refloops, because the
function isn't referenced from inside itself. You'd need something
like this:
>>> import gc
>>> def foo():
...     x=1
...     y=lambda: (x,y)
...     return y
>>> len([foo() for _ in range(1000)])
1000
>>> gc.collect()
4000
>>> len([foo() for _ in range(1000)])
1000
>>> gc.collect()
4000
>>> len([foo() for _ in range(1000)])
1000
>>> gc.collect()
4000
That's repeatably creating garbage. But change the function to not
return itself, and there's no loop:
>>> def foo():
...     x=1
...     y=lambda: x
...     return y
>>> gc.collect()
0
>>> len([foo() for _ in range(1000)])
1000
>>> gc.collect()
0
>>> len([foo() for _ in range(1000)])
1000
>>> gc.collect()
0
The only even reasonably common case that I can think of is a
recursive nested function:
>>> def foo(x):
...     def y(f,x=x):
...         f()
...         for _ in range(x): y(f,x-1)
...     return y
It's a function that returns a function that calls its argument some
number of times, where the number is derived in a stupid way from the
argument to the first function. The whole function is garbage, so it's
not surprising that the GC has to collect it.
>>> len([foo(5) for _ in range(1000)])
1000
>>> gc.collect()
3135
>>> len([foo(5) for _ in range(1000)])
1000
>>> gc.collect()
3135
>>> len([foo(5) for _ in range(1000)])
1000
>>> gc.collect()
3135
Can you give a useful example of a closure that does create a refloop?
ChrisA
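The first two interactive experiments above can be rerun as a script along these lines. The names `looping`, `plain`, and `garbage_from` are invented for this sketch, and the exact counts depend on the Python version, so no specific numbers are claimed here.

```python
import gc


def looping():
    x = 1
    y = lambda: (x, y)  # y refers to itself through its own closure cell
    return y


def plain():
    x = 1
    y = lambda: x  # y only refers to x, so there is no loop
    return y


def garbage_from(factory, n=1000):
    """Create and drop n closures; return how many objects gc had to find."""
    was_enabled = gc.isenabled()
    gc.collect()  # start from a clean slate
    gc.disable()  # stop automatic runs from stealing the count
    try:
        for _ in range(n):
            factory()
        return gc.collect()
    finally:
        if was_enabled:
            gc.enable()


print("self-referencing closure:", garbage_from(looping))
print("ordinary closure:", garbage_from(plain))
```

The self-referencing version should report at least one collected object per closure created, while the ordinary closure leaves nothing for the collector because reference counting has already freed it.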
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-03-07 01:53 +0200 |
| Message-ID | <878usmrijk.fsf@elektro.pacujo.net> |
| In reply to | #67963 |
Chris Angelico <rosuav@gmail.com>:
> Can you give a useful example of a closure that does create a refloop?
Just the other day, I mentioned the state pattern:
class MyStateMachine:
    def __init__(self):
        sm = self

        class IDLE:
            def ding(self):
                sm.open_door()
                sm.state = AT_DOOR()

        class AT_DOOR:
            ...

        self.state = IDLE()

    def ding(self):
        self.state.ding()
So we have:
MyStateMachine instance
-> MyStateMachine instance.ding
-> IDLE instance
-> IDLE instance.ding
-> MyStateMachine instance
plus numerous others in this example alone.
In general, event-driven programming produces circular references left
and right, and that might come into wider use with asyncio.
I suspect generators might create circular references as well.
Any tree data structure with parent references creates cycles.
In fact, I would imagine most OO designs create a pretty tight mesh of
back-and-forth references.
Marko
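For the tree case, one common way to keep parent references without forming a cycle is to hold the parent weakly. This is only a sketch: the `TreeNode` class is hypothetical, and whether avoiding the cycle is worth the trouble is exactly what this thread is arguing about.

```python
import weakref


class TreeNode:
    """Hypothetical tree node: children are strong refs, parent is weak."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self._parent = None
        if parent is not None:
            self._parent = weakref.ref(parent)
            parent.children.append(self)

    @property
    def parent(self):
        # Returns None for the root, or once the parent has been freed.
        return self._parent() if self._parent is not None else None


root = TreeNode("root")
leaf = TreeNode("leaf", parent=root)
parent_name = leaf.parent.name

probe = weakref.ref(root)
del root  # nothing holds root strongly any more
print(parent_name, probe() is None, leaf.parent)
```

On CPython the root is freed as soon as its last strong reference goes away, with no collector run needed, and `leaf.parent` then reports `None`.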
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-03-07 11:11 +1100 |
| Message-ID | <mailman.7883.1394151084.18130.python-list@python.org> |
| In reply to | #67964 |
On Fri, Mar 7, 2014 at 10:53 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
> Chris Angelico <rosuav@gmail.com>:
>
>> Can you give a useful example of a closure that does create a refloop?
>
> Just the other day, I mentioned the state pattern:
>
> class MyStateMachine:
>     def __init__(self):
>         sm = self
>
>         class IDLE:
>             def ding(self):
>                 sm.open_door()
>                 sm.state = AT_DOOR()
Yeah, that's an extremely unusual way to do things. Why keep on
instantiating objects when you could just reference functions?
> In general, event-driven programming produces circular references left
> and right, and that might come into wider use with asyncio.
Nope; certainly not with closures. I do a whole lot of event-driven
programming (usually in Pike rather than Python, but they work the
same way in this), and there's no reference loop. Properly-done
event-driven programming should have two basic states: a reference
from some invisible thing that can trigger the event (eg a GUI widget)
to a callable, and a reference from that callable to its state. Once
the trigger is gone, the callable is dropped, its state is dropped,
and everything's cleaned up. You don't usually need a reference inside
the function to that function.
Don't forget, a closure need only hang onto the things it actually
uses. It doesn't need all its locals.
> I suspect generators might create circular references as well.
I doubt it.
>>> def foo(x):
...     return ("x"*i for i in range(x))
>>> len([foo(5) for _ in range(1000)])
1000
>>> gc.collect()
0
>>> len([foo(5) for _ in range(1000)])
1000
>>> gc.collect()
0
Again, unless it keeps a reference to itself, there's no loop. It'll
need to hang onto some of its locals, but that's all.
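For contrast, here is a sketch of a generator that does keep a reference to itself. The `self_aware` generator is contrived for illustration (the caller sends the generator object back into its own frame), and the timing shown is specific to CPython.

```python
import gc
import weakref


def counter():
    """Plain generator: nothing inside it refers back to the generator."""
    n = 0
    while True:
        yield n
        n += 1


def self_aware():
    """Generator whose frame ends up holding the generator itself."""
    me = yield  # the caller sends the generator object back in
    while True:
        yield me


gc.disable()  # keep automatic collection out of the demonstration

plain = counter()
next(plain)
plain_probe = weakref.ref(plain)
del plain
plain_alive = plain_probe() is not None  # freed by refcounting alone

loopy = self_aware()
next(loopy)
loopy.send(loopy)  # now generator -> frame -> local 'me' -> generator
loopy_probe = weakref.ref(loopy)
del loopy
loopy_alive = loopy_probe() is not None  # the loop keeps it alive
gc.collect()
loopy_alive_after = loopy_probe() is not None

gc.enable()
print(plain_alive, loopy_alive, loopy_alive_after)
```

On CPython this prints `False True False`: only the generator that was handed a reference to itself needs the cycle collector.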
> Any tree data structure with parent references creates cycles.
Yes, but how many of those do you actually have and drop? If you
create a GUI, you generally hold your entire widget tree stably. The
only issue is if you create a parent-child subtree and then drop it.
That shouldn't be being done in a tight loop. Most of the classic data
structures like trees are implemented at the C level, so again, your
code shouldn't be concerning itself with that.
> In fact, I would imagine most OO designs create a pretty tight mesh of
> back-and-forth references.
Examples, please? I can think of a handful of situations where I've
created reference loops, and they're sufficiently rare that I can put
comments against them and explicitly break them. For instance, I have
a "Subwindow" that has a "Connection". My window can have multiple
subwindows, a subwindow may or may not have a connection, and the
connection always references its subwindow. The subw->connection->subw
loop is explicitly broken when the connection is terminated. If the
window chooses to drop a subw, it first checks if there's a connection
(and prompts the user to confirm), and then will explicitly
disconnect, which breaks the refloop (as the connection's terminated).
I did a similar thing at work, again with explicit refloop breakage to
ensure clean removal. Apart from those two cases, I can't think of
anything in the last ten years where I've had a data structure with a
loop in it, where the whole loop could be dropped. (My MUD has a loop,
in that a character exists in a room, and the room keeps track of its
contents; but it's not logical to drop a room with characters in it,
and dropping a character is done by moving it to no-room, which breaks
the refloop.)
ChrisA
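The subwindow and connection arrangement described above might look roughly like this in Python. The class and method names are guesses based on the description (the real code is not shown and may well be in Pike), and the step that prompts the user for confirmation is left out.

```python
import weakref


class Connection:
    def __init__(self, subw):
        self.subw = subw  # connection -> subwindow


class Subwindow:
    def __init__(self):
        self.connection = None

    def connect(self):
        # subwindow -> connection -> subwindow: this creates the refloop.
        self.connection = Connection(self)

    def disconnect(self):
        conn = self.connection
        if conn is not None:
            conn.subw = None  # explicitly break the refloop
            self.connection = None


class Window:
    def __init__(self):
        self.subwindows = []

    def drop(self, subw):
        subw.disconnect()  # always disconnect before dropping
        self.subwindows.remove(subw)


window = Window()
subw = Subwindow()
window.subwindows.append(subw)
subw.connect()

probe = weakref.ref(subw)
window.drop(subw)
del subw
print(probe() is None)
```

Because `drop` disconnects first, the subwindow and its connection no longer refer to each other by the time the last reference goes away, so on CPython both are freed without any help from the cycle collector.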
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-03-07 02:31 +0200 |
| Message-ID | <874n3argry.fsf@elektro.pacujo.net> |
| In reply to | #67966 |
Chris Angelico <rosuav@gmail.com>:
> On Fri, Mar 7, 2014 at 10:53 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
>> class MyStateMachine:
>>     def __init__(self):
>>         sm = self
>>
>>         class IDLE:
>>             def ding(self):
>>                 sm.open_door()
>>                 sm.state = AT_DOOR()
>
> Yeah, that's an extremely unusual way to do things. Why keep on
> instantiating objects when you could just reference functions?
That's not crucial. Even if the state objects were instantiated and
inner classes not used, you'd get the same circularity:
class State:
    def __init__(self, sm):
        self.sm = sm

class Idle(State):
    def ding(self):
        self.sm.open_door()
        self.sm.state = self.sm.AT_DOOR

class AtDoor(State):
    ...

class MyStateMachine:
    def __init__(self):
        self.IDLE = Idle(self)
        self.AT_DOOR = AtDoor(self)
        ...
        self.state = self.IDLE
The closure style is more concise and to the point and might perform no
worse.
> Nope; certainly not with closures. I do a whole lot of event-driven
> programming (usually in Pike rather than Python, but they work the
> same way in this), and there's no reference loop. Properly-done
> event-driven programming should have two basic states: a reference
> from some invisible thing that can trigger the event (eg a GUI widget)
> to a callable, and a reference from that callable to its state. Once
> the trigger is gone, the callable is dropped, its state is dropped,
> and everything's cleaned up. You don't usually need a reference inside
> the function to that function.
I'm more familiar with networking. If you need a timer, you need to be
able to start it so you need a reference to it. Ok, maybe you
instantiate a new timer each time, but you may need to cancel the timer
so starting the timer gives you a ticket you can use for canceling.
Similarly, you need a socket (wrapper) to signal an I/O state change,
and you also need to be able to close the socket at a bare minimum.
The task scheduling service (asyncio has one) collects thunks that refer
to your objects and your objects have a reference to the task scheduling
service to be able to schedule new tasks.
> Don't forget, a closure need only hang onto the things it actually
> uses. It doesn't need all its locals.
More importantly, there's nothing bad in circularity. No need to avoid
it. No need to cut cords.
Marko
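The timer "ticket" pattern can be sketched with asyncio, where `loop.call_later()` returns a handle that can be cancelled. The `Session` class and the timeout values are hypothetical; the point is only to show where the back-and-forth references come from.

```python
import asyncio


class Session:
    """Hypothetical network session with an idle timeout."""

    def __init__(self, loop, timeout):
        self.loop = loop  # session -> scheduler
        self.timeout = timeout
        self.expired = False
        self.timer = None

    def start_timer(self):
        # call_later returns a TimerHandle, the "ticket" used to cancel.
        # The loop holds the handle, the handle holds the bound method,
        # and the bound method holds this session, which holds both the
        # loop and the handle: references in every direction.
        self.timer = self.loop.call_later(self.timeout, self.on_timeout)

    def cancel_timer(self):
        if self.timer is not None:
            self.timer.cancel()
            self.timer = None

    def on_timeout(self):
        self.expired = True
        self.timer = None


async def main():
    loop = asyncio.get_running_loop()
    fired = Session(loop, 0.01)
    cancelled = Session(loop, 0.01)
    fired.start_timer()
    cancelled.start_timer()
    cancelled.cancel_timer()  # use the ticket before the timer fires
    await asyncio.sleep(0.05)
    return fired.expired, cancelled.expired


result = asyncio.run(main())
print(result)
```

The first session's timer fires and the second is cancelled in time, so the result is `(True, False)`. In this sketch both `on_timeout` and `cancel_timer` set `self.timer` back to `None`, which happens to drop the session's reference to the handle once the timer is no longer needed.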