Groups > comp.lang.python > #107040 > unrolled thread
| Started by | cshintov@gmail.com |
|---|---|
| First post | 2016-04-15 03:25 -0700 |
| Last post | 2016-04-15 09:27 -0500 |
| Articles | 5 — 5 participants |
Python garbage collection: not releasing memory to OS! cshintov@gmail.com - 2016-04-15 03:25 -0700
Re: Python garbage collection: not releasing memory to OS! Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2016-04-15 08:24 -0400
Re: Python garbage collection: not releasing memory to OS! Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-04-15 14:02 +0100
Re: Python garbage collection: not releasing memory to OS! Michael Torrie <torriem@gmail.com> - 2016-04-15 07:59 -0600
Re: Python garbage collection: not releasing memory to OS! Sam <python@net153.net> - 2016-04-15 09:27 -0500
| From | cshintov@gmail.com |
|---|---|
| Date | 2016-04-15 03:25 -0700 |
| Subject | Python garbage collection: not releasing memory to OS! |
| Message-ID | <215cf2a4-c2fa-41d0-8c49-23b9494234d4@googlegroups.com> |
I have written an application with Flask that uses Celery for a long-running task. While load testing, I noticed that the Celery tasks do not release memory even after they complete. So I googled and found this group discussion:
https://groups.google.com/forum/#!topic/celery-users/jVc3I3kPtlw
That discussion says that's how Python works.
Also the article at https://hbfs.wordpress.com/2013/01/08/python-memory-management-part-ii/ says
"But from the OS's perspective, your program's size is the total (maximum) memory allocated to Python. Since Python returns memory to the OS on the heap (that allocates other objects than small objects) only on Windows, if you run on Linux, you can only see the total memory used by your program increase."
And I use Linux. So I wrote the script below to verify it.

    import gc

    def memory_usage_psutil():
        # print the memory usage in MB
        import resource
        print('Memory usage: %s (MB)' % (resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1000.0))

    def fileopen(fname):
        memory_usage_psutil()  # 10 MB
        f = open(fname)
        memory_usage_psutil()  # 10 MB
        content = f.read()
        memory_usage_psutil()  # 14 MB

    def fun(fname):
        memory_usage_psutil()  # 10 MB
        fileopen(fname)
        gc.collect()
        memory_usage_psutil()  # 14 MB

    import sys
    from time import sleep

    if __name__ == '__main__':
        fun(sys.argv[1])
        for _ in range(60):
            gc.collect()
            memory_usage_psutil()  # 14 MB ...
            sleep(1)

The input was a 4MB file. Even after returning from the 'fileopen' function the 4MB memory was not released. I checked htop output while the loop was running, the resident memory stays at 14MB. So unless the process is stopped the memory stays with it.
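
A caveat on the measurement above: `ru_maxrss` is the peak resident set size, a high-water mark that cannot go down, so the script would keep printing 14 MB even if memory had been returned (htop's RES column does show current usage). The following is a minimal sketch that reports both figures, assuming Linux with `/proc` mounted. The helper names `peak_rss_mb` and `current_rss_mb` are illustrative and not part of the original post.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of this process in MB (a high-water mark)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return peak / divisor


def current_rss_mb():
    """Current resident set size in MB, or None if /proc is unavailable."""
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1]) / 1024.0  # value is in kB
    except (IOError, OSError):
        pass
    return None


if __name__ == "__main__":
    data = b"x" * (64 * 1024 * 1024)  # allocate and fill 64 MB
    print("after alloc: peak=%.1f current=%s" % (peak_rss_mb(), current_rss_mb()))
    del data
    print("after free:  peak=%.1f current=%s" % (peak_rss_mb(), current_rss_mb()))
```

Comparing the two numbers after the `del` is the interesting part: the peak can only stay flat, while the current figure shows whether anything actually went back to the OS.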
So if the celery worker is not killed after its task is finished it is going to keep the memory for itself. I know I can use **max_tasks_per_child** config value to kill the process and spawn a new one. **Is there any other way to return the memory to OS from a python process?.**
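
For reference, the worker-recycling option mentioned above might be configured roughly as follows. This is only a sketch of a Celery settings module, assuming the lowercase setting names used by Celery 4 and later (Celery 3.x spelled the first one `CELERYD_MAX_TASKS_PER_CHILD`). The module name and the numbers are illustrative.

```python
# celeryconfig.py (hypothetical settings module; values are illustrative)

# Replace each worker process after it has executed this many tasks.
worker_max_tasks_per_child = 50

# Celery 4+ only: replace a worker process once its resident memory
# exceeds this many kilobytes. The running task is allowed to finish first.
worker_max_memory_per_child = 200000
```

Both settings work by ending the process, so they are variations on the workaround described above and not a way to shrink a live process.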
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2016-04-15 08:24 -0400 |
| Message-ID | <mailman.20.1460723049.6324.python-list@python.org> |
| In reply to | #107040 |
On Fri, 15 Apr 2016 03:25:39 -0700 (PDT), cshintov@gmail.com declaimed the following:

>So if the celery worker is not killed after its task is finished it is going to keep the memory for itself. I know I can use **max_tasks_per_child** config value to kill the process and spawn a new one. **Is there any other way to return the memory to OS from a python process?.**

How would you do that from any other process? If it's easy, I'd think the Python devs would have considered adding it to the run-time.

IOWs, the question is more likely to be "how does the OS reclaim memory from running processes". In many operating systems -- it only does that when cleaning up the entire process space. Memory allocated for the heap seldom is reclaimed piecemeal as heap objects may be scattered throughout the allocation and can't be moved.

No idea how Windows could be different. Possibly it only counts pages that actually have live data (how it determines that I don't know -- maybe malloc/free use a system-wide pool of virtual memory so when a large enough block of adjacent addresses is freed the entire page is returned to the system, even though the process thinks it is using a contiguous allotment).

--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/
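
On the question of how any process would hand heap memory back: on glibc-based Linux systems the C library exposes `malloc_trim`, which asks the allocator to release free pages to the OS. The sketch below calls it through `ctypes`, assuming glibc is the C library in use. The wrapper name `trim_heap` is illustrative. Whether the call helps depends on heap fragmentation, and it does not touch the arenas managed by CPython's own small-object allocator, so treat it as an experiment and not a guaranteed fix.

```python
import ctypes
import ctypes.util


def trim_heap():
    """Ask glibc to return free heap pages to the OS.

    Returns True if some memory was released, False if none was, and
    None when malloc_trim is not available (for example on non-glibc
    platforms).
    """
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return None
    try:
        libc = ctypes.CDLL(libc_name)
        malloc_trim = libc.malloc_trim
    except (OSError, AttributeError):
        return None
    malloc_trim.argtypes = [ctypes.c_size_t]
    malloc_trim.restype = ctypes.c_int
    # The argument is the amount of padding to leave at the top of the heap.
    return bool(malloc_trim(0))
```

A worker could call `trim_heap()` after a large task and compare resident memory before and after to see whether it makes a difference on that system.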
| From | Oscar Benjamin <oscar.j.benjamin@gmail.com> |
|---|---|
| Date | 2016-04-15 14:02 +0100 |
| Message-ID | <mailman.23.1460725372.6324.python-list@python.org> |
| In reply to | #107040 |
On 15 April 2016 at 11:25, <cshintov@gmail.com> wrote:

> The input was a 4MB file. Even after returning from the 'fileopen' function the 4MB memory was not released. I checked htop output while the loop was running, the resident memory stays at 14MB. So unless the process is stopped the memory stays with it.

When exactly memory gets freed to the OS is unclear but it's possible that your process can reuse the same bits of memory. The real question is whether continuously allocating and deallocating leads to steadily growing memory usage. If you change it so that your code calls fun inside the loop you will see that repeatedly calling fun does not lead to growing memory usage.

> So if the celery worker is not killed after its task is finished it is going to keep the memory for itself. I know I can use **max_tasks_per_child** config value to kill the process and spawn a new one. **Is there any other way to return the memory to OS from a python process?.**

I don't really understand what you're asking here. You're running celery in a subprocess right? Is the problem about the memory used by subprocesses that aren't killed or is it the memory usage of the Python process?

--
Oscar
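
The experiment suggested here, repeating the allocation inside the loop and watching whether usage keeps climbing, could be sketched as follows. It is a stand-in and not the original script: it allocates a buffer in memory where the original read a file, and the function names and default sizes are assumptions made for illustration.

```python
import gc
import resource


def allocate_and_drop(size):
    """Create a throwaway buffer of `size` bytes; it is freed on return."""
    content = b"x" * size
    return len(content)


def peak_rss_samples(rounds=20, size=4 * 1024 * 1024):
    """Record the peak RSS reported by ru_maxrss after each round."""
    samples = []
    for _ in range(rounds):
        allocate_and_drop(size)
        gc.collect()
        samples.append(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    return samples


if __name__ == "__main__":
    samples = peak_rss_samples()
    print("first round: %s, last round: %s" % (samples[0], samples[-1]))
```

If freed memory is being reused, the samples should level off after the first round or two. A leak would show up as a figure that keeps rising with every round.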
| From | Michael Torrie <torriem@gmail.com> |
|---|---|
| Date | 2016-04-15 07:59 -0600 |
| Message-ID | <mailman.25.1460729198.6324.python-list@python.org> |
| In reply to | #107040 |
On 04/15/2016 04:25 AM, cshintov@gmail.com wrote:

> The input was a 4MB file. Even after returning from the 'fileopen' function the 4MB memory was not released. I checked htop output while the loop was running, the resident memory stays at 14MB. So unless the process is stopped the memory stays with it.

I guess the question is, why is this a problem? If there are no leaks, then I confess I don't understand what your concern is. And indeed you say it's not leaking as it never rises above 14 MB.

Also there are ways of reading a file without allocating huge amounts of memory. Why not read it in in chunks, or in lines. Take advantage of Python's generator facilities to process your data.

> So if the celery worker is not killed after its task is finished it is going to keep the memory for itself. I know I can use **max_tasks_per_child** config value to kill the process and spawn a new one. **Is there any other way to return the memory to OS from a python process?.**

Have you tried using the subprocess module of python? If I understand it correctly, this would allow you to run python code as a subprocess (completely separate process), which would be completely reaped by the OS when it's finished.
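
The chunked-reading suggestion might look something like this. It is a minimal sketch: the generator name, the 64 KB chunk size, and the byte-counting consumer are all illustrative choices, and the real task would replace `count_bytes` with its own per-chunk processing.

```python
def read_in_chunks(path, chunk_size=64 * 1024):
    """Yield successive chunks of a file so only one chunk is held at a time."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def count_bytes(path):
    """Example consumer: total size computed without loading the whole file."""
    return sum(len(chunk) for chunk in read_in_chunks(path))
```

With this shape the process's peak memory is bounded by the chunk size and not by the file size, which sidesteps the question of whether the memory is later returned.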
| From | Sam <python@net153.net> |
|---|---|
| Date | 2016-04-15 09:27 -0500 |
| Message-ID | <mailman.27.1460731107.6324.python-list@python.org> |
| In reply to | #107040 |
On 04/15/2016 05:25 AM, cshintov@gmail.com wrote:

> So if the celery worker is not killed after its task is finished it is going to keep the memory for itself. I know I can use **max_tasks_per_child** config value to kill the process and spawn a new one. **Is there any other way to return the memory to OS from a python process?.**

With situations like this, I normally just fork and do the mem intensive work in the child and then kill it off when done. Might be able to use a thread instead of a fork. But not sure how well all that would work with celery.

--Sam
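
The fork-and-discard approach could be sketched with the standard `multiprocessing` module as below. This assumes Python 3.4 or later on a Unix system, since it asks for the `fork` start method explicitly. The helper names are hypothetical, and, as noted above, how well this coexists with Celery's own worker processes is untested here. A thread would not give the same effect, because threads share the parent's heap.

```python
import multiprocessing


def _child_main(conn, func, args):
    """Runs in the child: compute the result and send it to the parent."""
    try:
        conn.send(func(*args))
    finally:
        conn.close()


def run_in_child(func, *args):
    """Run func(*args) in a short-lived forked child and return its result.

    Memory allocated by the child is reclaimed by the OS when it exits.
    If func raises, nothing is sent and the parent sees EOFError.
    """
    ctx = multiprocessing.get_context("fork")  # Unix only
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    child = ctx.Process(target=_child_main, args=(child_conn, func, args))
    child.start()
    child_conn.close()  # the parent keeps only the receiving end
    try:
        return parent_conn.recv()
    finally:
        child.join()
        parent_conn.close()
```

The result has to be picklable and should be small. Sending the large data back to the parent would defeat the purpose.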