
usage of os.posix_fadvise

From Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de>
Subject usage of os.posix_fadvise
Date Thu, 30 May 2013 17:54:12 +0000 (UTC)
Newsgroups comp.lang.python
Message-ID <mailman.2438.1369936471.3114.python-list@python.org> (permalink)



Antoine Pitrou wrote:

>Hi,

>Wolfgang Maier <wolfgang.maier <at> biologie.uni-freiburg.de> writes:
>> 
>> Dear all,
>> I was just experimenting for the first time with os.posix_fadvise(), which
>> is new in Python 3.3. I'm reading from a really huge file (several GB) and I
>> want to use the data only once, so I don't want OS-level page caching. I
>> tried os.posix_fadvise with the os.POSIX_FADV_NOREUSE and with the
>> os.POSIX_FADV_DONTNEED flags, but neither seemed to have any effect on the
>> caching behaviour of Ubuntu (still uses all available memory to page cache
>> my I/O).
>> Specifically, I was trying this:
>> 
>> import os
>> fd = os.open('myfile', os.O_RDONLY)
>> # wasn't sure about the len parameter in fadvise,
>> # so thought I just use it on the first 4GB
>> os.posix_fadvise(fd, 0, 4000000000, os.POSIX_FADV_NOREUSE) # or DONTNEED
>
>The Linux version of "man posix_fadvise" probably holds the answer:
>
>"In kernels before 2.6.18, POSIX_FADV_NOREUSE had the same semantics
>as POSIX_FADV_WILLNEED.  This was probably a bug; since kernel
>2.6.18, this flag is a no-op."
>
>"POSIX_FADV_DONTNEED attempts to free cached pages associated with the
>specified region.  This is useful, for example, while streaming large
>files.  A program may periodically request the kernel to free cached
>data that has already been used, so that more useful cached pages  are
>not discarded instead."
>
>So, in summary:
>
>- POSIX_FADV_NOREUSE doesn't do anything on (modern) Linux kernels
>- POSIX_FADV_DONTNEED must be called *after* you are done with a range of
>  data, not before you read it (note that I haven't tested to confirm it :-))
>
>Regards
>
>Antoine.

Hi Antoine,
you're right and thanks a lot for this great piece of information.
The following quick check works like a charm now:

>>> import os
>>> fo = open('myfile', 'rb')
>>> chunk_size = 16184
>>> last_flush = 0   # file offset up to which cached pages were already dropped
>>> pos = 0          # bytes read since the last flush
>>> d = fo.read(chunk_size)
>>> while d:
...     pos += len(d)
...     if pos > 2000000000:
...         print('another 2GB read, flushing')
...         # arguments are (fd, offset, length, advice): a length, not an end offset
...         os.posix_fadvise(fo.fileno(), last_flush, pos,
...                          os.POSIX_FADV_DONTNEED)
...         last_flush += pos
...         pos = 0
...     d = fo.read(chunk_size)
...

With this, page caching for my huge file (30 GB in that case) still occurs,
of course, but it never occupies more than roughly 2 GB of memory. This way it
should interfere less with the cached data of other applications.
I still have to test carefully how much that improves the overall performance
of the system, but for the moment I'm more than happy!
Best wishes,
Wolfgang

P.S.: Maybe these new os module features could use a bit more documentation? 
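
For reference, the read-then-drop pattern above could be wrapped in a small
generator along the following lines. This is only a sketch under a few
assumptions: the function name stream_without_caching and the chunk_size and
window defaults are illustrative choices, not part of the os module, and the
hasattr() guard is there because os.posix_fadvise is not available on every
platform. The advice is only a hint, so the kernel is free to keep pages cached.

```python
import os


def stream_without_caching(path, chunk_size=1 << 20, window=64 << 20):
    """Yield chunks of `path`, advising the kernel to drop pages already read.

    Hypothetical helper: name and default sizes are illustrative only.
    """
    # os.posix_fadvise exists only on some Unix platforms (Python 3.3+).
    can_advise = hasattr(os, "posix_fadvise")
    flushed = 0   # file offset up to which DONTNEED was already issued
    pending = 0   # bytes read since the last DONTNEED call
    with open(path, "rb", buffering=0) as f:
        fd = f.fileno()
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk
            pending += len(chunk)
            if can_advise and pending >= window:
                # Signature is (fd, offset, length, advice); the third
                # argument is a length in bytes, not an end offset.
                os.posix_fadvise(fd, flushed, pending,
                                 os.POSIX_FADV_DONTNEED)
                flushed += pending
                pending = 0
        # Drop whatever is left of the final, partial window.
        if can_advise and pending:
            os.posix_fadvise(fd, flushed, pending, os.POSIX_FADV_DONTNEED)
```

It would be used like any other iterator over the file, for example
`for chunk in stream_without_caching('myfile'): process(chunk)`, where
process stands for whatever consumes the data. The advice is issued only
after a chunk has been handed to the caller, in line with the point that
DONTNEED has to come after the data has been used.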


