Groups > comp.lang.python > #27777 > unrolled thread

Built-in open() with buffering > 1

Started by	Marco <marco_u@nsgmail.com>
First post	2012-08-24 06:35 +0200
Last post	2012-08-30 17:39 +0200
Articles	5 — 3 participants


  Built-in open() with buffering > 1 Marco <marco_u@nsgmail.com> - 2012-08-24 06:35 +0200
    Re: Built-in open() with buffering > 1 Marco <marco_u@nsgmail.com> - 2012-08-24 07:21 +0200
      Re: Built-in open() with buffering > 1 Ramchandra Apte <maniandram01@gmail.com> - 2012-08-24 07:32 -0700
    Re: Built-in open() with buffering > 1 Hans Mulder <hansmu@xs4all.nl> - 2012-08-26 10:25 +0200
      Re: Built-in open() with buffering > 1 Marco <marco_u@nsgmail.com> - 2012-08-30 17:39 +0200

#27777 — Built-in open() with buffering > 1

From	Marco <marco_u@nsgmail.com>
Date	2012-08-24 06:35 +0200
Subject	Built-in open() with buffering > 1
Message-ID	<k170ae$vbe$1@speranza.aioe.org>

Can anyone please explain to me the meaning of
"buffering > 1" in the built-in open()?
The doc says: "...and an integer > 1 to indicate the size
of a fixed-size chunk buffer."
So I thought this size was the number of bytes or chars, but
it is not:

 >>> f = open('myfile', 'w', buffering=2)
 >>> f.write('a')
1
 >>> open('myfile').read()
''
 >>> f.write('b')
1
 >>> open('myfile').read()
''
 >>> f.write('cdefghi\n')
8
 >>> open('myfile').read()
''
 >>> f.flush()
 >>> open('myfile').read()
'abcdefghi\n'

Regards,
Marco


#27778

From	Marco <marco_u@nsgmail.com>
Date	2012-08-24 07:21 +0200
Message-ID	<k1730u$466$1@speranza.aioe.org>
In reply to	#27777

On 08/24/2012 06:35 AM, Marco wrote:
> Please, can anyone explain me the meaning of the
> "buffering > 1" in the built-in open()?
> The doc says: "...and an integer > 1 to indicate the size
> of a fixed-size chunk buffer."

Sorry, I get it:

 >>> f = open('myfile', 'w', buffering=2)
 >>> f._CHUNK_SIZE = 5
 >>> for i in range(6):
...     n = f.write(str(i))
...     print(i, open('myfile').read(), sep=':')
...
0:
1:
2:
3:
4:
5:012345


#27800

From	Ramchandra Apte <maniandram01@gmail.com>
Date	2012-08-24 07:32 -0700
Message-ID	<b9e93b2e-dfa6-4861-9f06-67746eb6f346@googlegroups.com>
In reply to	#27778

`f._CHUNK_SIZE = 5` modifies one of Python's internal variables; don't do that.
Search for "buffering" to find out what it is.
The `buffering` argument is how much data Python will keep in memory.
f.read(1) will actually read up to `buffering` bytes from the file into memory, so that later reads can be served from memory.
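
The read-ahead described above can be observed with a small sketch. It assumes CPython's io.BufferedReader, and CountingRaw is a made-up raw stream that only records how many bytes each low-level read asks for; it is an illustration, not how open() is implemented.

```python
import io

class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream that logs the size of every low-level read."""

    def __init__(self, data):
        self._data = data
        self._pos = 0
        self.requests = []          # sizes requested by the buffered layer

    def readable(self):
        return True

    def readinto(self, b):
        # b is the buffer handed down by BufferedReader; its length is
        # the number of bytes the buffered layer wants from us.
        self.requests.append(len(b))
        chunk = self._data[self._pos:self._pos + len(b)]
        b[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)

raw = CountingRaw(b"x" * 100)
reader = io.BufferedReader(raw, buffer_size=16)

first = reader.read(1)    # triggers one low-level read of a whole buffer
second = reader.read(1)   # served from memory, no new low-level read
print(raw.requests)
```

The caller asks for one byte at a time, but the raw stream should see a single request for the full buffer size.
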
On Friday, 24 August 2012 10:51:36 UTC+5:30, Marco wrote:
> On 08/24/2012 06:35 AM, Marco wrote:
> > Please, can anyone explain me the meaning of the
> > "buffering > 1" in the built-in open()?
> > The doc says: "...and an integer > 1 to indicate the size
> > of a fixed-size chunk buffer."
>
> Sorry, I get it:
>
>  >>> f = open('myfile', 'w', buffering=2)
>  >>> f._CHUNK_SIZE = 5
>  >>> for i in range(6):
> ...     n = f.write(str(i))
> ...     print(i, open('myfile').read(), sep=':')
> ...
> 0:
> 1:
> 2:
> 3:
> 4:
> 5:012345


#27910

From	Hans Mulder <hansmu@xs4all.nl>
Date	2012-08-26 10:25 +0200
Message-ID	<5039dd87$0$6851$e4fe514c@news2.news.xs4all.nl>
In reply to	#27777

On 24/08/12 06:35:27, Marco wrote:
> Please, can anyone explain me the meaning of the
> "buffering > 1" in the built-in open()?
> The doc says: "...and an integer > 1 to indicate the size
> of a fixed-size chunk buffer."
> So I thought this size was the number of bytes or chars, but
> it is not

The algorithm is explained at
http://docs.python.org/library/io.html#io.DEFAULT_BUFFER_SIZE

>> io.DEFAULT_BUFFER_SIZE
>>
>> An int containing the default buffer size used by the
>> module’s buffered I/O classes. open() uses the file’s
>> blksize (as obtained by os.stat()) if possible.

In other words: open() tries to find a suitable size by
calling os.stat(your_file).st_blksize and if that fails,
it uses io.DEFAULT_BUFFER_SIZE, which is 8192 on my box.
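
A minimal sketch of that lookup, assuming the rule is simply "use st_blksize when it is available and greater than 1, otherwise io.DEFAULT_BUFFER_SIZE". The helper name guess_default_buffer_size is made up for illustration, and the exact rule may differ between Python versions and platforms.

```python
import io
import os
import tempfile

def guess_default_buffer_size(path):
    """Hypothetical helper: approximate the size chosen for a negative `buffering`."""
    try:
        # st_blksize is not available on every platform (hence AttributeError).
        blksize = os.stat(path).st_blksize
    except (OSError, AttributeError):
        blksize = 0
    # Fall back to the module default when no usable block size is reported.
    return blksize if blksize > 1 else io.DEFAULT_BUFFER_SIZE

with tempfile.NamedTemporaryFile() as tmp:
    size = guess_default_buffer_size(tmp.name)
    print(size)
```
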

Whether you call open with buffering=2 or any larger
number, does not matter: the buffer size will be the
outcome of this algorithm.

Hope this helps,

-- HansM


#28131

From	Marco <marco_u@nsgmail.com>
Date	2012-08-30 17:39 +0200
Message-ID	<k1o1f1$60p$1@speranza.aioe.org>
In reply to	#27910

On 08/26/2012 10:25 AM, Hans Mulder wrote:

> The algorithm is explained at
> http://docs.python.org/library/io.html#io.DEFAULT_BUFFER_SIZE

Thanks ;)

> In other words: open() tries to find a suitable size by
> calling os.stat(your_file).st_blksize and if that fails,
> it uses io.DEFAULT_BUFFER_SIZE, which is 8192 on my box.

Yes, that is right when the `buffering` parameter is a
negative integer.

> Whether you call open with buffering=2 or any larger
> number, does not matter: the buffer size will be the
> outcome of this algorithm.

Mmm, I think that is not right: in this case the buffer
size is not computed; it is the value you pass as the
buffering parameter.
In fact:

 >>> f = open('myfile', 'w', buffering=2)
 >>> f._CHUNK_SIZE = 1
 >>> f.write('ab')
2
 >>> open('myfile').read()
''

Now two bytes are in the buffer and the buffer is full.
If you write another byte, it will not be written to the
buffer, because the bytes in the queue are transferred to
the buffer only when there are more of them than f._CHUNK_SIZE:

 >>> f.write('c')
1
 >>> open('myfile').read()
''

Now, if you write another byte 'd', the chunk 'cd' will
be transferred to the buffer; but because the buffer is full,
its content 'ab' will first be written to disk, and then
'cd' is written to the buffer, which is full again:

 >>> f.write('d')
1
 >>> open('myfile').read()
'ab'

So the buffer really is of size 2.
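
The same conclusion can be checked without touching f._CHUNK_SIZE by opening the file in binary mode, so that only the buffered writer sits between write() and the disk. This is a minimal sketch, assuming CPython's behaviour of flushing the buffer only when new data no longer fits; the temporary path and the on_disk() helper are made up for illustration.

```python
import os
import tempfile

def on_disk(path):
    """Hypothetical helper: return what has actually reached the file so far."""
    with open(path, "rb") as g:
        return g.read()

path = os.path.join(tempfile.mkdtemp(), "myfile")
f = open(path, "wb", buffering=2)

f.write(b"ab")              # exactly fills the 2-byte buffer
after_ab = on_disk(path)    # nothing flushed yet

f.write(b"c")               # does not fit: 'ab' is flushed, 'c' is buffered
after_c = on_disk(path)

f.close()                   # closing flushes whatever is left
after_close = on_disk(path)

print(after_ab, after_c, after_close)
```
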
