


py3k buffered IO - flush() required between read/write?

Started by: Genstein <genstein@invalid.invalid>
First post: 2011-05-11 17:27 +0100
Last post: 2011-05-11 23:24 +0000
Articles 8 — 3 participants



Contents

  py3k buffered IO - flush() required between read/write? Genstein <genstein@invalid.invalid> - 2011-05-11 17:27 +0100
    Re: py3k buffered IO - flush() required between read/write? Terry Reedy <tjreedy@udel.edu> - 2011-05-11 14:24 -0400
      Re: py3k buffered IO - flush() required between read/write? Genstein <genstein@invalid.invalid> - 2011-05-11 20:08 +0100
        Re: py3k buffered IO - flush() required between read/write? Terry Reedy <tjreedy@udel.edu> - 2011-05-11 17:38 -0400
          Re: py3k buffered IO - flush() required between read/write? Genstein <genstein@invalid.invalid> - 2011-05-12 14:30 +0100
            Re: py3k buffered IO - flush() required between read/write? Terry Reedy <tjreedy@udel.edu> - 2011-05-12 15:44 -0400
              Re: py3k buffered IO - flush() required between read/write? Genstein <genstein@invalid.invalid> - 2011-05-12 21:38 +0100
        Re: py3k buffered IO - flush() required between read/write? "Martin P. Hellwig" <martin.hellwig@gmail.com> - 2011-05-11 23:24 +0000

#5132 — py3k buffered IO - flush() required between read/write?

From: Genstein <genstein@invalid.invalid>
Date: 2011-05-11 17:27 +0100
Subject: py3k buffered IO - flush() required between read/write?
Message-ID: <iqedg8$k5a$7@dont-email.me>

Hey all,

Apologies if this is a dumb question (self = Python noob), but under 
py3k is it necessary to flush() a file between read/write calls in order 
to see consistent results?

I ask because I have a case under Python 3.2 (r32:88445) where it does 
appear to be, on both Gentoo Linux and Windows Vista.

I've naturally read http://docs.python.org/py3k/library/io.html and 
http://docs.python.org/py3k/tutorial/inputoutput.html#reading-and-writing-files 
but could find no reference to such a requirement.

PEP 3116 suggested this might not be required in py3k and the 
implementation notes in bufferedio.c state "BufferedReader, 
BufferedWriter and BufferedRandom...share a single buffer...this enables 
interleaved reads and writes without flushing." Which seemed conclusive 
but I'm seeing otherwise.

I have a test case, which is sadly rather long: 
http://pastebin.com/xqrzKr5D It's lengthy because it's autogenerated 
from some rather more complex code I'm working on, in order to reproduce 
the issue in isolation.

Any advice and/or flames appreciated.

All the best,

	-eg.



#5148

From: Terry Reedy <tjreedy@udel.edu>
Date: 2011-05-11 14:24 -0400
Message-ID: <mailman.1422.1305138322.9059.python-list@python.org>
In reply to: #5132

On 5/11/2011 12:27 PM, Genstein wrote:

> In py3k is it necessary to flush() a file between read/write calls in order
> to see consistent results?
>
> I ask because I have a case under Python 3.2 (r32:88445) where it does
> appear to be, on both Gentoo Linux and Windows Vista.
>
> I've naturally read http://docs.python.org/py3k/library/io.html and
> http://docs.python.org/py3k/tutorial/inputoutput.html#reading-and-writing-files
> but could find no reference to such a requirement.
>
> PEP 3116 suggested this might not be required in py3k and the
> implementation notes in bufferedio.c state "BufferedReader,
> BufferedWriter and BufferedRandom...share a single buffer...this enables
> interleaved reads and writes without flushing." Which seemed conclusive
> but I'm seeing otherwise.
>
> I have a test case, which is sadly rather long:
> http://pastebin.com/xqrzKr5D It's lengthy because it's autogenerated
> from some rather more complex code I'm working on, in order to reproduce
> the issue in isolation.

I notice that you have required seek calls when switching between 
writing and reading. If you want others to look at this more, you should 
1) produce a minimal* example that demonstrates the questionable 
behavior, and 2) show the comparative outputs that raise your question. 
The code is way too long to cut and paste into an editor and see what it 
does on my Windows machine.

*minimal = local minimum rather than global minimum. That means that 
removal or condensation of a line or lines removes the problem. In this 
case, remove extra seeks, unless doing so removes behavior discrepancy. 
Condense 1 byte writes to multibyte writes, unless ... . Are repeated 
interleavings required or is write, seek, read, seek, write enough?

-- 
Terry Jan Reedy
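
A minimal sketch of the kind of reduced test case described above, assuming a 
plain "w+b" file and the shortest interleaving mentioned (write, seek, read, 
seek, write). The helper name interleaved() and the sample bytes are 
illustrative and are not taken from the original pastebin code. It runs the 
same sequence with and without flush() so the two outputs can be compared 
side by side:

```python
import os
import tempfile


def interleaved(flush_between):
    """Write, seek, read, seek, write on one "w+b" file; return what was read."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w+b") as f:
            f.write(b"0123456789")
            if flush_between:
                f.flush()
            f.seek(2)
            middle = f.read(3)      # read from the middle of what was written
            if flush_between:
                f.flush()
            f.seek(0)
            f.write(b"AB")          # overwrite the first two bytes
            if flush_between:
                f.flush()
            f.seek(0)
            final = f.read()        # read the whole file back
        return middle, final
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("no flush:  ", interleaved(False))
    print("with flush:", interleaved(True))
```

If the two printed tuples differ, that difference is the "comparative output" 
worth posting along with the script.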



#5153

From: Genstein <genstein@invalid.invalid>
Date: 2011-05-11 20:08 +0100
Message-ID: <iqemrg$jp4$1@dont-email.me>
In reply to: #5148

On 11/05/2011 19:24, Terry Reedy wrote:
> writing and reading. If you want others to look at this more, you should
> 1) produce a minimal* example that demonstrates the questionable
> behavior, and 2) show the comparative outputs that raise your question.

Thanks for a quick response. Perhaps I was being unclear - in py3k, 
given the following code and assuming no errors arise:

 > f = open("foo", "w+b")
 > f.write(b'test')
 > f.seek(0)
 > print(f.read(4))

What is the printed result supposed to be?

i) b'test'
ii) never b'test'
iii) platform dependent/undefined/other

All the best,

	-eg.



#5164

From: Terry Reedy <tjreedy@udel.edu>
Date: 2011-05-11 17:38 -0400
Message-ID: <mailman.1431.1305149918.9059.python-list@python.org>
In reply to: #5153

On 5/11/2011 3:08 PM, Genstein wrote:
> On 11/05/2011 19:24, Terry Reedy wrote:
>> writing and reading. If you want others to look at this more, you should
>> 1) produce a minimal* example that demonstrates the questionable
>> behavior, and 2) show the comparative outputs that raise your question.
>
> Thanks for a quick response. Perhaps I was being unclear - in py3k,
> given the following code and assuming no errors arise:
>
>  > f = open("foo", "w+b")
>  > f.write(b'test')
>  > f.seek(0)
>  > print(f.read(4))
>
> What is the printed result supposed to be?
>
> i) b'test'
> ii) never b'test'
> iii) platform dependent/undefined/other

Good clear question. I expect i).

With 3.2 on winxp, that is what I get with StringIO, text file, and 
bytes file (the first two with b's removed). I would expect the same on 
any system. If you get anything different, I would consider it a bug.

-- 
Terry Jan Reedy
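
A rough sketch of the three-way check described above (StringIO, text file, 
bytes file), assuming the same write / seek(0) / read sequence as the 
four-line snippet. The helper names roundtrip() and check_all() are made up 
for illustration; the temporary file stands in for "foo":

```python
import io
import os
import tempfile


def roundtrip(f, payload):
    """Write payload, rewind, and read it back with no explicit flush()."""
    f.write(payload)
    f.seek(0)
    return f.read(len(payload))


def check_all():
    """Run the same round trip on StringIO, a text file, and a bytes file."""
    results = {"StringIO": roundtrip(io.StringIO(), "test")}
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w+") as f:       # text mode, b's removed
            results["text file"] = roundtrip(f, "test")
        with open(path, "w+b") as f:      # binary mode, as in the question
            results["bytes file"] = roundtrip(f, b"test")
    finally:
        os.remove(path)
    return results


if __name__ == "__main__":
    for name, value in check_all().items():
        print(name, repr(value))
```

Answer i) corresponds to each entry coming back equal to what was written.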



#5235

From: Genstein <genstein@invalid.invalid>
Date: 2011-05-12 14:30 +0100
Message-ID: <iqgne3$7kr$1@dont-email.me>
In reply to: #5164

> With 3.2 on winxp, that is what I get with StringIO, text file, and
> bytes file (the first two with b's removed). I would expect the same on
> any system. If you get anything different, I would consider it a bug

Thanks Terry, you're entirely right there; I trimmed down my test case, 
asked for confirmation and have reported it as 
http://bugs.python.org/issue12062. Noted here in case anyone else trips 
over it.



#5253

From: Terry Reedy <tjreedy@udel.edu>
Date: 2011-05-12 15:44 -0400
Message-ID: <mailman.1488.1305229485.9059.python-list@python.org>
In reply to: #5235

On 5/12/2011 9:30 AM, Genstein wrote:
>> With 3.2 on winxp, that is what I get with StringIO, text file, and
>> bytes file (the first two with b's removed). I would expect the same on
>> any system. If you get anything different, I would consider it a bug
>
> Thanks Terry, you're entirely right there; I trimmed down my test case,
> asked for confirmation and have reported it as
> http://bugs.python.org/issue12062. Noted here in case anyone else trips
> over it.

I want people to know that with a simple, minimal, easy to run and 
reproduce and think about test case posted, more info, more test cases, 
and probable fixes were posted within an hour. (Fixes are not always 
that quick, but stripping away irrelevancies really helps speed the 
process.)

-- 
Terry Jan Reedy



#5262

From: Genstein <genstein@invalid.invalid>
Date: 2011-05-12 21:38 +0100
Message-ID: <iqhgi6$dtl$3@dont-email.me>
In reply to: #5253

On 12/05/2011 20:44, Terry Reedy wrote:
> I want people to know that with a simple, minimal, easy to run and
> reproduce and think about test case posted, more info, more test cases,
> and probable fixes were posted within an hour. (Fixes are not always
> that quick, but stripping away irrelevancies really helps speed the
> process.)

A very good point. I'm extremely impressed with the speed and deftness 
with which the bug was handled once raised. Hats off to the people involved.

I should have posted a short test case initially, but I knew it would 
take some time for me to produce and didn't want to go that far if it 
was clear to everyone but me that flushes were required by design :)

Thanks again,
	-eg.



#5170

From: "Martin P. Hellwig" <martin.hellwig@gmail.com>
Date: 2011-05-11 23:24 +0000
Message-ID: <iqf2bd$9n2$1@dont-email.me>
In reply to: #5153

On 11/05/2011 19:08, Genstein wrote:
> On 11/05/2011 19:24, Terry Reedy wrote:
>> writing and reading. If you want others to look at this more, you should
>> 1) produce a minimal* example that demonstrates the questionable
>> behavior, and 2) show the comparative outputs that raise your question.
>
> Thanks for a quick response. Perhaps I was being unclear - in py3k,
> given the following code and assuming no errors arise:
>
>  > f = open("foo", "w+b")
>  > f.write(b'test')
>  > f.seek(0)
>  > print(f.read(4))
>
> What is the printed result supposed to be?
>
> i) b'test'
> ii) never b'test'
> iii) platform dependent/undefined/other
>
> All the best,
>
> -eg.

from:
http://docs.python.org/py3k/library/functions.html#open
"""
open(file, mode='r', buffering=-1, encoding=None, errors=None, 
newline=None, closefd=True)
<cut>
buffering is an optional integer used to set the buffering policy. Pass 
0 to switch buffering off (only allowed in binary mode), 1 to select 
line buffering (only usable in text mode), and an integer > 1 to 
indicate the size of a fixed-size chunk buffer. When no buffering 
argument is given, the default buffering policy works as follows:

     * Binary files are buffered in fixed-size chunks; the size of the 
buffer is chosen using a heuristic trying to determine the underlying 
device’s “block size” and falling back on io.DEFAULT_BUFFER_SIZE. On 
many systems, the buffer will typically be 4096 or 8192 bytes long.
     * “Interactive” text files (files for which isatty() returns True) 
use line buffering. Other text files use the policy described above for 
binary files.
"""

So given that explanation, and assuming I understand it, I go for option 
'iii'.

-- 
mph
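
As a small sketch for inspecting the buffering policy quoted above, the 
following opens a temporary "w+b" file with different buffering arguments and 
reports which io class open() returned along with the result of the 
write / seek(0) / read(4) sequence. The helper name describe() is 
hypothetical; io.DEFAULT_BUFFER_SIZE is the fallback size the documentation 
mentions:

```python
import io
import os
import tempfile


def describe(buffering):
    """Return (class name of the file object, bytes read back) for one policy."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w+b", buffering=buffering) as f:
            f.write(b"test")
            f.seek(0)
            return type(f).__name__, f.read(4)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("io.DEFAULT_BUFFER_SIZE =", io.DEFAULT_BUFFER_SIZE)
    for buffering in (-1, 0, 8192):
        # -1 selects the default policy, 0 disables buffering (binary only),
        # and a value > 1 sets a fixed buffer size.
        print(buffering, describe(buffering))
```

The buffering argument changes which object wraps the file (a raw FileIO for 
0, a BufferedRandom otherwise); whether it should change the bytes read back 
is exactly the question the thread is asking.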


