Running a command line program and reading the result as it runs

Started by: Ian Simcock <Ian.Simcock@Internode.on.net>
First post: 2013-08-22 15:21 +0930
Last post: 2013-08-23 18:39 +0200
Articles: 17 (9 participants)

Contents

  Running a command line program and reading the result as it runs Ian Simcock <Ian.Simcock@Internode.on.net> - 2013-08-22 15:21 +0930
    Re: Running a command line program and reading the result as it runs Chris Angelico <rosuav@gmail.com> - 2013-08-22 16:22 +1000
      Re: Running a command line program and reading the result as it runs Ian Simcock <Ian.Simcock@Internode.on.net> - 2013-08-23 00:56 +0930
        Re: Running a command line program and reading the result as it runs Chris Angelico <rosuav@gmail.com> - 2013-08-23 01:33 +1000
          Re: Running a command line program and reading the result as it runs Ian Simcock <Ian.Simcock@Internode.on.net> - 2013-08-23 16:22 +0930
          Re: Running a command line program and reading the result as it runs Grant Edwards <invalid@invalid.invalid> - 2013-08-23 14:02 +0000
    Re: Running a command line program and reading the result as it runs Rob Wolfe <rw@smsnet.pl> - 2013-08-22 23:14 +0200
      Re: Running a command line program and reading the result as it runs Ian Simcock <Ian.Simcock@Internode.on.net> - 2013-08-23 16:31 +0930
    Re: Running a command line program and reading the result as it runs Gertjan Klein <gklein@xs4all.nl> - 2013-08-23 11:32 +0200
    Re: Running a command line program and reading the result as it runs Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-08-23 11:53 +0200
    Re: Running a command line program and reading the result as it runs Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-08-23 12:34 +0200
    RE: Running a command line program and reading the result as it runs "Joseph L. Casale" <jcasale@activenetwerx.com> - 2013-08-23 10:50 +0000
    Re: Running a command line program and reading the result as it runs Peter Otten <__peter__@web.de> - 2013-08-23 13:14 +0200
      Re: Running a command line program and reading the result as it runs Gertjan Klein <gklein@xs4all.nl> - 2013-08-23 14:03 +0200
      Re: Running a command line program and reading the result as it runs Ian Simcock <Ian.Simcock@Internode.on.net> - 2013-08-24 19:06 +0930
    Re: Running a command line program and reading the result as it runs random832@fastmail.us - 2013-08-23 12:04 -0400
    Re: Running a command line program and reading the result as it runs Peter Otten <__peter__@web.de> - 2013-08-23 18:39 +0200

#52812 — Running a command line program and reading the result as it runs

From: Ian Simcock <Ian.Simcock@Internode.on.net>
Date: 2013-08-22 15:21 +0930
Subject: Running a command line program and reading the result as it runs
Message-ID: <5215a6cf$0$6512$c3e8da3$5496439d@news.astraweb.com>

Greetings all.

I'm using Python 2.7 under Windows and am trying to run a command line
program and process the program's output as it is running. A number of
web searches have indicated that the following code would work.

import subprocess

p = subprocess.Popen(r"D:\Python\Python27\Scripts\pip.exe list -o",
                      stdout=subprocess.PIPE,
                      stderr=subprocess.STDOUT,
                      bufsize=1,
                      universal_newlines=True,
                      shell=False)
for line in p.stdout:
     print line

When I use this code I can see that the Popen works, any code between 
the Popen and the for will run straight away, but as soon as it gets to 
the for and tries to read p.stdout the code blocks until the command 
line program completes, then all of the lines are returned.

Does anyone know how to get the results of the program without it blocking?

Thanks,
Ian Simcock.


#52814

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-08-22 16:22 +1000
Message-ID: <mailman.119.1377152952.19984.python-list@python.org>
In reply to: #52812

On Thu, Aug 22, 2013 at 3:51 PM, Ian Simcock
<Ian.Simcock@internode.on.net> wrote:
> When I use this code I can see that the Popen works, any code between the
> Popen and the for will run straight away, but as soon as it gets to the for
> and tries to read p.stdout the code blocks until the command line program
> completes, then all of the lines are returned.
>
> Does anyone know how to get the results of the program without it blocking?

Is the program actually producing output progressively? I just tried
your exact code with "dir /ad /s /b" and it worked fine, producing
output while the dir was still spinning (obviously setting shell=True
to make that work, but I don't think that'll make a difference). It
may be that pip buffers its output. Is there a parameter to pip to
make it pipe-compatible?

ChrisA


#52837

From: Ian Simcock <Ian.Simcock@Internode.on.net>
Date: 2013-08-23 00:56 +0930
Message-ID: <52162da8$0$29987$c3e8da3$5496439d@news.astraweb.com>
In reply to: #52814

Chris Angelico wrote:
> Is the program actually producing output progressively? I just tried
> your exact code with "dir /ad /s /b" and it worked fine, producing
> output while the dir was still spinning (obviously setting shell=True
> to make that work, but I don't think that'll make a difference). It
> may be that pip buffers its output. Is there a parameter to pip to
> make it pipe-compatible?
>
> ChrisA
>

If I run pip in the command window I can see its output appearing line
by line rather than in one block.

I tried the code with the dir command but it's too fast for me to be 
sure if it's working or not.

I tried again using the command "ping google.com" instead, since I know
that it outputs slowly and it's something that everyone should have. In
the command window I can see that the output appears over time, but from
python I get nothing for a while and then suddenly get all the output in
one rapid go.

Can you think of anything else I can look at?

Ian Simcock.


#52839

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-08-23 01:33 +1000
Message-ID: <mailman.136.1377185595.19984.python-list@python.org>
In reply to: #52839's parent, #52837

On Fri, Aug 23, 2013 at 1:26 AM, Ian Simcock
<Ian.Simcock@internode.on.net> wrote:
> Chris Angelico wrote:
>>
>> Is the program actually producing output progressively? I just tried
>> your exact code with "dir /ad /s /b" and it worked fine, producing
>> output while the dir was still spinning (obviously setting shell=True
>> to make that work, but I don't think that'll make a difference). It
>> may be that pip buffers its output. Is there a parameter to pip to
>> make it pipe-compatible?
>>
>> ChrisA
>>
>
> If I run pip in the command window I can see it's output appearing line by
> line rather than on one block.
>
> I tried the code with the dir command but it's too fast for me to be sure if
> it's working or not.
>
> I tried again using the command "ping google.com" instead since I know that
> output's slowly and it something that everyone should have. In the command
> window I can see that the output appears over time, but from python I get
> nothing for a while and then suddenly get all the output in one rapid go.
>
>
> Can you think of anything else I can look at?

A lot of programs, when their output is not going to the console, will
buffer output. It's more efficient for many purposes. With Unix
utilities, there's often a parameter like --pipe or --unbuffered that
says "please produce output line by line", but Windows ping doesn't
have that - and so I'm seeing the same thing you are.

You should be able to see the time delay in dir by looking for some
particular directory name, and searching from the root directory.
Unless you're on a BLAZINGLY fast drive, that'll take Windows a good
while!

ChrisA
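
The buffering decision described above usually hinges on whether the child's stdout looks like a terminal. A minimal sketch of that check, written in Python 3 syntax rather than the thread's 2.7, and using a small Python child in place of pip or ping (the names `CHILD` and `child_sees_tty_through_pipe` are illustrative only):

```python
import subprocess
import sys

# Child program: report whether its own stdout is attached to a terminal.
CHILD = "import sys; print(sys.stdout.isatty())"


def child_sees_tty_through_pipe():
    """Run the child with stdout connected to a pipe; return what it printed."""
    out = subprocess.check_output([sys.executable, "-c", CHILD])
    return out.decode().strip()


if __name__ == "__main__":
    print("child thinks stdout is a tty:", child_sees_tty_through_pipe())
```

A program that makes this kind of check and finds no terminal is free to switch to block buffering, which is the behaviour seen with ping here.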


#52866

From: Ian Simcock <Ian.Simcock@Internode.on.net>
Date: 2013-08-23 16:22 +0930
Message-ID: <52170699$0$29998$c3e8da3$5496439d@news.astraweb.com>
In reply to: #52839

Chris Angelico wrote:
> On Fri, Aug 23, 2013 at 1:26 AM, Ian Simcock
> <Ian.Simcock@internode.on.net> wrote:
>> Chris Angelico wrote:
>>>
>
> A lot of programs, when their output is not going to the console, will
> buffer output. It's more efficient for many purposes. With Unix
> utilities, there's often a parameter like --pipe or --unbuffered that
> says "please produce output line by line", but Windows ping doesn't
> have that - and so I'm seeing the same thing you are.
>
> You should be able to see the time delay in dir by looking for some
> particular directory name, and searching from the root directory.
> Unless you're on a BLAZINGLY fast drive, that'll take Windows a good
> while!
>
> ChrisA

I tried it again with the dir command and, while my drive is pretty 
fast, it does look like it works.

I've done some looking around and found that the standard C libraries
apparently automatically buffer output when the output is being
redirected to a file handle unless specifically told not to.

I did a further test and created a unique file name in the root of my D
drive and then used dir to search the entire drive for that name. In the
command window the name appears instantly and then after a slight pause 
the command prompt reappears. When run from python however the pause 
comes first and then the name appears and then the command prompt returns.

So yep, seems like I'm screwed :-)

Thanks for your help with this. At least now I know it's not that I'm 
doing something wrong.

Ian Simcock.


#52891

From: Grant Edwards <invalid@invalid.invalid>
Date: 2013-08-23 14:02 +0000
Message-ID: <kv7q0q$cqj$1@reader1.panix.com>
In reply to: #52839

On 2013-08-22, Chris Angelico <rosuav@gmail.com> wrote:
> On Fri, Aug 23, 2013 at 1:26 AM, Ian Simcock
><Ian.Simcock@internode.on.net> wrote:
>> Chris Angelico wrote:
>>>
>>> Is the program actually producing output progressively? I just tried
>>> your exact code with "dir /ad /s /b" and it worked fine, producing
>>> output while the dir was still spinning (obviously setting shell=True
>>> to make that work, but I don't think that'll make a difference). It
>>> may be that pip buffers its output. Is there a parameter to pip to
>>> make it pipe-compatible?
>>>
>>> ChrisA
>>>
>>
>> If I run pip in the command window I can see it's output appearing line by
>> line rather than on one block.
>>
>> I tried the code with the dir command but it's too fast for me to be sure if
>> it's working or not.
>>
>> I tried again using the command "ping google.com" instead since I know that
>> output's slowly and it something that everyone should have. In the command
>> window I can see that the output appears over time, but from python I get
>> nothing for a while and then suddenly get all the output in one rapid go.
>>
>>
>> Can you think of anything else I can look at?
>
> A lot of programs, when their output is not going to the console,
> will buffer output. It's more efficient for many purposes. With Unix
> utilities, there's often a parameter like --pipe or --unbuffered that
> says "please produce output line by line", but Windows ping doesn't
> have that - and so I'm seeing the same thing you are.

Another way this problem can be avoided on Unix is to connect the
slave end of a pty (instead of a pipe) to the command's stdout/stderr
and then read the command's output from the master end of the pty.
[On Unix, the buffering decision is based on whether stdout is a tty
device, not on whether it's the console.]

Dunno whether Windows has ptys or not.  They're a very simple, elegant
solution to a number of problems, so I'm guessing not. ;)

-- 
Grant Edwards               grant.b.edwards        Yow! ... the MYSTERIANS are
                                  at               in here with my CORDUROY
                              gmail.com            SOAP DISH!!
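
The pty approach described above can be sketched roughly as follows. This is Unix-only, uses Python 3 syntax, and runs a small Python child instead of pip; `CHILD` and `read_pty_output` are hypothetical names, and the handling of the end of the stream is an assumption about typical Linux behaviour rather than something tested in the thread.

```python
import os
import pty
import subprocess
import sys

# Child: report whether stdout looks like a terminal, then print a line.
CHILD = "import sys; print(sys.stdout.isatty()); print('done')"


def read_pty_output(argv):
    """Run argv with stdout/stderr on a pty slave; yield output as it arrives."""
    master, slave = pty.openpty()
    proc = subprocess.Popen(argv, stdout=slave, stderr=slave, close_fds=True)
    os.close(slave)  # the parent keeps only the master end
    try:
        while True:
            try:
                data = os.read(master, 1024)
            except OSError:
                break  # Linux reports EIO once the slave side is closed
            if not data:
                break  # other systems may signal EOF with an empty read
            # Decoding per chunk is fine for ASCII output such as this.
            yield data.decode()
    finally:
        os.close(master)
        proc.wait()


if __name__ == "__main__":
    for chunk in read_pty_output([sys.executable, "-c", CHILD]):
        sys.stdout.write(chunk)
```

Because the child sees a terminal, it keeps its usual interactive buffering, and each `os.read` returns as soon as some output is available.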


#52845

From: Rob Wolfe <rw@smsnet.pl>
Date: 2013-08-22 23:14 +0200
Message-ID: <87y57to2bx.fsf@smsnet.pl>
In reply to: #52812

Ian Simcock <Ian.Simcock@Internode.on.net> writes:

> Greetings all.
>
> I'm using Python 2.7 under Windows and am trying to run a command line
> program and process the programs output as it is running. A number of
> web searches have indicated that the following code would work.
>
> import subprocess
>
> p = subprocess.Popen("D:\Python\Python27\Scripts\pip.exe list -o",
>                      stdout=subprocess.PIPE,
>                      stderr=subprocess.STDOUT,
>                      bufsize=1,
>                      universal_newlines=True,
>                      shell=False)
> for line in p.stdout:
>     print line
>
> When I use this code I can see that the Popen works, any code between
> the Popen and the for will run straight away, but as soon as it gets
> to the for and tries to read p.stdout the code blocks until the
> command line program completes, then all of the lines are returned.
>
> Does anyone know how to get the results of the program without it blocking?

When a file object is used in a for loop it works like an iterator,
and the iterator uses a hidden read-ahead buffer.
That might cause this kind of blocking.
You can read more details here (description of method ``next``):
http://docs.python.org/lib/bltin-file-objects.html

So basically a non-blocking loop might look like this:

while True:
    line = p.stdout.readline()
    if not line: break
    print line

HTH,
Rob


#52867

From: Ian Simcock <Ian.Simcock@Internode.on.net>
Date: 2013-08-23 16:31 +0930
Message-ID: <521708d5$0$29998$c3e8da3$5496439d@news.astraweb.com>
In reply to: #52845

Rob Wolfe wrote:
> Ian Simcock <Ian.Simcock@Internode.on.net> writes:
>
> When file object is used in a for loop it works like an iterator
> and then it uses a hidden read-ahead buffer.
> It might cause this kind of blocking.
> You can read more details here (description of method ``next``):
> http://docs.python.org/lib/bltin-file-objects.html
>
> So basically non-blocking loop might look like this:
>
> while True:
>      line = p.stdout.readline()
>      if not line: break
>      print line
>
> HTH,
> Rob
>

Thanks, but some further research seems to indicate that the problem is
that the standard C libraries are probably buffering the output when
it's being redirected, so the problem is coming from the command line
tool rather than the python code.

Ian Simcock.


#52874

From: Gertjan Klein <gklein@xs4all.nl>
Date: 2013-08-23 11:32 +0200
Message-ID: <52172c10$0$15879$e4fe514c@news2.news.xs4all.nl>
In reply to: #52812

Ian Simcock wrote:

> When I use this code I can see that the Popen works, any code between
> the Popen and the for will run straight away, but as soon as it gets to
> the for and tries to read p.stdout the code blocks until the command
> line program completes, then all of the lines are returned.
>
> Does anyone know how to get the results of the program without it blocking?

I have tried your code with "ping google.com" as command and got the 
same results; apparently something buffers the output. The result is 
different when using Python 3.3: there, the lines are printed as they 
come in. This seems to indicate a bug in the Python 2.7 implementation.

There are some bug reports on bugs.python.org that may be related; see 
for example:

http://bugs.python.org/issue15532

I have been playing around a bit with the suggested approach of using 
the io library directly. I managed to get unbuffered output, but 
unfortunately the program hangs when the subprocess is done. It can't 
even be terminated with Control-C, I have to use task manager to kill 
python.exe.

Below is as far as I got; perhaps someone with more experience with 
pipes knows how to fix this.

Regards,
Gertjan.


#!/usr/bin/env python2.7
# coding: CP1252

from __future__ import print_function
import subprocess
import io, os

def main():
     i, o = os.pipe()
     piperead = io.open(i, 'rb', buffering=1)

     p = subprocess.Popen(["ping", "google.com"],
                           stdout=o,
                           stderr=subprocess.PIPE,
                           bufsize=0,
                           shell=False)

     for line in piperead:
         print(line)

if __name__ == '__main__':
     main()
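
One likely cause of the hang described above (an assumption, not something confirmed in the thread) is that the parent still holds the write end of the pipe, so the reading loop never sees end-of-file even after the child exits. A minimal sketch in Python 3 syntax that closes the parent's copy right after starting the child; it runs a small Python child instead of ping, and `CHILD` and `read_via_os_pipe` are illustrative names:

```python
import io
import os
import subprocess
import sys

# Stand-in for the real command: a child that prints two lines and exits.
CHILD = "print('one'); print('two')"


def read_via_os_pipe():
    """Read a child's output through an explicit os.pipe() without hanging."""
    read_fd, write_fd = os.pipe()
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=write_fd)
    # Close the parent's copy of the write end. While it stays open, the
    # reader can never see end-of-file, even after the child has exited.
    os.close(write_fd)
    lines = []
    with io.open(read_fd, "rb") as pipe_read:
        for line in pipe_read:
            lines.append(line.rstrip())
    proc.wait()
    return lines


if __name__ == "__main__":
    for line in read_via_os_pipe():
        print(line)
```

The sketch also leaves stderr alone; a `stderr=subprocess.PIPE` that is never read can block the child once that pipe fills up.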


#52875

From: Antoon Pardon <antoon.pardon@rece.vub.ac.be>
Date: 2013-08-23 11:53 +0200
Message-ID: <mailman.158.1377251601.19984.python-list@python.org>
In reply to: #52812

Op 22-08-13 07:51, Ian Simcock schreef:
> Greetings all.
> 
> I'm using Python 2.7 under Windows and am trying to run a command line
> program and process the programs output as it is running. A number of
> web searches have indicated that the following code would work.
> 
> import subprocess
> 
> p = subprocess.Popen("D:\Python\Python27\Scripts\pip.exe list -o",
>                      stdout=subprocess.PIPE,
>                      stderr=subprocess.STDOUT,
>                      bufsize=1,
>                      universal_newlines=True,
>                      shell=False)
> for line in p.stdout:
>     print line
> 
> When I use this code I can see that the Popen works, any code between
> the Popen and the for will run straight away, but as soon as it gets to
> the for and tries to read p.stdout the code blocks until the command
> line program completes, then all of the lines are returned.
> 
> Does anyone know how to get the results of the program without it blocking?

Maybe the following can work?

Untested code:

from pty import openpty
from subprocess import Popen

master, slave = openpty()

p = Popen("D:\Python\Python27\Scripts\pip.exe list -o",
          stdout = slave,
          stderr = slave,
          stdin = slave,
          close_fds = True)

for line in master:
    print line


The idea is to set up a pseudo terminal for pip so that the
system thinks pip is doing IO with a terminal and so the
IO will be line buffered. But all IO from pip will be available
through the master in your program.

-- 
Antoon Pardon


#52876

From: Antoon Pardon <antoon.pardon@rece.vub.ac.be>
Date: 2013-08-23 12:34 +0200
Message-ID: <mailman.159.1377254100.19984.python-list@python.org>
In reply to: #52812

Op 23-08-13 11:53, Antoon Pardon schreef:
> Op 22-08-13 07:51, Ian Simcock schreef:
>> Greetings all.
>>
>> I'm using Python 2.7 under Windows and am trying to run a command line
>> program and process the programs output as it is running. A number of
>> web searches have indicated that the following code would work.
>>
>> import subprocess
>>
>> p = subprocess.Popen("D:\Python\Python27\Scripts\pip.exe list -o",
>>                      stdout=subprocess.PIPE,
>>                      stderr=subprocess.STDOUT,
>>                      bufsize=1,
>>                      universal_newlines=True,
>>                      shell=False)
>> for line in p.stdout:
>>     print line
>>
>> When I use this code I can see that the Popen works, any code between
>> the Popen and the for will run straight away, but as soon as it gets to
>> the for and tries to read p.stdout the code blocks until the command
>> line program completes, then all of the lines are returned.
>>
>> Does anyone know how to get the results of the program without it blocking?
> 
> Maybe the following can work?

Never mind. I had overlooked that using pty requires linux and you are
using windows.

-- 
Antoon Pardon


#52877

From: "Joseph L. Casale" <jcasale@activenetwerx.com>
Date: 2013-08-23 10:50 +0000
Message-ID: <mailman.160.1377255090.19984.python-list@python.org>
In reply to: #52812

> >> I'm using Python 2.7 under Windows and am trying to run a command line
> >> program and process the programs output as it is running. A number of
> >> web searches have indicated that the following code would work.
> >>
> >> import subprocess
> >>
> >> p = subprocess.Popen("D:\Python\Python27\Scripts\pip.exe list -o",
> >>                      stdout=subprocess.PIPE,
> >>                      stderr=subprocess.STDOUT,
> >>                      bufsize=1,
> >>                      universal_newlines=True,
> >>                      shell=False)
> >> for line in p.stdout:
> >>     print line
> >>
> >> When I use this code I can see that the Popen works, any code between
> >> the Popen and the for will run straight away, but as soon as it gets to
> >> the for and tries to read p.stdout the code blocks until the command
> >> line program completes, then all of the lines are returned.
> >>
> >> Does anyone know how to get the results of the program without it
> >> blocking?

Try this:

p = subprocess.Popen(args, stdout=subprocess.PIPE)
for line in p.stdout:
    print(line)
p.wait()

jlc


#52878

From: Peter Otten <__peter__@web.de>
Date: 2013-08-23 13:14 +0200
Message-ID: <mailman.161.1377256485.19984.python-list@python.org>
In reply to: #52812

Ian Simcock wrote:

> Greetings all.
> 
> I'm using Python 2.7 under Windows and am trying to run a command line
> program and process the programs output as it is running. A number of
> web searches have indicated that the following code would work.
> 
> import subprocess
> 
> p = subprocess.Popen("D:\Python\Python27\Scripts\pip.exe list -o",
>                       stdout=subprocess.PIPE,
>                       stderr=subprocess.STDOUT,
>                       bufsize=1,
>                       universal_newlines=True,
>                       shell=False)
> for line in p.stdout:
>      print line
> 
> When I use this code I can see that the Popen works, any code between
> the Popen and the for will run straight away, but as soon as it gets to
> the for and tries to read p.stdout the code blocks until the command
> line program completes, then all of the lines are returned.
> 
> Does anyone know how to get the results of the program without it
> blocking?

The following works on my linux system:

import subprocess

p = subprocess.Popen(
    ["ping", "google.com"],
    stdout=subprocess.PIPE)

instream = iter(p.stdout.readline, "")
        
for line in instream:
    print line.rstrip()

I don't have Windows available to test, but if it works there, too, the 
problem is the internal buffer used by Python's implementation of file 
iteration rather than the OS.
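
For readers unfamiliar with the two-argument form of `iter` used above, here is a small self-contained sketch in Python 3 syntax. It uses a Python child instead of ping so it needs no network; `CHILD` and `lines_as_they_arrive` are illustrative names. In Python 3 a pipe yields bytes, so the sentinel is `b""` rather than `""`.

```python
import subprocess
import sys

# Stand-in for the real command: a child that prints two lines.
CHILD = "print('first'); print('second')"


def lines_as_they_arrive(argv):
    """Yield the child's output one line at a time via readline()."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    # iter(callable, sentinel) calls readline() repeatedly and stops when
    # it returns the sentinel; a pipe returns b"" only at end-of-file.
    for raw in iter(proc.stdout.readline, b""):
        yield raw.decode().rstrip()
    proc.stdout.close()
    proc.wait()


if __name__ == "__main__":
    for line in lines_as_they_arrive([sys.executable, "-c", CHILD]):
        print(line)
```

Each `readline()` call returns as soon as one complete line is available, which sidesteps the read-ahead buffer that Python 2's file iteration uses.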


#52881

From: Gertjan Klein <gklein@xs4all.nl>
Date: 2013-08-23 14:03 +0200
Message-ID: <52174f9f$0$15982$e4fe514c@news2.news.xs4all.nl>
In reply to: #52878

Peter Otten wrote:

> The following works on my linux system:
>
> import subprocess
>
> p = subprocess.Popen(
>      ["ping", "google.com"],
>      stdout=subprocess.PIPE)
>
> instream = iter(p.stdout.readline, "")
>
> for line in instream:
>      print line.rstrip()
>
> I don't have Windows available to test, but if it works there, too, the
> problem is the internal buffer used by Python's implementation of file
> iteration rather than the OS.

Excellent, that works on Windows as well. That conclusively proves that
the buffering problem is in Python, not in the command that is executed.
(Although that may happen, too, for some commands.)

Regards,
Gertjan.


#52932

From: Ian Simcock <Ian.Simcock@Internode.on.net>
Date: 2013-08-24 19:06 +0930
Message-ID: <52187e99$0$29995$c3e8da3$5496439d@news.astraweb.com>
In reply to: #52878

Peter Otten wrote:
> Ian Simcock wrote:
>
>> Greetings all.
>>
>> I'm using Python 2.7 under Windows and am trying to run a command line
>> program and process the programs output as it is running. A number of
>> web searches have indicated that the following code would work.
>>
>> import subprocess
>>
>> p = subprocess.Popen("D:\Python\Python27\Scripts\pip.exe list -o",
>>                        stdout=subprocess.PIPE,
>>                        stderr=subprocess.STDOUT,
>>                        bufsize=1,
>>                        universal_newlines=True,
>>                        shell=False)
>> for line in p.stdout:
>>       print line
>>
>> When I use this code I can see that the Popen works, any code between
>> the Popen and the for will run straight away, but as soon as it gets to
>> the for and tries to read p.stdout the code blocks until the command
>> line program completes, then all of the lines are returned.
>>
>> Does anyone know how to get the results of the program without it
>> blocking?
>
> The following works on my linux system:
>
> import subprocess
>
> p = subprocess.Popen(
>      ["ping", "google.com"],
>      stdout=subprocess.PIPE)
>
> instream = iter(p.stdout.readline, "")
>
> for line in instream:
>      print line.rstrip()
>
> I don't have Windows available to test, but if it works there, too, the
> problem is the internal buffer used by Python's implementation of file
> iteration rather than the OS.
>

Hmm... and so it comes full circle.

I thought that the inclusion of the iter call looked familiar so I 
checked my original code and found that it was there. I removed it when 
shrinking the code down to a minimal example for posting. Then, after 
removing it, which triggered the blocking, I changed the command to ping 
so that it's easier for anyone to test.

I've tried a copy of my original code using the ping command and it 
works fine.

So it looks like the python package manager pip must see that it's not 
going to a console and buffer the output, and my original code was not 
the problem.

So I can't do what I want, but it's interesting to know that for some 
reason the iter is required for the occasions when it can work.

Thanks to everyone who helped with this.

Ian Simcock
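
When the child is itself a Python program, as pip is, the interpreter's `-u` option (or the `PYTHONUNBUFFERED` environment variable) asks it not to buffer stdout. Whether that helps with pip in this particular setup was not tried in the thread, so treat the following as a sketch under that assumption. It is in Python 3 syntax, and `CHILD` and `timed_lines` are illustrative names.

```python
import subprocess
import sys
import time

# Child: print a line, pause, print another. When piped without -u, both
# lines would normally sit in the child's buffer until it exits.
CHILD = "import time; print('tick'); time.sleep(1.0); print('tock')"


def timed_lines(extra_args):
    """Run the child and return (line, seconds_since_start) pairs."""
    start = time.time()
    proc = subprocess.Popen(
        [sys.executable] + extra_args + ["-c", CHILD],
        stdout=subprocess.PIPE)
    result = []
    for raw in iter(proc.stdout.readline, b""):
        result.append((raw.decode().strip(), time.time() - start))
    proc.stdout.close()
    proc.wait()
    return result


if __name__ == "__main__":
    for label, args in (("buffered", []), ("unbuffered (-u)", ["-u"])):
        print(label)
        for line, seconds in timed_lines(args):
            print("  %-5s arrived after %.2fs" % (line, seconds))
```

With `-u` the first line should arrive about a second before the second one; without it, both typically arrive together when the child exits.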


#52899

From: random832@fastmail.us
Date: 2013-08-23 12:04 -0400
Message-ID: <mailman.169.1377273859.19984.python-list@python.org>
In reply to: #52812

On Fri, Aug 23, 2013, at 7:14, Peter Otten wrote:
> The following works on my linux system:
> 
> instream = iter(p.stdout.readline, "")
>         
> for line in instream:
>     print line.rstrip()
> 
> I don't have Windows available to test, but if it works there, too, the 
> problem is the internal buffer used by Python's implementation of file 
> iteration rather than the OS.

I can confirm this on Windows.

Doesn't this surprising difference between for line in
iter(f.readline,'') vs for line in f violate TOOWTDI? We're led to
believe from the documentation that iterating over a file does _not_
read lines into memory before returning them. It's not clear to me what
performance benefit can be gained from waiting when there is no more
data available, either.

I don't understand how it's even happening - from looking at the code,
it looks like next() just calls readline() once, no fancy buffering
specific to itself.


#52903

From: Peter Otten <__peter__@web.de>
Date: 2013-08-23 18:39 +0200
Message-ID: <mailman.173.1377275985.19984.python-list@python.org>
In reply to: #52812

random832@fastmail.us wrote:

> On Fri, Aug 23, 2013, at 7:14, Peter Otten wrote:
>> The following works on my linux system:
>> 
>> instream = iter(p.stdout.readline, "")
>>         
>> for line in instream:
>>     print line.rstrip()
>> 
>> I don't have Windows available to test, but if it works there, too, the
>> problem is the internal buffer used by Python's implementation of file
>> iteration rather than the OS.
> 
> I can confirm this on Windows.
> 
> Doesn't this surprising difference between for line in
> iter(f.readline,'') vs for line in f violate TOOWTDI? We're led to
> believe from the documentation that iterating over a file does _not_
> read lines into memory before returning them. It's not clear to me what
> performance benefit can be gained from waiting when there is no more
> data available, either.
> 
> I don't understand how it's even happening - from looking at the code,
> it looks like next() just calls readline() once, no fancy buffering
> specific to itself.

Maybe you are looking in the wrong version?

For 2.x you can use the file_iternext() function as a starting point, see:

http://hg.python.org/cpython/file/1ea833ecaf5a/Objects/fileobject.c#l2316

Python 3 uses a different approach that allows you to mix iteration and 
readline():

$ python -c 'f = open("tmp.txt"); next(f); f.readline()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ValueError: Mixing iteration and read methods would lose data
$ python3 -c 'f = open("tmp.txt"); next(f); f.readline()'

The relevant code is likely in the Modules/_io/ directory. There is also

[New I/O] http://www.python.org/dev/peps/pep-3116/
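
The Python 3 behaviour shown in the shell session above can also be checked from a script. A minimal sketch, Python 3 only, that writes its own temporary file in place of tmp.txt (the function name is illustrative):

```python
import os
import tempfile


def mix_iteration_and_readline():
    """Call next() and then readline() on the same text file object."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w") as f:
            f.write("alpha\nbeta\ngamma\n")
        with open(path) as f:
            first = next(f)         # iteration protocol
            second = f.readline()   # explicit read method, no error raised
        return first, second
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(mix_iteration_and_readline())
```

Under Python 2.7 the same `next(f)` followed by `f.readline()` raises the ValueError quoted in the session above, because the iterator keeps its own read-ahead buffer.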
