| Started by | Luca Cerone <luca.cerone@gmail.com> |
|---|---|
| First post | 2013-05-24 07:04 -0700 |
| Last post | 2013-05-31 11:52 +0200 |
| Articles | 13 — 7 participants |
Piping processes works with 'shell = True' but not otherwise. Luca Cerone <luca.cerone@gmail.com> - 2013-05-24 07:04 -0700
Re: Piping processes works with 'shell = True' but not otherwise. Luca Cerone <luca.cerone@gmail.com> - 2013-05-26 03:31 -0700
Re: Piping processes works with 'shell = True' but not otherwise. Chris Rebert <clp2@rebertia.com> - 2013-05-26 14:05 -0700
Re: Piping processes works with 'shell = True' but not otherwise. Luca Cerone <luca.cerone@gmail.com> - 2013-05-26 16:58 -0700
RE: Piping processes works with 'shell = True' but not otherwise. Carlos Nepomuceno <carlosnepomuceno@outlook.com> - 2013-05-27 03:14 +0300
Re: Piping processes works with 'shell = True' but not otherwise. Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> - 2013-05-29 19:39 +0200
RE: Piping processes works with 'shell = True' but not otherwise. Carlos Nepomuceno <carlosnepomuceno@outlook.com> - 2013-05-29 22:31 +0300
Re: Piping processes works with 'shell = True' but not otherwise. Cameron Simpson <cs@zip.com.au> - 2013-05-30 08:18 +1000
Re: Piping processes works with 'shell = True' but not otherwise. Chris Angelico <rosuav@gmail.com> - 2013-05-27 18:28 +1000
Re: Piping processes works with 'shell = True' but not otherwise. Luca Cerone <luca.cerone@gmail.com> - 2013-05-27 04:33 -0700
Re: Piping processes works with 'shell = True' but not otherwise. Chris Rebert <clp2@rebertia.com> - 2013-05-29 10:17 -0700
Re: Piping processes works with 'shell = True' but not otherwise. Luca Cerone <luca.cerone@gmail.com> - 2013-05-31 02:28 -0700
Re: Piping processes works with 'shell = True' but not otherwise. Peter Otten <__peter__@web.de> - 2013-05-31 11:52 +0200
| From | Luca Cerone <luca.cerone@gmail.com> |
|---|---|
| Date | 2013-05-24 07:04 -0700 |
| Subject | Piping processes works with 'shell = True' but not otherwise. |
| Message-ID | <062f557e-8e1a-4efb-9178-7d685b47a834@googlegroups.com> |
Hi everybody,
I am new to the group (and relatively new to Python)
so I am sorry if this issue has already been discussed (I searched the group's topics but couldn't find a solution to my problem).
I am using Python 2.7.3 to analyse the output of two third-party programs that can be launched in a Linux shell as:
program1 | program2
To do this I have written a function that pipes program1 into program2 (using subprocess.Popen) and returns the stdout of the second subprocess, and a function that parses the output:
A basic example:

    from subprocess import Popen, STDOUT, PIPE

    def run():
        p1 = Popen(['program1'], stdout = PIPE, stderr = STDOUT)
        p2 = Popen(['program2'], stdin = p1.stdout, stdout = PIPE, stderr = STDOUT)
        p1.stdout.close()
        return p2.stdout

    def parse(out):
        for row in out:
            print row
            #do something else with each line
        out.close()
        return parsed_output

    # main block here
    pout = run()
    parsed = parse(pout)
    #--- END OF PROGRAM ----#

I want to parse the output of 'program1 | program2' line by line because the output is very large.
When I run the code above, an error occasionally occurs (IOERROR: [Errno 0]). However, this error doesn't occur if I write the run() function as:

    def run():
        p = Popen('program1 | program2', shell = True, stderr = STDOUT, stdout = PIPE)
        return p.stdout

I really can't understand why the first version causes errors while the second one doesn't.
Can you please help me understand the difference between the two cases?
Thanks a lot in advance for the help,
Cheers, Luca
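
A minimal sketch of the shell-free pipeline described in this post, written for Python 3 rather than the 2.7.3 used in the thread. The two commands are hypothetical stand-ins (small Python one-liners) because the real programs were never disclosed, and the names `run_pipeline` and `parse` are illustrative only. It keeps both process handles so the exit statuses can be checked after the output has been read line by line.

```python
import subprocess
import sys

# Hypothetical stand-ins for program1 and program2 so the sketch runs anywhere.
PRODUCER = [sys.executable, "-c", "for i in range(5): print('row', i)"]
FILTER = [sys.executable, "-c",
          "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())"]


def run_pipeline(cmd1, cmd2):
    """Start cmd1 | cmd2 without a shell and return both process handles."""
    p1 = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(cmd2, stdin=p1.stdout, stdout=subprocess.PIPE)
    # Close the parent's copy so cmd1 can get SIGPIPE if cmd2 exits early.
    p1.stdout.close()
    return p1, p2


def parse(p1, p2):
    """Consume the pipeline output line by line, then reap both children."""
    rows = []
    with p2.stdout:
        for raw in p2.stdout:
            rows.append(raw.decode().rstrip("\r\n"))
    # Waiting on both processes exposes their exit statuses.
    return rows, p1.wait(), p2.wait()


p1, p2 = run_pipeline(PRODUCER, FILTER)
rows, rc1, rc2 = parse(p1, p2)
print(rows, rc1, rc2)
```

Returning the `Popen` objects instead of only `p2.stdout` is a design choice of this sketch: it lets the caller wait on both children and inspect their return codes, which can help when an intermittent error needs to be tracked down.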
| From | Luca Cerone <luca.cerone@gmail.com> |
|---|---|
| Date | 2013-05-26 03:31 -0700 |
| Message-ID | <53d92a77-14ed-4999-90ac-e1c00f9889d5@googlegroups.com> |
| In reply to | #45894 |
> Can you please help me understanding what's the difference between the two cases?

Hi guys, do any of you have ideas about what is causing my issue?
| From | Chris Rebert <clp2@rebertia.com> |
|---|---|
| Date | 2013-05-26 14:05 -0700 |
| Message-ID | <mailman.2202.1369602328.3114.python-list@python.org> |
| In reply to | #45894 |
On May 24, 2013 7:06 AM, "Luca Cerone" <luca.cerone@gmail.com> wrote:
>
> Hi everybody,
> I am new to the group (and relatively new to Python)
> so I am sorry if this issues has been discussed (although searching for topics in the group I couldn't find a solution to my problem).
>
> I am using Python 2.7.3 to analyse the output of two 3rd parties programs that can be launched in a linux shell as:
>
> program1 | program2
>
> To do this I have written a function that pipes program1 and program2 (using subprocess.Popen) and the stdout of the subprocess, and a function that parses the output:
>
> A basic example:
>
> from subprocess import Popen, STDOUT, PIPE
> def run():
>     p1 = Popen(['program1'], stdout = PIPE, stderr = STDOUT)
>     p2 = Popen(['program2'], stdin = p1.stdout, stdout = PIPE, stderr = STDOUT)

Could you provide the *actual* commands you're using, rather than the generic "program1" and "program2" placeholders? It's *very* common for people to get the tokenization of a command line wrong (see the Note box in http://docs.python.org/2/library/subprocess.html#subprocess.Popen for some relevant advice).

>     p1.stdout.close()
>     return p2.stdout
>
>
> def parse(out):
>     for row in out:
>         print row
>         #do something else with each line
>     out.close()
>     return parsed_output
>
>
> # main block here
>
> pout = run()
>
> parsed = parse(pout)
>
> #--- END OF PROGRAM ----#
>
> I want to parse the output of 'program1 | program2' line by line because the output is very large.
>
> When running the code above, occasionally some error occurs (IOERROR: [Errno 0]).

Could you provide the full & complete error message and exception traceback?

> However this error doesn't occur if I code the run() function as:
>
> def run():
>     p = Popen('program1 | program2', shell = True, stderr = STDOUT, stdout = PIPE)
>     return p.stdout
>
> I really can't understand why the first version causes errors, while the second one doesn't.
>
> Can you please help me understanding what's the difference between the two cases?

One obvious difference between the 2 approaches is that the shell doesn't
redirect the stderr streams of the programs, whereas you /are/ redirecting
the stderrs to stdout in the non-shell version of your code. But this is
unlikely to be causing the error you're currently seeing.
You may also want to provide /dev/null as p1's stdin, out of an abundance
of caution.
Lastly, you may want to consider using a wrapper library such as
http://plumbum.readthedocs.org/en/latest/ , which makes it easier to do
pipelining and other such "fancy" things with subprocesses, while still
avoiding the many perils of the shell.
Cheers,
Chris
--
Be patient; it's Memorial Day weekend.
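
The stderr difference mentioned in this reply can be seen in a small experiment. The sketch below is Python 3 and uses hypothetical stand-in commands (`NOISY` and `UPPER` are Python one-liners invented for illustration). When the first process is started with `stderr=STDOUT`, its error text travels through the pipe and becomes input for the second process, whereas a shell `a | b` connects only the first program's stdout to the second. This shows a behavioural difference only; it is not claimed to be the cause of the Errno 0 in the thread.

```python
import subprocess
import sys

# Hypothetical stand-ins: the first writes one line to stdout and one to
# stderr, the second upper-cases whatever arrives on its stdin.
NOISY = [sys.executable, "-c",
         "import sys\nsys.stdout.write('data\\n')\nsys.stderr.write('warning\\n')"]
UPPER = [sys.executable, "-c",
         "import sys\nsys.stdout.write(sys.stdin.read().upper())"]


def pipeline_output(merge_stderr):
    """Run NOISY | UPPER and return the set of words UPPER printed."""
    # With stderr=STDOUT the first program's stderr shares its stdout pipe,
    # so the second program reads it as input. Otherwise it is discarded here.
    first_stderr = subprocess.STDOUT if merge_stderr else subprocess.DEVNULL
    p1 = subprocess.Popen(NOISY, stdout=subprocess.PIPE, stderr=first_stderr)
    p2 = subprocess.Popen(UPPER, stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()
    out, _ = p2.communicate()
    p1.wait()
    return set(out.decode().split())


merged = pipeline_output(merge_stderr=True)
separate = pipeline_output(merge_stderr=False)
print(sorted(merged), sorted(separate))
```

The results are compared as sets because the relative order of the stdout and stderr lines depends on buffering in the first process.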
| From | Luca Cerone <luca.cerone@gmail.com> |
|---|---|
| Date | 2013-05-26 16:58 -0700 |
| Message-ID | <08aa32b7-0fb7-4665-83fa-5cc7ec36f898@googlegroups.com> |
| In reply to | #46122 |
> Could you provide the *actual* commands you're using, rather than the generic "program1" and "program2" placeholders? It's *very* common for people to get the tokenization of a command line wrong (see the Note box in http://docs.python.org/2/library/subprocess.html#subprocess.Popen for some relevant advice).

Hi Chris, first of all thanks for the help. Unfortunately I can't provide the actual commands because they are tools that are not publicly available.
I think I get the tokenization right, though.. the problem is not that the programs don't run.. it is just that sometimes I get that error..

Just to be clear I run the process like:

    p = subprocess.Popen(['program1','--opt1','val1',...'--optn','valn'], ... the rest)

which I think is the right way to pass arguments (it works fine for other commands)..

> Could you provide the full & complete error message and exception traceback?

Yes, as soon as I get to my work laptop..

> One obvious difference between the 2 approaches is that the shell doesn't redirect the stderr streams of the programs, whereas you /are/ redirecting the stderrs to stdout in the non-shell version of your code. But this is unlikely to be causing the error you're currently seeing.
>
> You may also want to provide /dev/null as p1's stdin, out of an abundance of caution.

I tried to redirect the output to /dev/null using the Popen argument:
'stdin = os.path.devnull' (having imported os of course)..
But this seemed to cause even more trouble...

> Lastly, you may want to consider using a wrapper library such as http://plumbum.readthedocs.org/en/latest/ , which makes it easier to do pipelining and other such "fancy" things with subprocesses, while still avoiding the many perils of the shell.

Thanks, I didn't know this library, I'll give it a try.
Though I forgot to mention that I was using the subprocess module because I want the code to be portable (even though for now it is OK if it only works on Unix platforms).

Thanks a lot for your help,
Cheers,
Luca
| From | Carlos Nepomuceno <carlosnepomuceno@outlook.com> |
|---|---|
| Date | 2013-05-27 03:14 +0300 |
| Message-ID | <mailman.2222.1369613672.3114.python-list@python.org> |
| In reply to | #46145 |
Pipes usually consume disk storage at '/tmp'. Are you sure you have enough room on that filesystem? Make sure no other processes are competing for that space.

Just my 50c because I don't know what's causing Errno 0. I don't even know what the possible causes of such an error are.

Good luck!

----------------------------------------
> Date: Sun, 26 May 2013 16:58:57 -0700
> Subject: Re: Piping processes works with 'shell = True' but not otherwise.
> From: luca.cerone@gmail.com
> To: python-list@python.org
[...]
> I tried to redirect the output to /dev/null using the Popen argument:
> 'stdin = os.path.devnull' (having imported os of course)..
> But this seemed to cause even more troubles...
>
>> Lastly, you may want to consider using a wrapper library such as http://plumbum.readthedocs.org/en/latest/ , which makes it easier to do pipelining and other such "fancy" things with subprocesses, while still avoiding the many perils of the shell.
>
> Thanks, I didn't know this library, I'll give it a try.
> Though I forgot to mention that I was using the subprocess module, because I want the code to be portable (even though for now if it works in Unix platform is OK).
>
> Thanks a lot for your help,
> Cheers,
> Luca
| From | Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> |
|---|---|
| Date | 2013-05-29 19:39 +0200 |
| Message-ID | <ko5egs$knk$1@r01.glglgl.de> |
| In reply to | #46146 |
On 27.05.2013 02:14, Carlos Nepomuceno wrote:
> pipes usually consumes disk storage at '/tmp'.

Good that my pipes don't know about that.

Why should that happen?

Thomas
| From | Carlos Nepomuceno <carlosnepomuceno@outlook.com> |
|---|---|
| Date | 2013-05-29 22:31 +0300 |
| Message-ID | <mailman.2378.1369855873.3114.python-list@python.org> |
| In reply to | #46396 |
----------------------------------------
> From: nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de
> Subject: Re: Piping processes works with 'shell = True' but not otherwise.
> Date: Wed, 29 May 2013 19:39:40 +0200
> To: python-list@python.org
>
> On 27.05.2013 02:14, Carlos Nepomuceno wrote:
>> pipes usually consumes disk storage at '/tmp'.
>
> Good that my pipes don't know about that.
>
> Why should that happen?
>
> Thomas

Oops! My mistake! We've been using 'tee' when in debugging mode and I thought that would apply to this case. Never mind!
| From | Cameron Simpson <cs@zip.com.au> |
|---|---|
| Date | 2013-05-30 08:18 +1000 |
| Message-ID | <mailman.2386.1369867287.3114.python-list@python.org> |
| In reply to | #46396 |
On 29May2013 19:39, Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> wrote:
> On 27.05.2013 02:14, Carlos Nepomuceno wrote:
>> pipes usually consumes disk storage at '/tmp'.
>
> Good that my pipes don't know about that.
> Why should that happen?

It probably doesn't on anything modern. On V7 UNIX at least there was a kernel notion of the "pipe fs", where pipe storage existed; usually /tmp; using small real (but unnamed) files is an easy way to implement them, especially on systems where RAM is very small and without a paging VM - for example, V7 UNIX ran on PDP-11s amongst other things. And files need a filesystem.

But even then pipes are still small fixed length buffers; they don't grow without bound as you might have inferred from the quoted statement.

Cheers,
--
Cameron Simpson <cs@zip.com.au>

ERROR 155 - You can't do that. - Data General S200 Fortran error code list
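
The point that a pipe is a bounded buffer can be checked directly. The following is a rough sketch for Python 3 on a Unix-like system, where `os.set_blocking` is available; the helper name `pipe_capacity` is made up for illustration. It fills an unread pipe with non-blocking writes and counts the bytes accepted before the kernel refuses more. The number reported depends on the operating system and its configuration, so no particular value is assumed.

```python
import os


def pipe_capacity(chunk=1024):
    """Count how many bytes fit in an unread pipe before a write would block."""
    read_fd, write_fd = os.pipe()
    # Make a full pipe raise BlockingIOError instead of hanging the writer.
    os.set_blocking(write_fd, False)
    total = 0
    try:
        while True:
            total += os.write(write_fd, b"x" * chunk)
    except BlockingIOError:
        pass  # the kernel buffer is full
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return total


capacity = pipe_capacity()
print("pipe accepted", capacity, "bytes before filling up")
```

This is also why a producer in a pipeline stalls once the reader stops consuming: its writes block as soon as the buffer is full.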
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-05-27 18:28 +1000 |
| Message-ID | <mailman.2244.1369643321.3114.python-list@python.org> |
| In reply to | #46145 |
On Mon, May 27, 2013 at 9:58 AM, Luca Cerone <luca.cerone@gmail.com> wrote:
>> Could you provide the *actual* commands you're using, rather than the generic "program1" and "program2" placeholders? It's *very* common for people to get the tokenization of a command line wrong (see the Note box in http://docs.python.org/2/library/subprocess.html#subprocess.Popen for some relevant advice).
>>
> Hi Chris, first of all thanks for the help. Unfortunately I can't provide the actual commands because are tools that are not publicly available.
> I think I get the tokenization right, though.. the problem is not that the programs don't run.. it is just that sometimes I get that error..

Will it violate privacy / NDA to post the command line? Even if we can't actually replicate your system, we may be able to see something from the commands given.

ChrisA
| From | Luca Cerone <luca.cerone@gmail.com> |
|---|---|
| Date | 2013-05-27 04:33 -0700 |
| Message-ID | <4fd43d0e-9c72-41d0-8c8d-dbc0d0a50022@googlegroups.com> |
| In reply to | #46181 |
> Will it violate privacy / NDA to post the command line? Even if we can't actually replicate your system, we may be able to see something from the commands given.

Unfortunately yes..
| From | Chris Rebert <clp2@rebertia.com> |
|---|---|
| Date | 2013-05-29 10:17 -0700 |
| Message-ID | <mailman.2366.1369847860.3114.python-list@python.org> |
| In reply to | #46145 |
On Sun, May 26, 2013 at 4:58 PM, Luca Cerone <luca.cerone@gmail.com> wrote:
<snip>
> Hi Chris, first of all thanks for the help. Unfortunately I can't provide the actual commands because are tools that are not publicly available.
> I think I get the tokenization right, though.. the problem is not that the programs don't run.. it is just that sometimes I get that error..
>
> Just to be clear I run the process like:
>
> p = subprocess.Popen(['program1','--opt1','val1',...'--optn','valn'], ... the rest)
>
> which I think is the right way to pass arguments (it works fine for other commands)..
<snip>
>> You may also want to provide /dev/null as p1's stdin, out of an abundance of caution.
>
> I tried to redirect the output to /dev/null using the Popen argument:
> 'stdin = os.path.devnull' (having imported os of course)..
> But this seemed to cause even more troubles...

That's because stdin/stdout/stderr take file descriptors or file objects, not path strings.

Cheers,
Chris
| From | Luca Cerone <luca.cerone@gmail.com> |
|---|---|
| Date | 2013-05-31 02:28 -0700 |
| Message-ID | <8c89fb10-63eb-446c-855b-f4c5406976ea@googlegroups.com> |
| In reply to | #46395 |
> That's because stdin/stdout/stderr take file descriptors or file objects, not path strings.

Thanks Chris, how do I set the file descriptor to /dev/null then?
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2013-05-31 11:52 +0200 |
| Message-ID | <mailman.2480.1369993961.3114.python-list@python.org> |
| In reply to | #46585 |
Luca Cerone wrote:
>> That's because stdin/stdout/stderr take file descriptors or file
>> objects, not path strings.
>
> Thanks Chris, how do I set the file descriptor to /dev/null then?

For example:

    with open(os.devnull, "wb") as stderr:
        p = subprocess.Popen(..., stderr=stderr)
        ...

In Python 3.3 and above:

    p = subprocess.Popen(..., stderr=subprocess.DEVNULL)
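
The same idea applied to stdin, which is what the earlier advice in the thread was about, might look like the sketch below (Python 3). The child command is a hypothetical stand-in that reports how many bytes it could read from its standard input. Since the null device is being read from here, it is opened in `"rb"` mode rather than `"wb"`.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for program1: report how many bytes were on stdin.
CHILD = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]

# os.devnull is only a path string ('/dev/null' on POSIX), so open it first
# and hand the resulting file object to Popen.
with open(os.devnull, "rb") as devnull:
    p = subprocess.Popen(CHILD, stdin=devnull, stdout=subprocess.PIPE)
    out, _ = p.communicate()

bytes_read = int(out.decode().strip())
print(bytes_read)
```

On Python 3.3 and above, passing `stdin=subprocess.DEVNULL` has the same effect without opening the file by hand.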