Groups > comp.lang.python > #19014 > unrolled thread

Problem while doing a cat on a tabbed file with pexpect

Started by: Saqib Ali <saqib.ali.75@gmail.com>
First post: 2012-01-15 09:51 -0800
Last post: 2012-01-16 02:01 +0000
Articles: 10 (5 participants)



Contents

  Problem while doing a cat on a tabbed file with pexpect Saqib Ali <saqib.ali.75@gmail.com> - 2012-01-15 09:51 -0800
    Re: Problem while doing a cat on a tabbed file with pexpect Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-01-15 14:24 -0500
      Re: Problem while doing a cat on a tabbed file with pexpect Saqib Ali <saqib.ali.75@gmail.com> - 2012-01-15 16:11 -0800
        Re: Problem while doing a cat on a tabbed file with pexpect Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-01-15 22:35 -0500
        Re: Problem while doing a cat on a tabbed file with pexpect Michael Torrie <torriem@gmail.com> - 2012-01-15 20:47 -0700
    Re: Problem while doing a cat on a tabbed file with pexpect Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-01-15 23:04 +0000
      Re: Problem while doing a cat on a tabbed file with pexpect Cameron Simpson <cs@zip.com.au> - 2012-01-16 10:40 +1100
        Re: Problem while doing a cat on a tabbed file with pexpect Saqib Ali <saqib.ali.75@gmail.com> - 2012-01-15 16:14 -0800
          Re: Problem while doing a cat on a tabbed file with pexpect Cameron Simpson <cs@zip.com.au> - 2012-01-16 12:45 +1100
    Re: Problem while doing a cat on a tabbed file with pexpect Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-01-16 02:01 +0000

#19014 — Problem while doing a cat on a tabbed file with pexpect

From: Saqib Ali <saqib.ali.75@gmail.com>
Date: 2012-01-15 09:51 -0800
Subject: Problem while doing a cat on a tabbed file with pexpect
Message-ID: <d33374b9-de51-4cfa-b05d-0bb13fdf79f3@f11g2000yql.googlegroups.com>
I am using Solaris 10, python 2.6.2, pexpect 2.4

I create a file called me.txt which contains the letters "A", "B", "C"
on the same line separated by tabs.

My shell prompt is "% "

I then do the following in the python shell:


>>> import pexpect
>>> x = pexpect.spawn("/bin/tcsh")
>>> x.sendline("cat me.txt")
11
>>> x.expect([pexpect.TIMEOUT, "% "])
1
>>> x.before
'cat me.txt\r\r\nA       B       C\r\n'
>>> x.before.split("\t")
['cat me.txt\r\r\nA       B       C\r\n']



Now, clearly the file contains tabs. But when I cat it through expect,
and collect cat's output, those tabs have been converted to spaces.
But I need the tabs!

Can anyone explain this phenomenon or suggest how I can fix it?



#19015

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2012-01-15 14:24 -0500
Message-ID: <mailman.4778.1326655502.27778.python-list@python.org>
In reply to: #19014
On Sun, 15 Jan 2012 09:51:44 -0800 (PST), Saqib Ali
<saqib.ali.75@gmail.com> wrote:


>Now, clearly the file contains tabs. But when I cat it through expect,
>and collect cat's output, those tabs have been converted to spaces.
>But I need the tabs!
>
>Can anyone explain this phenomenon or suggest how I can fix it?

	My question is:

	WHY are you doing this?

	Based upon the problem description, as given, the solution would
seem to be to just open the file IN Python -- whether you read the lines
and use split() by hand, or pass the open file to the csv module for
reading/parsing is up to you.

-=-=-=-=-=-=-
import csv
import os

TESTFILE = "Test.tsv"

#create data file
fout = open(TESTFILE, "w")
for ln in [  "abc",
            "defg",
            "hijA"  ]:
    fout.write("\t".join(list(ln)) + "\n")
fout.close()

#process tab-separated data
fin = open(TESTFILE, "rb")
rdr = csv.reader(fin, dialect="excel-tab")
for rw in rdr:
    print rw

fin.close()
del rdr
os.remove(TESTFILE)
-=-=-=-=-=-=-
['a', 'b', 'c']
['d', 'e', 'f', 'g']
['h', 'i', 'j', 'A']
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/



#19021

From: Saqib Ali <saqib.ali.75@gmail.com>
Date: 2012-01-15 16:11 -0800
Message-ID: <4008c5d1-98c5-4796-9e82-0e8835fbe42a@i25g2000vbt.googlegroups.com>
In reply to: #19015
Very good question. Let me explain why I'm not opening me.txt directly
in python with open.

The example I have posted is simplified for illustrative purposes. In
reality, I'm not doing pexpect.spawn("/bin/tcsh"). I'm doing
pexpect.spawn("ssh myuser@ipaddress"). Since I'm operating on a remote
system, I can't simply open the file in my own python context.





#19027

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2012-01-15 22:35 -0500
Message-ID: <mailman.4784.1326684966.27778.python-list@python.org>
In reply to: #19021
On Sun, 15 Jan 2012 16:11:27 -0800 (PST), Saqib Ali
<saqib.ali.75@gmail.com> wrote:

>
>Very good question. Let me explain why I'm not opening me.txt directly
>in python with open.
>
>The example I have posted is simplified for illustrative purpose. In
>reality, I'm not doing pexpect.spawn("/bin/tcsh"). I'm doing
>pexpect.spawn("ssh myuser@ipaddress"). Since I'm operating on a remote
>system, I can't simply open the file in my own python context.
>
	Ah... Now we are outside of my experience... And into the realms of
how your remote host is handling tab characters when sent to an
(apparent) console (stdout not redirected to a file)... That is, does it
expand \t into spaces on the next multiple of 8 characters, or some
other size, rather than issue the tab character itself. 
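
A minimal sketch of what such expansion would look like, assuming tab stops every 8 columns (an assumption; the remote terminal's actual setting is not known from this thread). It is written in Python 3, uses only the built-in `str.expandtabs`, and the names `TAB_WIDTH`, `file_line` and `expanded` are illustrative.

```python
# Sketch: simulate a terminal that expands each tab to the next 8-column stop.
# TAB_WIDTH = 8 is an assumption, not something confirmed for the remote host.
TAB_WIDTH = 8

file_line = "A\tB\tC"                       # what me.txt is reported to contain
expanded = file_line.expandtabs(TAB_WIDTH)  # what a tab-expanding terminal would emit

print(repr(expanded))
print(expanded.split("\t"))  # a single element: no tabs are left to split on
```

Each letter ends up followed by seven spaces, which appears to match the spacing in the `x.before` value from the original post. That is consistent with expansion at 8-column stops, though it does not prove where the expansion happens.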

	Is it an actual file on the remote end, or the output from some
interactive session? If an actual file, can you find an alternate
command to transfer the file to a local path for file processing (ftp,
scp, ?).
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/



#19028

From: Michael Torrie <torriem@gmail.com>
Date: 2012-01-15 20:47 -0700
Message-ID: <mailman.4785.1326685659.27778.python-list@python.org>
In reply to: #19021
On 01/15/2012 05:11 PM, Saqib Ali wrote:
> 
> Very good question. Let me explain why I'm not opening me.txt directly
> in python with open.
> 
> The example I have posted is simplified for illustrative purpose. In
> reality, I'm not doing pexpect.spawn("/bin/tcsh"). I'm doing
> pexpect.spawn("ssh myuser@ipaddress"). Since I'm operating on a remote
> system, I can't simply open the file in my own python context.

There is a very nice Python module called "paramiko" that you could use
from Python to ssh to the remote system programmatically and cat the
file (bypassing any shells), or to access it over sftp.  Either way you
don't need to use pexpect with it.
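
A rough sketch of the sftp route in Python 3, assuming the third-party paramiko package is installed and that key-based login to the remote host already works. The function names `fetch_remote_bytes` and `split_tab_fields`, and the host, user and path arguments, are hypothetical placeholders and not part of paramiko or of the original question.

```python
import csv
import io


def fetch_remote_bytes(host, user, remote_path):
    """Read a remote file over SFTP and return its raw bytes.

    host, user and remote_path are placeholders supplied by the caller.
    Authentication here relies on existing SSH keys or an agent (assumption).
    """
    import paramiko  # imported lazily so the parsing helper works without it

    client = paramiko.SSHClient()
    client.load_system_host_keys()       # trust hosts already in known_hosts
    client.connect(host, username=user)
    try:
        sftp = client.open_sftp()
        try:
            with sftp.open(remote_path, "rb") as remote_file:
                return remote_file.read()
        finally:
            sftp.close()
    finally:
        client.close()


def split_tab_fields(data):
    """Decode bytes and split each line on tabs using the csv module."""
    text = data.decode("utf-8")
    return list(csv.reader(io.StringIO(text, newline=""), dialect="excel-tab"))
```

Because the bytes travel over the SFTP channel and never pass through a terminal, nothing on the way has a chance to expand the tabs. A call might look like `split_tab_fields(fetch_remote_bytes("ipaddress", "myuser", "me.txt"))`, with the real host and user substituted.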



#19018

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-01-15 23:04 +0000
Message-ID: <4f135b65$0$29987$c3e8da3$5496439d@news.astraweb.com>
In reply to: #19014
On Sun, 15 Jan 2012 09:51:44 -0800, Saqib Ali wrote:

> I am using Solaris 10, python 2.6.2, pexpect 2.4
> 
> I create a file called me.txt which contains the letters "A", "B", "C"
> on the same line separated by tabs.
[...]
> Now, clearly the file contains tabs.

That is not clear at all. How do you know it contains tabs? How was the 
file created in the first place?

Try this:

text = open('me.txt', 'r').read()
print '\t' in text

My guess is that it will print False and that the file does not contain 
tabs. Check your editor used to create the file.



-- 
Steven



#19020

From: Cameron Simpson <cs@zip.com.au>
Date: 2012-01-16 10:40 +1100
Message-ID: <mailman.4781.1326670827.27778.python-list@python.org>
In reply to: #19018
On 15Jan2012 23:04, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
| On Sun, 15 Jan 2012 09:51:44 -0800, Saqib Ali wrote:
| > I am using Solaris 10, python 2.6.2, pexpect 2.4
| > 
| > I create a file called me.txt which contains the letters "A", "B", "C"
| > on the same line separated by tabs.
| [...]
| > Now, clearly the file contains tabs.
| 
| That is not clear at all. How do you know it contains tabs? How was the 
| file created in the first place?
| 
| Try this:
| 
| text = open('me.txt', 'r').read()
| print '\t' in text
| 
| My guess is that it will print False and that the file does not contain 
| tabs. Check your editor used to create the file.

I was going to post an alternative theory but on more thought I think
Steven is right here.

What does:

  od -c me.txt

show you? TABs or multiple spaces?

What does:

  ls -ld me.txt

tell you about the file size? Is it 6 bytes long (three letters, two
TABs, one newline)?

Steven hasn't been explicit about it, but some editors will write spaces when
you type a TAB. I have configured mine to do so - it makes indentation more
reliable for others. If I really need a TAB character I have a special
finger contortion to get one, but the actual need is rare.

So first check that the file really does contain TABs.
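
The same checks can be sketched from Python 3, as a rough equivalent of the `od -c` and `ls -ld` suggestions. The helper name `describe_tabs` is made up for illustration, and the demo runs on a throwaway temporary file and not on the real me.txt.

```python
import os
import tempfile


def describe_tabs(path):
    """Return (size_in_bytes, tab_count, space_count) for the file at path."""
    with open(path, "rb") as handle:   # binary mode: no newline translation
        data = handle.read()
    return os.path.getsize(path), data.count(b"\t"), data.count(b" ")


# Demo on a temporary file holding the content described in the thread.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
    tmp.write(b"A\tB\tC\n")
    demo_path = tmp.name

size, tabs, spaces = describe_tabs(demo_path)
print(size, tabs, spaces)
os.remove(demo_path)
```

A file saved with real TABs should report a size of 6 with two tabs and no spaces; a file where the editor substituted spaces would be larger and report zero tabs.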

Cheers,
-- 
Cameron Simpson <cs@zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

Yes Officer, yes Officer, I will Officer. Thank you.



#19022

From: Saqib Ali <saqib.ali.75@gmail.com>
Date: 2012-01-15 16:14 -0800
Message-ID: <d7eb5c72-f6e5-4776-b91f-79efd00931fa@o13g2000vbf.googlegroups.com>
In reply to: #19020
The file me.txt does indeed contain tabs. I created it with vi.

>>> text = open("me.txt", "r").read()
>>> print "\t" in text
True


% od -c me.txt
0000000   A  \t   B  \t   C  \n
0000006


% ls -al me.txt
-rw-r--r--   1 myUser    myGroup   6 Jan 15 12:42 me.txt






#19024

From: Cameron Simpson <cs@zip.com.au>
Date: 2012-01-16 12:45 +1100
Message-ID: <mailman.4782.1326678340.27778.python-list@python.org>
In reply to: #19022
On 15Jan2012 16:14, Saqib Ali <saqib.ali.75@gmail.com> wrote:
| The file me.txt does indeed contain tabs. I created it with vi.
| 
| >>> text = open("me.txt", "r").read()
| >>> print "\t" in text
| True
| 
| % od -c me.txt
| 0000000   A  \t   B  \t   C  \n
| 0000006
| 
| % ls -al me.txt
| -rw-r--r--   1 myUser    myGroup   6 Jan 15 12:42 me.txt

Ok, your file does indeed contain TABs.

Therefore something is turning the TABs into spaces. Pexpect should be
opening a pty and reading from that, and I do not expect that to expand
TABs. So:

  1: Using subprocess.Popen, invoke "cat me.txt" and check the result
     for TABs.

  2: Using pexpect, run "cat me.txt" instead of "/bin/tcsh" (eliminates a
     layer of complexity; I don't actually expect changed behaviour) and
     check for TABs.
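
Step 1 might look like the following in Python 3. This is only a sketch: it uses `subprocess.run` (a convenience wrapper around `Popen`), works on a temporary file and not on the real me.txt, and the helper name `cat_through_pipe` is invented for illustration. The point is that the child's stdout is a pipe, so no terminal is involved.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def cat_through_pipe(path):
    """Run cat on path with stdout connected to a pipe, not a terminal."""
    if shutil.which("cat"):
        command = ["cat", path]
    else:
        # Fallback for systems without cat: a Python child that copies the file.
        command = [sys.executable, "-c",
                   "import sys; sys.stdout.buffer.write(open(sys.argv[1], 'rb').read())",
                   path]
    return subprocess.run(command, stdout=subprocess.PIPE, check=True).stdout


# Demo on a temporary file holding the content described in the thread.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
    tmp.write(b"A\tB\tC\n")
    demo_path = tmp.name

piped_output = cat_through_pipe(demo_path)
os.remove(demo_path)
print(repr(piped_output))
```

If the TABs survive here but not under pexpect, that points at the pty layer and away from cat itself.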

On your Solaris system, read "man termios". Does it have an "expand
TABs" mode switch? This is about the only thing I can think of that
would produce your result - the pty terminal discipline is expanding
TABs for you (unwanted!) - cat is writing TABs to the terminal and the
terminal is passing expanded spaces to pexpect. Certainly terminal line
disciplines do rewrite stuff, most obviously "\n" into "\r\n", but a
quick glance through termios on a Linux box does not show a tab
expansion mode; I do not have access to a Solaris box at present.
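
For what it is worth, the Linux termios(3) manual does describe such a mode: the TABDLY field of `c_oflag`, where the value TAB3 (also called XTABS) expands tabs to spaces. Whether the Solaris pty in this thread has it enabled is an assumption and is not confirmed here. The Python 3 sketch below opens a local pty pair and toggles that flag on the slave side; it assumes a Unix-like system whose `termios` module exposes `TABDLY` and `TAB3`, and the helper names are illustrative.

```python
import os
import pty
import select
import termios


def read_line_from(fd, timeout=2.0):
    """Collect bytes from fd until a newline arrives or the timeout expires."""
    chunks = b""
    while b"\n" not in chunks:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            break
        chunks += os.read(fd, 1024)
    return chunks


def write_through_pty(data, expand_tabs):
    """Write data to a pty slave and return what appears on the master side."""
    master_fd, slave_fd = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave_fd)
        oflag = attrs[1] | termios.OPOST   # enable output post-processing
        oflag &= ~termios.TABDLY           # clear the tab-delay field (TAB0)
        if expand_tabs:
            oflag |= termios.TAB3          # TAB3: expand tabs to spaces
        attrs[1] = oflag
        termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)
        os.write(slave_fd, data)
        return read_line_from(master_fd)
    finally:
        os.close(slave_fd)
        os.close(master_fd)


expanded = write_through_pty(b"A\tB\tC\n", expand_tabs=True)
preserved = write_through_pty(b"A\tB\tC\n", expand_tabs=False)
print(repr(expanded))
print(repr(preserved))
```

The master side plays the role pexpect has in the original example: it only sees what the line discipline lets through. If the remote terminal turns out to have tab expansion switched on, clearing that output flag on the remote side (for example via `stty`, whose exact option names vary by platform) would be the thing to try.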

Cheers,
-- 
Cameron Simpson <cs@zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

Maintainer's Motto: If we can't fix it, it ain't broke.



#19025

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-01-16 02:01 +0000
Message-ID: <4f1384ff$0$29881$c3e8da3$5496439d@news.astraweb.com>
In reply to: #19014
On Sun, 15 Jan 2012 09:51:44 -0800, Saqib Ali wrote:

> I am using Solaris 10, python 2.6.2, pexpect 2.4

Are you sure about that? As far as I can see, pexpect's current version 
is 2.3 not 2.4.


> I create a file called me.txt which contains the letters "A", "B", "C"
> on the same line separated by tabs.
> 
> My shell prompt is "% "
> 
> I then do the following in the python shell:
> 
> 
>>>> import pexpect
>>>> x = pexpect.spawn("/bin/tcsh")

Can you try another shell, just in case tcsh is converting the tabs to 
spaces?

>>>> x.sendline("cat me.txt")
> 11

What happens if you do this from the shell directly, without pexpect? It 
is unlikely, but perhaps the problem lies with cat rather than pexpect. 
You should eliminate this possibility.


>>>> x.expect([pexpect.TIMEOUT, "% "])
> 1
>>>> x.before
> 'cat me.txt\r\r\nA       B       C\r\n'


Unfortunately I can't replicate the same behaviour, however my setup is 
different. I'm using pexpect 2.3 on Linux, and I tried it using bash and 
sh but not tcsh. In all my tests, the tabs were returned as expected.

(However, the x.expect call returned 0 instead of 1, even with the shell 
prompt set correctly.)



-- 
Steven


