Re: ignoring or replacing white lines in a diff

From "Adriaan Renting" <renting@astron.nl>
Newsgroups comp.lang.python
Subject Re: ignoring or replacing white lines in a diff
Date 2016-01-15 10:44 +0100
Message-ID <mailman.3.1452851060.15297.python-list@python.org>
References <mailman.159.1452786947.13488.python-list@python.org> <5697f08f$0$23822$e4fe514c@news.xs4all.nl> <5698118F0200001B0003B68D@gwsmtp1.astron.nl> <n792jc$j7e$1@ger.gmane.org>


Thanks to the various people who provided help.

Peter Otten provided me with a working solution:

I had to split "-I '^[[:space:]]*$'" into two separate list arguments,
"-I" and the pattern itself (without the inner quotes).

      cmd   = ["diff", "-w", "-I", r"^[[:space:]]*$",
               "./xml/%s.xml" % name, "test.xml"]
      p     = subprocess.Popen(cmd, stdin=open('/dev/null'),
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)
      logs  = p.communicate()
      diffs = logs[0].splitlines() #stdout
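
For anyone finding this later, here is a small sketch of why the split matters. Without shell=True nothing strips the quotes or separates the words, so each list element reaches diff as exactly one argument. The file names are placeholders, and the snippet is Python 3 syntax rather than the Python 2 used above:

```python
import shlex

# The command as typed at the shell prompt (file names are placeholders).
command_line = "diff -w -I '^[[:space:]]*$' old.xml new.xml"

# shlex.split() tokenizes roughly the way a POSIX shell does: the single
# quotes are consumed, and -I and its pattern end up as separate arguments.
argv = shlex.split(command_line)
print(argv)

# The original attempt: one list element that still holds the space and quotes.
single_argument = "-I '^[[:space:]]*$'"
print(single_argument in argv)

# The pattern contains no backslashes, so the r prefix changes nothing.
print(r"^[[:space:]]*$" == "^[[:space:]]*$")
```

With the single-element form, diff most likely sees a pattern that starts with a space and a literal quote character, which no blank line can match.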

This also works:

      cmd   = ["diff -w -I '^[[:space:]]*$' ./xml/%s.xml test.xml" % name]
      p     = subprocess.Popen(cmd, stdin=open('/dev/null'),
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                               shell=True)
      logs  = p.communicate()
      diffs = logs[0].splitlines() #stdout
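
One caveat with the shell=True form: the shell also interprets whatever ends up in name. A possible way to guard against spaces or shell metacharacters is shlex.quote (Python 3.3+; Python 2 has pipes.quote). The helper name build_diff_command below is made up for illustration:

```python
import shlex


def build_diff_command(name):
    """Return a shell command string for the diff call (hypothetical helper)."""
    reference = "./xml/%s.xml" % name
    # Only the interpolated path is quoted; the pattern keeps its own quotes.
    return "diff -w -I '^[[:space:]]*$' %s test.xml" % shlex.quote(reference)


print(build_diff_command("Ticket_6923"))
print(build_diff_command("my ticket"))
```

The resulting string would be passed to subprocess.Popen(..., shell=True) as before. The list form without a shell avoids the quoting question entirely.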

As to other comments:

- I've found that stdin=open('/dev/null') is essential in
  subprocess.Popen to make it work from automated (headless) scripts.
- "print line," did remove the extra newlines, but didn't get rid of the
  blank lines.
- Making it a raw string with r"-I '^[[:space:]]*$'" made no difference
  (also tried r"-I ^[[:space:]]*$").
- I didn't investigate difflib further but will keep it in mind for the
  future.
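
On the difflib point, a rough sketch of what a pure-Python comparison might look like (Python 3). It is only an approximation of "diff -w -B": it drops whitespace-only lines and collapses runs of whitespace before comparing, whereas -w ignores whitespace completely. The helper name and the sample XML are invented for the example:

```python
import difflib


def significant_lines(text):
    """Drop blank lines and collapse whitespace runs (hypothetical helper)."""
    lines = []
    for line in text.splitlines():
        collapsed = " ".join(line.split())
        if collapsed:
            lines.append(collapsed)
    return lines


# Made-up sample documents: different indentation, extra blank lines,
# and one real change in the version number.
old = "<a>\n  <version>2.6.0</version>\n</a>\n"
new = "<a>\n\n<version>2.15.0</version>\n\n</a>\n"

diffs = list(difflib.unified_diff(significant_lines(old),
                                  significant_lines(new), lineterm=""))
for line in diffs:
    print(line)

# Inputs that differ only in blank lines and indentation give an empty diff.
same = list(difflib.unified_diff(significant_lines("<a>\n\n</a>\n"),
                                 significant_lines("  <a>\n</a>\n"), lineterm=""))
print(len(same))
```

This avoids depending on the diff version installed on the machine, at the cost of not being byte-for-byte the same comparison.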

Thank you for your help,

Adriaan.


Adriaan Renting        | Email: renting@astron.nl
Software Engineer Radio Observatory
ASTRON                 | Phone: +31 521 595 100 (797 direct)
P.O. Box 2             | GSM:   +31 6 24 25 17 28
NL-7990 AA Dwingeloo   | FAX:   +31 521 595 101
The Netherlands        | Web: http://www.astron.nl/~renting/



>>> On 14-1-2016 at 22:05, Peter Otten <__peter__@web.de> wrote: 
> Adriaan Renting wrote:
> 
>> 
>> Maybe someone here has a clue what is going wrong here? Any help is
>> appreciated.
>> 
>> I'm writing a regression test for a module that generates XML.
>> 
>> I'm using diff to compare the results with a pregenerated one from an
>> earlier version.
>> 
>> I'm running into two problems:
>> 
>> The diff doesn't seem to behave properly with the -B option. (diff (GNU
>> diffutils) 2.8.1 on OSX 10.9)
>> 
>> Replacing -B with -I '^[[:space:]]*$' fixes it on the command line,
>> which should be exactly the same according to:
>> 
>> http://www.gnu.org/software/diffutils/manual/html_node/Blank-Lines.html#Blank-Lines
>> 
>> (for Python problem continue below)
>> 
>> MacRenting 21:00-159> diff -w -B test.xml xml/Ticket_6923.xml
>> 3,5c3,5
>> <   <version>2.15.0</version>
>> <   <template version="2.15.0" author="Alwin de Jong,Adriaan Renting" changedBy="Adriaan Renting">
>> <   <description>XML Template generator version 2.15.0</description>
>> ---
>>>           <version>2.6.0</version>
>>>           <template version="2.6.0" author="Alwin de Jong" changedBy="Alwin de Jong">
>>>           <description>XML Template generator version 2.6.0</description>
>> 113d112
>> <
>> 163d161
>> <
>> 213d210
>> <
>> 258d254
>> <
>> 369d364
>> <
>> 419d413
>> <
>> 469d462
>> <
>> 514d506
>> <
>> 625d616
>> <
>> 675d665
>> <
>> 725d714
>> <
>> 770d758
>> <
>> 881d868
>> <
>> 931d917
>> <
>> 981d966
>> <
>> 1026d1010
>> <
>> 1137d1120
>> <
>> 1187d1169
>> <
>> 1237d1218
>> <
>> 1282d1262
>> <
>> 
>> /Users/renting/src/CEP4-DevelopClusterModel-Story-Task8432-SAS/XML_generator/test
>> MacRenting 21:00-160> diff -w -I '^[[:space:]]*$' test.xml xml/Ticket_6923.xml
>> 3,5c3,5
>> <   <version>2.15.0</version>
>> <   <template version="2.15.0" author="Alwin de Jong,Adriaan Renting" changedBy="Adriaan Renting">
>> <   <description>XML Template generator version 2.15.0</description>
>> ---
>>>           <version>2.6.0</version>
>>>           <template version="2.6.0" author="Alwin de Jong" changedBy="Alwin de Jong">
>>>           <description>XML Template generator version 2.6.0</description>
>> 
>> 
>> Now I try to use this in Python:
>> 
>>       cmd   = ["diff", "-w", "-I '^[[:space:]]*$'",
>>                "./xml/%s.xml" % name, "test.xml"]
> 
> Instead of 
> 
> ..., "-I '^[[:space:]]*$'", ...
> 
> try two separate arguments
> 
> ..., "-I", "^[[:space:]]*$", ...
> 
>>       ## -w ignores differences in whitespace
>>       ## -I '^[[:space:]]*$' because -B doesn't work for blank lines
>>       ## (on OSX?)
>>       p     = subprocess.Popen(cmd, stdin=open('/dev/null'),
>>                                stdout=subprocess.PIPE, stderr=subprocess.PIPE)
> 
> I don't think you need to specify stdin.
> 
>>       logs  = p.communicate()
>>       diffs = logs[0].splitlines() #stdout
>>       print "diff reply was %i lines long" % len(diffs)
>> 
>> This doesn't work. I've tried escaping the various bits, like the * and
>> $, even though with single quotes that should not be needed.
>> 
>> I tried first removing the blank lines from the file:
>> 
>>       import fileinput
>>       for line in fileinput.FileInput("test.xml",inplace=1):
>>         if line.rstrip():
>>           print line
>> 
>> This makes it worse, as it adds an empty line for each line in the
>> file.
> 
> Add a trailing comma to suppress the newline:
> 
> print line,
> 
>> I've tried various other options. The only thing I can think of, is
>> ditching Python and trying to rewrite the whole script in Bash.
>> (It's quite complicated, as it loops over various things and does some
>> pretty output in between and I'm not very fluent in Bash)
>> 
>> Any suggestions?
> 
> Whatever floats your boat ;)
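
For reference, the in-place blank-line removal quoted above could be sketched in Python 3 as follows, where print(line, end="") takes the place of the Python 2 "print line," form. The helper name strip_blank_lines and the demo file contents are made up for illustration:

```python
import fileinput
import os
import tempfile


def strip_blank_lines(path):
    """Rewrite the file at `path` in place, dropping whitespace-only lines."""
    with fileinput.input(path, inplace=True) as stream:
        for line in stream:
            if line.strip():
                # `line` already ends in a newline, so suppress print's own.
                print(line, end="")


# Demo on a throwaway file (contents are invented for the example).
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
    handle.write("<a>\n\n  <b/>\n   \n</a>\n")
    demo_path = handle.name

strip_blank_lines(demo_path)
with open(demo_path) as handle:
    cleaned = handle.read()
os.remove(demo_path)
print(cleaned)
```

While fileinput runs with inplace=True, standard output is redirected into the file, so any other print call inside the loop would end up in the XML as well.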
