From: Neil Cerutti <neilc@norwich.edu>
Newsgroups: comp.lang.python
Subject: Re: how to avoid leading white spaces
Date: 6 Jun 2011 16:08:05 GMT
Organization: Norwich University
Lines: 54
Message-ID: <954cb5F5qbU1@mid.individual.net>
References: <BANLkTikjY3U9Y24s-GOEyi8CNqCFLXuG6g@mail.gmail.com> <mailman.2373.1306948264.9059.python-list@python.org> <9e861b0e-e768-401b-b5ca-190f20830a08@s9g2000yqm.googlegroups.com> <94ph22FrhvU5@mid.individual.net> <bc814b92-82f1-4fca-9282-c22bfafb3cae@j23g2000yqc.googlegroups.com> <4de8eef1$0$29996$c3e8da3$5496439d@news.astraweb.com> <1237a287-10b0-4a2d-ba35-97b5238deda1@n11g2000yqf.googlegroups.com> <94svm4Fe7eU1@mid.individual.net> <65164054-f11d-4f8e-a141-31513e70ca04@h9g2000yqk.googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Trace: individual.net KNr1Ozrqk7jrVe/5mhPiuwxmw+dZbqz4ywPxrVrOMCv64MXMyk
Cancel-Lock: sha1:T1O4d039JeFVgEFE8QUBbEw/1Jw=
User-Agent: slrn/0.9.9p1/mm/ao (Win32)
Path: csiph.com!x330-a1.tempe.blueboxinc.net!usenet.pasdenom.info!news.stben.net!border3.nntp.ams.giganews.com!border1.nntp.ams.giganews.com!nntp.giganews.com!news.addix.net!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
Xref: x330-a1.tempe.blueboxinc.net comp.lang.python:7090

On 2011-06-06, rurpy@yahoo.com <rurpy@yahoo.com> wrote:
> On 06/03/2011 02:49 PM, Neil Cerutti wrote:
>> Can you find an example or invent one? I simply don't remember
>> such problems coming up, but I admit it's possible.
>
> Sure, the response to the OP of this thread.

Here's a recap, along with two candidate solutions: one based on
your recommendation, and one using str functions and slicing.

(I fixed a specification problem in your original regex: one of
the lines of data contains a space after the closing ', which
makes the $ inappropriate.)

data.txt:
//ACCDJ         EXEC DB2UNLDC,DFLID=&DFLID,PARMLIB=&PARMLIB,
//         UNLDSYST=&UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCDJ       '
//ACCT          EXEC DB2UNLDC,DFLID=&DFLID,PARMLIB=&PARMLIB,
//         UNLDSYST=&UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCT        '
//ACCUM         EXEC DB2UNLDC,DFLID=&DFLID,PARMLIB=&PARMLIB,
//         UNLDSYST=&UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCUM       '
//ACCUM1        EXEC DB2UNLDC,DFLID=&DFLID,PARMLIB=&PARMLIB,
//         UNLDSYST=&UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCUM1      ' 
^Z

import re

print("re solution")
with open("data.txt") as f:
    for line in f:
        fixed = re.sub(r"(TABLE='\S+)\s+'", r"\1'", line)
        print(fixed, end='')

print("non-re solution")
with open("data.txt") as f:
    for line in f:
        i = line.find("TABLE='")
        if i != -1:
            # Locate the quotes that delimit the field value.
            begin = line.index("'", i) + 1
            end = line.index("'", begin)
            # Strip only the trailing whitespace inside the quotes.
            field = line[begin:end].rstrip()
            print(line[:begin] + field + line[end:], end='')
        else:
            print(line, end='')

These two solutions print identical output when processing the
sample data. Slight changes in the data would reveal where the
assumptions each solution makes diverge.
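
As a rough sketch of that divergence, here are the same two
approaches wrapped in functions and run on a few invented lines.
The names fix_re and fix_str and the sample strings are made up
for illustration; only the first sample resembles data.txt.

```python
import re

def fix_re(line):
    # Same substitution as the re solution.
    return re.sub(r"(TABLE='\S+)\s+'", r"\1'", line)

def fix_str(line):
    # Same find/index/slice logic as the non-re solution.
    i = line.find("TABLE='")
    if i == -1:
        return line
    begin = line.index("'", i) + 1
    end = line.index("'", begin)
    return line[:begin] + line[begin:end].rstrip() + line[end:]

# Hypothetical inputs, not taken from data.txt.
samples = [
    "TABLE='ACCDJ       '\n",  # shaped like the sample data
    "TABLE='ACC DJ      '\n",  # embedded space in the name
    "TABLE='            '\n",  # nothing but spaces in the field
]

for s in samples:
    a, b = fix_re(s), fix_str(s)
    print(repr(a), repr(b), "same" if a == b else "DIFFERENT")
```

The first line comes out the same either way. For the other two,
the regex finds no match and leaves the line alone (\S+ cannot
cross the embedded space, and it needs at least one non-space
character), while the str version strips the trailing spaces in
both cases.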

I agree with you that this is a very tempting candidate for
re.sub, and it probably would have been my first try as well.

-- 
Neil Cerutti