From: Neil Cerutti
Newsgroups: comp.lang.python
Subject: Re: how to avoid leading white spaces
Date: 6 Jun 2011 16:08:05 GMT
Organization: Norwich University
Message-ID: <954cb5F5qbU1@mid.individual.net>

On 2011-06-06, rurpy@yahoo.com wrote:
> On 06/03/2011 02:49 PM, Neil Cerutti wrote:
>> Can you find an example or invent one? I simply don't remember
>> such problems coming up, but I admit it's possible.
>
> Sure, the response to the OP of this thread.

Here's a recap, along with two candidate solutions, one based on
your recommendation, and one using str functions and slicing.
(I fixed a specification problem in your original regex, as one
of the lines of data contained a space after the closing ',
making the $ inappropriate)

data.txt:
//ACCDJ EXEC DB2UNLDC,DFLID=&DFLID,PARMLIB=&PARMLIB,
// UNLDSYST=&UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCDJ '
//ACCT EXEC DB2UNLDC,DFLID=&DFLID,PARMLIB=&PARMLIB,
// UNLDSYST=&UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCT '
//ACCUM EXEC DB2UNLDC,DFLID=&DFLID,PARMLIB=&PARMLIB,
// UNLDSYST=&UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCUM '
//ACCUM1 EXEC DB2UNLDC,DFLID=&DFLID,PARMLIB=&PARMLIB,
// UNLDSYST=&UNLDSYST,DATABAS=MBQV1D0A,TABLE='ACCUM1 '
^Z

import re

print("re solution")
with open("data.txt") as f:
    for line in f:
        # Drop whitespace between the table name and the closing quote.
        fixed = re.sub(r"(TABLE='\S+)\s+'", r"\1'", line)
        print(fixed, end='')

print("non-re solution")
with open("data.txt") as f:
    for line in f:
        i = line.find("TABLE='")
        if i != -1:
            # Locate the quoted field, then strip its trailing whitespace.
            begin = line.index("'", i) + 1
            end = line.index("'", begin)
            field = line[begin:end].rstrip()
            print(line[:i] + line[i:begin] + field + line[end:], end='')
        else:
            print(line, end='')

These two solutions print identical output when processing the
sample data. Slight changes in the data would reveal divergence
in the assumptions each solution made.

I agree with you that this is a very tempting candidate for
re.sub, and it probably would have been my first try as well.

--
Neil Cerutti
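
To make the point about diverging assumptions concrete, here is a
minimal sketch that wraps the two approaches from the post in
functions and runs them on a few inputs. The function names fix_re
and fix_str and the last two sample lines are hypothetical; they
are not part of the original data.txt.

```python
import re

def fix_re(line):
    # Regex approach from the post: remove whitespace before the closing quote.
    return re.sub(r"(TABLE='\S+)\s+'", r"\1'", line)

def fix_str(line):
    # str-method approach from the post, returning the line instead of printing.
    i = line.find("TABLE='")
    if i == -1:
        return line
    begin = line.index("'", i) + 1
    end = line.index("'", begin)  # raises ValueError if the closing quote is missing
    return line[:begin] + line[begin:end].rstrip() + line[end:]

# Hypothetical inputs chosen to probe each solution's assumptions.
samples = [
    "TABLE='ACCT '\n",      # same shape as the sample data
    "TABLE='MY TABLE '\n",  # value contains an embedded space
    "TABLE='ACCT \n",       # closing quote missing
]

for s in samples:
    try:
        str_result = fix_str(s)
    except ValueError:
        str_result = None  # the str version refuses input the regex silently passes
    print(repr(fix_re(s)), repr(str_result))
```

On the first input both functions agree. On the second, the regex
leaves the line untouched because \S+ cannot cross the embedded
space, while the str version still strips the trailing space. On
the third, the regex again passes the line through unchanged and
the str version raises ValueError.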