Date: Sun, 26 Jan 2014 23:28:37 +0000
From: MRAB
To: python-list@python.org
Subject: Re: Unwanted Spaces and Iterative Loop
Newsgroups: comp.lang.python

On 2014-01-26 21:46, matt.s.marotta@gmail.com wrote:
> I have been working on a python script that separates mailing addresses into different components.
>
> Here is my code:
>
> inFile = "directory"
> outFile = "directory"
> inHandler = open(inFile, 'r')
> outHandler = open(outFile, 'w')

Shouldn't you be writing a '\n' at the end of the line?

> outHandler.write("FarmID\tAddress\tStreetNum\tStreetName\tSufType\tDir\tCity\tProvince\tPostalCode")
> for line in inHandler:

This is being done on every single line of the file:

>     str = line.replace("FarmID\tAddress", " ")
>     outHandler.write(str[0:-1])
>
>     str = str.replace(" ","\t", 1)
>     str = str.replace(" Rd,","\tRd\t\t")
>     str = str.replace(" Rd","\tRd\t")
>     str = str.replace("Ave,","\tAve\t\t")
>     str = str.replace("Ave ","\tAve\t\t")
>     str = str.replace("St ","\tSt\t\t")
>     str = str.replace("St,","\tSt\t\t")
>     str = str.replace("Dr,","\tDr\t\t")
>     str = str.replace("Lane,","\tLane\t\t")
>     str = str.replace("Pky,","\tPky\t\t")
>     str = str.replace(" Sq,","\tSq\t\t")
>     str = str.replace(" Pl,","\tPl\t\t")
>
>     str = str.replace("\tE,","E\t")
>     str = str.replace("\tN,","N\t")
>     str = str.replace("\tS,","S\t")
>     str = str.replace("\tW,","W\t")
>     str = str.replace(",","\t")
>     str = str.replace(" ON","ON\t")
>
>
>     outHandler.write(str)
> inHandler.close()
>
> The text file that this manipulates has 91 addresses, so I'll just paste 5 of them in here to get the idea:
>
> FarmID Address
> 1 1067 Niagara Stone Rd, Niagara-On-The-Lake, ON L0S 1J0
> 2 4260 Mountainview Rd, Lincoln, ON L0R 1B2
> 3 25 Hunter Rd, Grimsby, ON L3M 4A3
> 4 1091 Hutchinson Rd, Haldimand, ON N0A 1K0
>
> My issue is that in the output file, there is a space before each city and each postal code that I do not want there.
>

You could try splitting on '\t', stripping the leading and trailing whitespace on each part, and then joining them together again with '\t'. (Make sure that you also write the '\n' at the end of line.)

> Furthermore, the FarmID is being added on to the end of the postal code under the original address column for each address. This also is not supposed to be happening, and I am having trouble designing an iterative loop to remove/prevent that from happening.
>
> Any help is greatly appreciated!
>

As Mark said, you could also use the CSV module.
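
A minimal sketch of the split, strip and join idea, followed by the csv alternative. It is written for Python 3, and it is only an illustration: the helper name clean_row and the sample row are made up, and only the header string is taken from the quoted script.

```python
import csv
import io

# Header string copied from the quoted script.
HEADER = "FarmID\tAddress\tStreetNum\tStreetName\tSufType\tDir\tCity\tProvince\tPostalCode"


def clean_row(line):
    """Split on tabs, strip whitespace from each part, rejoin with tabs.

    clean_row is a hypothetical helper name, not part of the original script.
    """
    parts = line.rstrip("\n").split("\t")
    return "\t".join(part.strip() for part in parts)


# Made-up row of the kind the replace() calls might produce, with an
# unwanted space before the city and before the postal code.
raw = "1\t1067 Niagara Stone Rd\t1067\tNiagara Stone\tRd\t\t Niagara-On-The-Lake\tON\t L0S 1J0\n"
cleaned = clean_row(raw)

# Writing by hand: the '\n' has to be added explicitly after each line.
manual = io.StringIO()
manual.write(HEADER + "\n")
manual.write(cleaned + "\n")

# csv alternative: a tab-delimited writer appends the line terminator itself.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow(HEADER.split("\t"))
writer.writerow(part.strip() for part in raw.rstrip("\n").split("\t"))
```

With a real file, the io.StringIO objects would be replaced by the opened output file; when using csv.writer in Python 3, the file is normally opened with newline="" so the module controls the line endings.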