From: MRAB
Date: Mon, 16 May 2011 17:39:11 +0100
To: python-list@python.org
Subject: Re: Convert AWK regex to Python
Newsgroups: comp.lang.python

On 16/05/2011 09:19, J wrote:
[snip]
> #!/usr/bin/python
>
> # Import RegEx module
> import re as regex
> # Log file to work on
> filetoread = open('/tmp/ pdu_log.log', "r")
> # File to write output to
> filetowrite = file('/tmp/ pdu_log_clean.log', "w")
> # Perform filtering in the log file
> linetoread = filetoread.readlines()
> for line in linetoread:
>     filter0 = regex.sub(r" filter1 = regex.sub(r"\."," ",filter0)
>     # Write new log file
>     filetowrite.write(filter1)
> filetowrite.close()
> # Read new log and get required fields from it
> filtered_log = open('/tmp/ pdu_log_clean.log', "r")
> filtered_line = filtered_log.readlines()
> for line in filtered_line:
>     token = line.split(" ")
>     print token[0], token[1], token[5], token[13], token[20]
> print "Done"
>
[snip]

If you don't need the power of regex, it's faster to use string methods:

filter0 = line.replace("
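
To illustrate the general point, here is a minimal sketch comparing the two approaches on the one substitution that is fully visible in the quoted script (replacing each "." with a space). The sample line is made up for the example; the real log format isn't shown in this thread.

```python
import re

# Hypothetical sample line; the actual pdu_log.log format is an assumption.
line = "10.0.0.1 host.example.com GET"

# Regex version: the dot has to be escaped because "." is a metacharacter.
with_regex = re.sub(r"\.", " ", line)

# String-method version: str.replace treats "." literally, so no escaping.
with_replace = line.replace(".", " ")

# Both approaches give the same result for a fixed, literal substring.
assert with_regex == with_replace
print(with_replace)
```

This only works as a drop-in when the thing being replaced is a fixed string; anything needing alternation, character classes or anchors still wants the re module. If you want to check the speed difference on your own data, the standard timeit module is probably the easiest way.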