From: Ethan Furman
Newsgroups: comp.lang.python
Subject: Re: sobering observation, python vs. perl
Date: Thu, 17 Mar 2016 10:26:12 -0700

On 03/17/2016 09:36 AM, Charles T. Smith wrote:
> Yes, your point was to forgo REs despite that they are useful.
> I could have thought the search would have been better as:
>
>     'release[-.:][Rr]eq'
>
> or something else ... you're in a "defend python at all costs!" mode.

No, I'm in the "don't try to write another language in Python" mode, and
"don't use 10lb sledge when 6oz hammer will do" mode:

--------------------------------------------------------
# using `in` and printing line as each is found
real    0m1.703s
user    0m0.184s
sys     0m0.260s

# using `in` and printing lines at the end
real    0m0.217s
user    0m0.112s
sys     0m0.068s

# using `re` and printing lines at the end
real    0m0.608s
user    0m0.516s
sys     0m0.060s
--------------------------------------------------------

As you can see, how you print has a huge impact. Hopefully you also
noticed that using `re` when `in` would do made the script nearly
3 times slower (0.608s vs. 0.217s real).

--------------------------------------------------------
# using `in` code
import sys

found = []
for fn in sys.argv[1:]:
    with open(fn) as fh:
        for line in fh:
            if 'timezone' in line:
                found.append(line)
print ''.join(found)
--------------------------------------------------------
# using `re` code
import sys
import re

found = []
for fn in sys.argv[1:]:
    with open(fn) as fh:
        for line in fh:
            if re.search('timezone', line):
                found.append(line)
print ''.join(found)
--------------------------------------------------------

--
~Ethan~
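
For anyone who wants to check the `in` vs. `re` comparison without the original input files, here is a minimal sketch. It is not the script that produced the timings above: the in-memory `LINES` data, the function names, and the precompiled-pattern variant are all assumptions added for illustration, and it measures only the matching step, not file reading or printing. It runs unchanged on Python 2.7 and Python 3.

```python
import re
import timeit

# Synthetic input (an assumption, not the original data):
# 20,000 lines, every 10th one containing the search string.
LINES = [
    ("timezone = UTC\n" if i % 10 == 0 else "nothing to see here\n")
    for i in range(20000)
]

# Hypothetical extra variant: the same pattern compiled once up front.
PATTERN = re.compile('timezone')


def find_with_in(lines):
    """Plain substring test, as in the `in` version of the script."""
    return [line for line in lines if 'timezone' in line]


def find_with_re(lines):
    """re.search with a string pattern, as in the `re` version."""
    return [line for line in lines if re.search('timezone', line)]


def find_with_compiled_re(lines):
    """Same search, but using the precompiled pattern object."""
    return [line for line in lines if PATTERN.search(line)]


if __name__ == '__main__':
    for func in (find_with_in, find_with_re, find_with_compiled_re):
        # Best of 3 runs, each run calling the function 5 times.
        seconds = min(timeit.repeat(lambda: func(LINES), number=5, repeat=3))
        print('%-24s %.3fs' % (func.__name__, seconds))
```

The absolute numbers will differ by machine and Python version, so treat the output as a relative comparison only. All three functions return the same list of lines; only the cost of finding them changes.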