From: Laura Creighton
To: Cecil Westerhof
Cc: python-list@python.org, lac@openend.se
Subject: Re: Find in ipython3
Date: Sat, 06 Jun 2015 13:07:55 +0200
Newsgroups: comp.lang.python
In-Reply-To: Message from Cecil Westerhof of "Sat, 06 Jun 2015 11:57:54 +0200." <874mmlqhul.fsf@Equus.decebal.nl>
References: <87y4k2hyvf.fsf@Equus.decebal.nl> <87bnguhbec.fsf@Equus.decebal.nl> <874mmlqhul.fsf@Equus.decebal.nl>

The !find version is C code optimised to do one thing: find files in
your directory structure, which happens to be what you want to do.
General regular expression matching is harder.

Carl Friedrich Bolz investigated regular expression algorithms and
their implementation to see if this is the sort of task that a JIT
can improve. He blogged about it in two posts (part 1 and part 2);
the benchmarks are in part 2. See:

http://morepypy.blogspot.se/2010/05/efficient-and-elegant-regular.html
http://morepypy.blogspot.se/2010/06/jit-for-regular-expression-matching.html

You may get faster results if you use Matthew Barnett's replacement
for re here:

https://pypi.python.org/pypi/regex

You will get faster results if you build your IPython shell to use
PyPy, but I would still be very surprised if it beat the C program
find.

Laura
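
For reference, the pure-Python side of this comparison might look roughly like the sketch below. It is not the code from the original question: the function name find_files and the example pattern are assumptions made for illustration. It walks a directory tree with os.walk and tests each file name against a regular expression compiled once with the standard re module.

```python
import os
import re


def find_files(root, pattern):
    """Yield paths under root whose file name matches the regex pattern.

    Hypothetical helper for illustration; not taken from the thread.
    """
    matcher = re.compile(pattern)  # compile once, reuse for every file name
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # search() matches anywhere in the name; anchor the pattern
            # (e.g. with $) when only a suffix should match.
            if matcher.search(name):
                yield os.path.join(dirpath, name)
```

Calling list(find_files(".", r"\.py$")) collects the Python source files below the current directory, which can then be timed against !find on the same tree. Every file name passes through the interpreter and the general-purpose regex engine here, whereas find does its matching in C code written for this one task.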