Date: Mon, 20 Jun 2011 14:39:06 -0800
From: Tim Johnson
To: python-list@python.org
Subject: Re: Parsing a dictionary from a format string
Newsgroups: comp.lang.python
Organization: AkWebsoft

* Tim Johnson [110620 13:00]:
>
> I think later today, I will run some time tests using the `re'
> module as well as your function and the one above.

OK: Functions follow:

import re

def grabBetween(src,begin,end):
    """Grabs sections of text between `begin' and `end' and returns a
    list of 0 or more sections of text."""
    parts = src.split(begin)
    res = []
    for part in parts:
        L = part.split(end)
        if len(L) > 1:
            res.append(L[0])
    return res

def splitExtractDict(src,default):
    """Extract dictionary keys for a format string using
    `grabBetween', which uses the `split' string method."""
    D = {}
    keys = grabBetween(src,'{','}')
    for k in keys :
        D[k] = default
    return D

def reExtractDict(src,default):
    """Extract dictionary keys for a format string using `re'"""
    D = {}
    keys = re.findall(r'\{([^}]*)\}', src)
    for k in keys :
        D[k] = default
    return D

## From Hans Mulder
def findExtractDict(src,default):
    start = -1
    keys,D = [],{}
    while True:
        start = src.find('{', start+1)
        if start == -1:
            break
        end = src.find('}', start)
        if end > start:
            keys.append(src[start+1:end])
    for k in keys :
        D[k] = default
    return D

###################################################
Now here are results using a small file and a lot of reps for each
function call, just to give some meaningful times.
###################################################
Using `split'      : 0.0309112071991
Using `re.findall' : 0.0205819606781
Using `find'       : 0.0296318531036

I will note that the last method did not produce correct results, but
I also note that Hans did not promise tested code :).

It seems reasonable to suppose that `re' provides the fastest method
of the three.

cheers
-- 
Tim
tim at johnsons-web dot com or akwebsoft dot com
http://www.akwebsoft.com
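
A minimal sketch of how timings like the ones above could be reproduced
with the standard `timeit' module. The sample format string, the
repetition count, and the names `SAMPLE' and `compare' are assumptions
made for illustration; they are not the small file or the harness that
produced the numbers in this post. The three extractors are compact
restatements of the functions shown above. On this simple sample all
three return the same keys, which says nothing about the input on which
the `find' version went wrong.

```python
import re
import timeit


def split_extract(src, default):
    """Keys via str.split, same approach as splitExtractDict."""
    keys = [part.split('}')[0] for part in src.split('{') if '}' in part]
    return dict.fromkeys(keys, default)


def re_extract(src, default):
    """Keys via re.findall, same pattern as reExtractDict."""
    return dict.fromkeys(re.findall(r'\{([^}]*)\}', src), default)


def find_extract(src, default):
    """Keys via str.find, same loop as findExtractDict."""
    keys = []
    start = -1
    while True:
        start = src.find('{', start + 1)
        if start == -1:
            break
        end = src.find('}', start)
        if end > start:
            keys.append(src[start + 1:end])
    return dict.fromkeys(keys, default)


# Hypothetical input: a short template repeated to give measurable times.
SAMPLE = "Dear {name}, your order {order_id} ships on {date}.\n" * 50


def compare(src=SAMPLE, reps=1000):
    """Return {function name: total seconds for `reps' calls}."""
    results = {}
    for func in (split_extract, re_extract, find_extract):
        # Bind func as a default argument so each timer calls its own function.
        timer = lambda f=func: f(src, '')
        results[func.__name__] = timeit.timeit(timer, number=reps)
    return results


if __name__ == '__main__':
    for name, seconds in sorted(compare().items(), key=lambda kv: kv[1]):
        print('%-15s %.6f' % (name, seconds))
```

Absolute times depend on the machine, the Python version and the input,
so only the relative ordering is worth comparing, and even that may
change with differently shaped format strings.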