| Path | csiph.com!usenet.pasdenom.info!gegeweb.org!de-l.enfer-du-nord.net!feeder2.enfer-du-nord.net!newsfeed.eweka.nl!eweka.nl!feeder3.eweka.nl!newsfeed.xs4all.nl!newsfeed3.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <d@davea.name> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| Date | Thu, 20 Dec 2012 22:31:18 -0500 |
| From | Dave Angel <d@davea.name> |
| User-Agent | Mozilla/5.0 (X11; Linux x86_64; rv:16.0) Gecko/20121011 Thunderbird/16.0.1 |
| MIME-Version | 1.0 |
| To | python-list@python.org |
| Subject | Re: help with making my code more efficient |
| References | <bd5ae6b7-2440-42e4-a93c-eb877feebcfe@googlegroups.com> <mailman.1126.1356052648.29569.python-list@python.org> <d6aaa5b5-7d21-4018-ba9a-ea354b15b6c5@googlegroups.com> |
| In-Reply-To | <d6aaa5b5-7d21-4018-ba9a-ea354b15b6c5@googlegroups.com> |
| Content-Type | text/plain; charset=ISO-8859-1 |
| Content-Transfer-Encoding | 7bit |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.15 |
| Precedence | list |
| Reply-To | d@davea.name |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| List-Unsubscribe | <http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe> |
| List-Archive | <http://mail.python.org/pipermail/python-list/> |
| List-Post | <mailto:python-list@python.org> |
| List-Help | <mailto:python-list-request@python.org?subject=help> |
| List-Subscribe | <http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.1134.1356060700.29569.python-list@python.org> |
| Lines | 80 |
| NNTP-Posting-Host | 2001:888:2000:d::a6 |
| X-Trace | 1356060700 news.xs4all.nl 6896 [2001:888:2000:d::a6]:54983 |
| X-Complaints-To | abuse@xs4all.nl |
| Xref | csiph.com comp.lang.python:35266 |
On 12/20/2012 08:46 PM, Larry.Martell@gmail.com wrote:
> On Thursday, December 20, 2012 6:17:04 PM UTC-7, Dave Angel wrote:
>> <snip>
> Of course it's a fragment - it's part of a large program and I was just
> showing the relevant parts.

But it seems these are methods in a class, or something, so we're missing
context. And you use self without it being an argument to the function.
Like it's a global.

> <snip>
> Yes, the code works. I end up with just the rows I want.
>> Are you only concerned about speed, not fixing features?
> Don't know what you mean by 'fixing features'. The code does what I want,
> it just takes too long.
>
>> As far as I can tell, the logic that includes the time comparison is bogus.
> Not at all.
>
>> You don't do anything there to worry about the value of tup[2], just whether some
>> item has a nearby time. Of course, I could misunderstand the spec.
> The data comes from a database. tup[2] is a datetime column. tdiff comes
> from a datetime.timedelta()

I thought that tup[1] was the datetime. In any case, the loop makes no
sense to me, so I can't really optimize it, just make suggestions.

>
>> Are you making a global called 'self' ? That name is by convention only
>> used in methods to designate the instance object. What's the attribute
>> self?
> Yes, self is my instance object. self.message contains the string of
> interest that I need to look for.
>
>> Can cdata have duplicates, and are they significant?
> No, it will not have duplicates.
>
>> Is the list sorted in any way?
> Yes, the list is sorted by tool and datetime.
>
>> Chances are your performance bottleneck is the doubly-nested loop. You
>> have a list comprehension at top-level code, and inside it calls a
>> function that also loops over the 600,000 items. So the inner loop gets
>> executed 360 billion times. You can cut this down drastically by some
>> judicious sorting, as well as by having a map of lists, where the map is
>> keyed by the tool.
> Thanks. I will try that.


So in your first loop, you could simply split the list into separate
lists, one per tup[0] value, and store the lists as dictionary items,
keyed by that tool string.

Then inside the determine() function, make a local ref to the particular
list for the tool.

    recs = messageTimes[tup[0]]

Instead of a for loop over recs, use a binary search to identify the
first item that's >= date_time-tdiff. Then if it's less than
date_time+tdiff, return True, otherwise False. Check out the bisect
module. Function bisect_left() should do what you want in a sorted list.

>>> cdata[:] = [tup for tup in cdata if determine(tup)]
>>
>> As the code exists, there's no need to copy the list. Just do a simple
>> bind.
> This statement is to remove the items from cdata that I don't want. I
> don't know what you mean by bind. I'm not familiar with that python function.

Every "assignment" to a simple name is really a rebinding of that name.

    cdata = [tup for tup in cdata if determine(tup)]

will rebind the name to the new object, much quicker than copying. If
this is indeed a top-level line, it should be equivalent. But if in
fact this is inside some other function, it may violate some other
assumptions. In particular, if there are other names for the same
object, then you're probably stuck with modifying it in place, using
slice notation.

BTW, a set is generally much more memory efficient than a dict, when you
don't use the "value". But since I think you'll be better off with a
dict of lists, it's a moot point.

--
DaveA
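
The dict-of-lists plus bisect_left() suggestion might look roughly like the sketch below. The original program was never posted in full, so much of this is guesswork: the record layout (tool, text, date_time), the sample rows, and the "message in text" test are all assumptions made for illustration. Only tup[0] as the tool, tup[2] as the datetime, tdiff, messageTimes and determine() come from the thread.

```python
from bisect import bisect_left
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical record layout: (tool, text, date_time). Only tup[0] (tool)
# and tup[2] (datetime) are taken from the thread; the rest is made up.
cdata = [
    ("toolA", "ok",    datetime(2012, 12, 20, 10, 0, 0)),
    ("toolA", "ERROR", datetime(2012, 12, 20, 10, 0, 4)),
    ("toolA", "ok",    datetime(2012, 12, 20, 11, 0, 0)),
    ("toolB", "ok",    datetime(2012, 12, 20, 10, 0, 3)),
]
message = "ERROR"                 # stands in for self.message (assumption)
tdiff = timedelta(seconds=5)

# First pass: one list of matching timestamps per tool, keyed by tool string.
messageTimes = defaultdict(list)
for tool, text, when in cdata:
    if message in text:           # assumed matching rule
        messageTimes[tool].append(when)
for times in messageTimes.values():
    times.sort()                  # bisect needs each list sorted


def determine(tup):
    """Return True if a matching record for the same tool lies within tdiff."""
    recs = messageTimes.get(tup[0], [])
    # Index of the first timestamp >= date_time - tdiff.
    i = bisect_left(recs, tup[2] - tdiff)
    # Keep the row only if that timestamp exists and is < date_time + tdiff.
    return i < len(recs) and recs[i] < tup[2] + tdiff


kept = [tup for tup in cdata if determine(tup)]
```

Each call to determine() then costs one binary search in a single tool's list instead of a scan over every record, which is where the saving over the doubly-nested loop would come from.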
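
To make the difference between rebinding a name and assigning to a slice concrete, here is a small standalone sketch. The lists and the names alias and other are invented for the illustration and have nothing to do with the original program.

```python
# Rebinding: the name `cdata` is pointed at a brand-new list object.
cdata = [1, 2, 3, 4]
alias = cdata                      # a second name for the same list object
cdata = [x for x in cdata if x % 2 == 0]
# `alias` still refers to the old, unfiltered list.
rebound_shares_object = alias is cdata

# Slice assignment: the existing list object is modified in place,
# so every name bound to that object sees the filtered contents.
data = [1, 2, 3, 4]
other = data
data[:] = [x for x in data if x % 2 == 0]
sliced_shares_object = other is data
```

After this runs, alias still holds all four items while other holds only the even ones, which is why in-place modification is needed when some other name must observe the change.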