Path: csiph.com!usenet.pasdenom.info!weretis.net!feeder4.news.weretis.net!ecngs!feeder2.ecngs.de!newsfeed.freenet.ag!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
Newsgroups: comp.lang.python
Date: Thu, 20 Dec 2012 16:43:34 -0800 (PST)
In-Reply-To: <mailman.1123.1356050292.29569.python-list@python.org>
Complaints-To: groups-abuse@google.com
Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=68.84.146.219; posting-account=aFD2wgkAAACT3OnBYoNKQGBzyOZ_PB2h
References: <bd5ae6b7-2440-42e4-a93c-eb877feebcfe@googlegroups.com> <mailman.1123.1356050292.29569.python-list@python.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Subject: Re: help with making my code more efficient
From: "Larry.Martell@gmail.com" <Larry.Martell@gmail.com>
To: comp.lang.python@googlegroups.com
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: python-list@python.org
Precedence: list
Message-ID: <mailman.1127.1356053559.29569.python-list@python.org>
Lines: 30
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:35254

On Thursday, December 20, 2012 5:38:03 PM UTC-7, Chris Angelico wrote:
> On Fri, Dec 21, 2012 at 11:19 AM, Larry.Martell@gmail.com
> <Larry.Martell@gmail.com> wrote:
> > This code works, but it takes way too long to run - e.g. when cdata has
> > 600,000 elements (which is typical for my app) it takes 2 hours for this
> > to run.
> >
> > Can anyone give me some suggestions on speeding this up?
>
> It sounds like you may have enough data to want to not keep it all in
> memory. Have you considered switching to a database? You could then
> execute SQL queries against it.

It came from a database. Originally I was getting just the data I wanted
using SQL, but that was taking too long also. I was selecting just the
messages I wanted, then for each one of those doing another query to get
the data within the time diff of each. That was resulting in tens of
thousands of queries. So I changed it to pull all the potential matches at
once and then process it in Python.
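
For illustration only, here is a minimal sketch of the "pull everything once, then match in Python" idea described above. It is not the code from this thread: the row layout (timestamp, payload), the function name rows_within_window, and the max_diff parameter are all assumptions made for the example. It sorts the rows by timestamp once and uses binary search to find each time window, which may avoid both the per-message queries and a full scan per message.

```python
from bisect import bisect_left, bisect_right


def rows_within_window(rows, targets, max_diff):
    """Map each target timestamp to the rows within max_diff of it.

    rows     -- iterable of (timestamp, payload) tuples (assumed layout)
    targets  -- timestamps of the messages of interest (assumed input)
    max_diff -- allowed time difference, in the same units as the timestamps
    """
    # Sort once so every window lookup can use binary search.
    ordered = sorted(rows, key=lambda r: r[0])
    times = [r[0] for r in ordered]

    matches = {}
    for t in targets:
        lo = bisect_left(times, t - max_diff)   # first row at or after t - max_diff
        hi = bisect_right(times, t + max_diff)  # one past the last row at or before t + max_diff
        matches[t] = ordered[lo:hi]
    return matches


# Hypothetical sample data: timestamps in seconds with a one-letter payload.
rows = [(100, "a"), (105, "b"), (130, "c"), (200, "d")]
result = rows_within_window(rows, targets=[102, 198], max_diff=5)
```

Whether this helps depends on what the real matching rule is; if the window test involves more than the timestamp, the slice found here would only be a candidate set that still needs filtering.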