
Re: best way to handle this in Python

Started by	Ian Kelly <ian.g.kelly@gmail.com>
First post	2012-07-20 12:14 -0600
Last post	2012-07-21 00:43 +0000
Articles	2 — 2 participants


This discussion began before the indexed window, so earlier articles aren't shown. The article labeled "Started by" is the oldest one visible, not the original post.

  Re: best way to handle this in Python Ian Kelly <ian.g.kelly@gmail.com> - 2012-07-20 12:14 -0600
    Re: best way to handle this in Python Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-21 00:43 +0000

#25700 — Re: best way to handle this in Python

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2012-07-20 12:14 -0600
Subject	Re: best way to handle this in Python
Message-ID	<mailman.2351.1342808103.4697.python-list@python.org>

On Fri, Jul 20, 2012 at 4:34 AM, Rita <rmorgan466@gmail.com> wrote:
> That's an interesting data structure, Dennis. I will actually be running this
> type of query many times, preferably in an ad-hoc environment. That makes it
> tough for sqlite3 since there will be several hundred thousand tuples.

Several hundred thousand is not an enormous number.  I think you're
underestimating sqlite3.  I just tried a test with one million tuples,
six colors per tuple (six million rows altogether).  Each row contains
a primary key, a timestamp, a color, and a count, with an index on the
timestamp column.  Building the database from scratch took about a
minute; adding the index took about another minute.  Incremental
updates would of course be much faster.  Queries like "select * from
data where timestamp between 500000 and 600000" return instantly (from
a user perspective).
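
The test script itself was not posted, so the following is only a minimal
sketch of a test along those lines. The table name "data" and the
timestamp column come from the query quoted above; the other column
names, the color list, the integer timestamps, and the helper names
build_db and query_range are assumptions made for illustration. It
defaults to a small in-memory database; raising num_tuples to 1000000
would approximate the six million rows described.

```python
import random
import sqlite3

# Hypothetical color set: the post says six colors per tuple but does not name them.
COLORS = ("red", "orange", "yellow", "green", "blue", "violet")


def build_db(num_tuples=1000, path=":memory:"):
    """Create the table, load num_tuples * len(COLORS) rows, then add the index."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE data ("
        "id INTEGER PRIMARY KEY, "   # primary key
        "timestamp INTEGER, "        # assumed to be a plain integer
        "color TEXT, "
        "count INTEGER)"
    )
    # One row per (timestamp, color) pair; the counts are random filler.
    rows = (
        (ts, color, random.randint(0, 100))
        for ts in range(num_tuples)
        for color in COLORS
    )
    with conn:  # commit the bulk load and the index as one transaction
        conn.executemany(
            "INSERT INTO data (timestamp, color, count) VALUES (?, ?, ?)",
            rows,
        )
        # Index created after the bulk load, as in the description above.
        conn.execute("CREATE INDEX idx_data_timestamp ON data (timestamp)")
    return conn


def query_range(conn, start, end):
    """Return every row whose timestamp lies in [start, end], inclusive."""
    cur = conn.execute(
        "SELECT * FROM data WHERE timestamp BETWEEN ? AND ?",
        (start, end),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_db(num_tuples=1000)
    rows = query_range(conn, 500, 600)
    print(len(rows), "rows returned")
```

Loading everything through a single executemany call inside one
transaction, and creating the index only afterwards, is a common way to
keep bulk loads fast; whether the original test did this is not stated.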

Cheers,
Ian


#25708

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2012-07-21 00:43 +0000
Message-ID	<5009fb4e$0$29978$c3e8da3$5496439d@news.astraweb.com>
In reply to	#25700

On Fri, 20 Jul 2012 12:14:30 -0600, Ian Kelly wrote:

> On Fri, Jul 20, 2012 at 4:34 AM, Rita <rmorgan466@gmail.com> wrote:
>> That's an interesting data structure, Dennis. I will actually be running
>> this type of query many times, preferably in an ad-hoc environment. That
>> makes it tough for sqlite3 since there will be several hundred thousand
>> tuples.
> 
> Several hundred thousand is not an enormous number.  I think you're
> underestimating sqlite3.


A common trap, and not just for sqlite. I frequently have to remind 
people -- including myself -- that what is a lot of data for you may not 
be a lot of data for your computer.




-- 
Steven

