From: Ian Kelly
Date: Fri, 20 Jul 2012 12:14:30 -0600
Subject: Re: best way to handle this in Python
Newsgroups: comp.lang.python

On Fri, Jul 20, 2012 at 4:34 AM, Rita wrote:
> Thats an interesting data structure Dennis. I will actually be running this
> type of query many times preferable in an ad-hoc environment. That makes it
> tough for sqlite3 since there will be several hundred thousand tuples.

Several hundred thousand is not an enormous number. I think you're
underestimating sqlite3. I just tried a test with one million tuples,
six colors per tuple (six million rows altogether). Each row contains
a primary key, a timestamp, a color, and a count, with an index on the
timestamp column. Building the database from scratch took about a
minute; adding the index took about another minute. Incremental
updates would of course be much faster. Queries like "select * from
data where timestamp between 500000 and 600000" return instantly (from
a user perspective).

Cheers,
Ian
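
For anyone who wants to repeat a test like the one described above, here is a minimal sketch. It is not the original test script: the table name "data" and the timestamp column come from the query quoted in the post, but the other column names, the color values, the index name, the helper names build_db and query_range, and the use of the tuple number as the timestamp are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical color names; the post only says "six colors per tuple".
COLORS = ("red", "orange", "yellow", "green", "blue", "violet")


def build_db(num_tuples, path=":memory:"):
    """Create the table, load num_tuples * 6 rows, then add the index."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE data ("
        "id INTEGER PRIMARY KEY, "
        "timestamp INTEGER, "
        "color TEXT, "
        "count INTEGER)"
    )
    # Assumption: the tuple number doubles as the timestamp, and the
    # count is an arbitrary small integer.
    rows = (
        (ts, color, (ts + i) % 10)
        for ts in range(num_tuples)
        for i, color in enumerate(COLORS)
    )
    conn.executemany(
        "INSERT INTO data (timestamp, color, count) VALUES (?, ?, ?)", rows
    )
    # Index added after the bulk load, in the same order the post describes.
    conn.execute("CREATE INDEX data_timestamp ON data (timestamp)")
    conn.commit()
    return conn


def query_range(conn, start, end):
    """Return all rows whose timestamp lies in [start, end] inclusive."""
    cur = conn.execute(
        "SELECT * FROM data WHERE timestamp BETWEEN ? AND ?", (start, end)
    )
    return cur.fetchall()


if __name__ == "__main__":
    # Use 1_000_000 to approximate the size described in the post.
    conn = build_db(10_000)
    print(len(query_range(conn, 5000, 6000)))
```

Timings will vary with hardware and with whether the database lives in memory or on disk, so the figures in the post should not be expected to reproduce exactly.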