From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: Searching for lots of similar strings (filenames) in sqlite3 database
Date: Wed, 2 Jul 2014 00:02:51 +1000
References: <53B2A604.3040208@mrabarnett.plus.com>

On Tue, Jul 1, 2014 at 10:13 PM, MRAB wrote:
> Anyway, I'm sure there's something in SQL for "insert or update" or "on
> duplicate", but that's an SQL question, not a Python question.

Not in standard SQL, no; there might be in SQLite, as a non-standard
extension, but it's a fundamentally hard problem and it has issues.
Frankly, though, I doubt the time cost of set operations is anything
significant compared to the various queries against the database.

ChrisA
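
For what it's worth, SQLite does accept INSERT OR REPLACE as one such non-standard extension. Here is a minimal sketch of how that looks from Python's sqlite3 module. The `files` table, its columns, and the `upsert_file` helper are made up for illustration and are not from the thread.

```python
import sqlite3


def upsert_file(conn, name, size):
    """Insert a row for `name`, or replace the existing row with that key.

    Hypothetical helper: assumes a `files` table with `name` as PRIMARY KEY.
    """
    conn.execute(
        "INSERT OR REPLACE INTO files (name, size) VALUES (?, ?)",
        (name, size),
    )


# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT PRIMARY KEY, size INTEGER)")

upsert_file(conn, "a.txt", 100)  # new row
upsert_file(conn, "a.txt", 250)  # same key: existing row is replaced
upsert_file(conn, "b.txt", 7)    # new row
conn.commit()

rows = conn.execute("SELECT name, size FROM files ORDER BY name").fetchall()
print(rows)
```

Note that REPLACE resolves the conflict by deleting the old row and inserting a new one, so any column not listed in the statement falls back to its default rather than keeping its old value. That may be one of the "issues" to watch for.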