


Re: Searching for lots of similar strings (filenames) in sqlite3 database

Date Wed, 2 Jul 2014 20:03:12 +1000
Subject Re: Searching for lots of similar strings (filenames) in sqlite3 database
From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.11407.1404295399.18130.python-list@python.org>



On Wed, Jul 2, 2014 at 7:32 PM, Adam Funk <a24061@ducksburg.com> wrote:
> Well, I've changed it to the following anyway.
>
>         subdir_glob = subdir + '/*'
>         cursor.execute('SELECT filename FROM files WHERE filename GLOB ?',
>                        (subdir_glob,))
>         rows = cursor.fetchall()
>         known_files = {row[0] for row in rows}
>
> I see what you mean about paths containing '%', but I don't see why
> you were concerned about underscores, though.

With GLOB, presumably ? matches a single character and * matches any
number of characters. With LIKE, _ matches a single character and %
matches any number. So, for instance, WHERE filename LIKE
'/foo/bar/spam_spam/%' will match '/foo/bar/spam2spam/1234', which may
be a little surprising. It's not going to be a serious problem in most
cases, as it'll also match '/foo/bar/spam_spam/1234', but the false
positives will cause one of those "Huhhhhh????" moments if you don't
keep an eye on your magic characters.
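To make the LIKE surprise concrete, here is a minimal sketch using an in-memory sqlite3 database. The table and column names follow the quoted code; the two sample paths, the `matches` helper, and the `escape_like` helper are assumptions made up for illustration, not anything from the original program. It also shows one way to keep LIKE but neutralise its wildcards with an ESCAPE clause.

```python
import sqlite3

# In-memory database with the same shape as the quoted code assumes:
# a table "files" with a "filename" column. The rows are made-up samples.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE files (filename TEXT)')
conn.executemany('INSERT INTO files VALUES (?)',
                 [('/foo/bar/spam_spam/1234',),
                  ('/foo/bar/spam2spam/1234',)])


def matches(sql, pattern):
    """Run a one-parameter query and return the matching filenames, sorted."""
    return sorted(row[0] for row in conn.execute(sql, (pattern,)))


# LIKE treats "_" as "any single character", so both rows come back.
like_hits = matches('SELECT filename FROM files WHERE filename LIKE ?',
                    '/foo/bar/spam_spam/%')

# GLOB treats "_" as an ordinary character, so only the exact directory matches.
glob_hits = matches('SELECT filename FROM files WHERE filename GLOB ?',
                    '/foo/bar/spam_spam/*')


def escape_like(text, esc='\\'):
    """Hypothetical helper: escape LIKE's magic characters in a literal prefix."""
    return (text.replace(esc, esc + esc)
                .replace('%', esc + '%')
                .replace('_', esc + '_'))


# LIKE with an explicit ESCAPE character behaves like the GLOB query here.
escaped_hits = matches(
    "SELECT filename FROM files WHERE filename LIKE ? ESCAPE '\\'",
    escape_like('/foo/bar/spam_spam/') + '%')

print(like_hits)
print(glob_hits)
print(escaped_hits)
```

The unescaped LIKE query returns both paths, while the GLOB query and the escaped LIKE query return only the one under spam_spam.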

In your specific case, you happen to be safe, but as I look over the
code, my paranoia kicks in and tells me to check :) It's just one of
those things that flags itself to the mind - anything that might help
catch bugs early is a good feature of the mind, in my opinion!
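The kind of check described above could be sketched roughly as follows. This is an assumption about what such a check might look like, not code from the thread: `has_glob_magic`, `escape_glob`, and `glob_match` are hypothetical helper names, and the sample paths are made up. SQLite's GLOB treats `*`, `?`, and `[` as special, so the sketch flags those and, if needed, wraps each one in a bracket class so it matches literally.

```python
import sqlite3

# Characters that SQLite's GLOB treats as special.
GLOB_MAGIC = '*?['


def has_glob_magic(path):
    """Return True if the path contains any GLOB wildcard character."""
    return any(ch in GLOB_MAGIC for ch in path)


def escape_glob(path):
    """Wrap each wildcard in a one-character bracket class, e.g. '?' -> '[?]'."""
    return ''.join('[' + ch + ']' if ch in GLOB_MAGIC else ch for ch in path)


conn = sqlite3.connect(':memory:')


def glob_match(value, pattern):
    """Ask SQLite itself whether value matches the GLOB pattern."""
    return bool(conn.execute('SELECT ? GLOB ?', (value, pattern)).fetchone()[0])


subdir = '/foo/what?'          # made-up directory name containing a wildcard
print(has_glob_magic(subdir))  # the "paranoia" check

# Unescaped, the "?" matches any character, giving a false positive.
print(glob_match('/foo/whatX/1', subdir + '/*'))

# Escaped, only the literal directory matches.
print(glob_match('/foo/whatX/1', escape_glob(subdir) + '/*'))
print(glob_match('/foo/what?/1', escape_glob(subdir) + '/*'))
```

For directory names without any of those three characters, `escape_glob` returns the string unchanged, so it is harmless to apply unconditionally.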

ChrisA
