From: Larry Martell
To: Roy Smith
Cc: "python-list@python.org"
Date: Wed, 22 Jan 2014 18:18:32 -0700
Subject: Re: Case insensitive exists()?
Newsgroups: comp.lang.python

On Wed, Jan 22, 2014 at 6:08 PM, Roy Smith wrote:
> In article ,
> Larry Martell wrote:
>
>> I have the need to check for a file's existence against a string, but I
>> need to do it case-insensitively. I cannot efficiently get the name of
>> every file in the dir and compare each with my string using lower(),
>> as I have 100's of strings to check for, each in a different dir, and
>> each dir can have 100's of files in it.
>
> I'm not quite sure what you're asking. Do you need to match the
> filename, or find the string in the contents of the file? I'm going to
> assume you're asking the former.

Yes, match the file names. E.g., if my match string is "ABC" and
there's a file named "Abc", then it would be a match.

> One way or another, you need to iterate over all the directories and get
> all the filenames in each. The time to do that is going to totally
> swamp any processing you do in terms of converting to lower case and
> comparing to some set of strings.
>
> I would put all my strings into a set, then use os.walk() to traverse the
> directories and for each path os.walk() returns, do "path.lower() in
> strings".
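
For reference, the set-plus-os.walk() idea quoted above might look roughly like the sketch below. The function name `find_matches` and the parameters `root` and `wanted_paths` are made up for illustration, and it assumes all the wanted paths live under one common root directory, which may not hold in every setup.

```python
import os

def find_matches(root, wanted_paths):
    """Walk `root` once and report which wanted paths exist, ignoring case.

    Returns a dict mapping each lower-cased wanted path that was found
    to the real path as it appears on disk. Hypothetical helper.
    """
    # Lower-case the wanted paths once so each lookup is a set membership test.
    wanted = {os.path.normpath(p).lower() for p in wanted_paths}
    found = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            key = os.path.normpath(full).lower()
            if key in wanted:
                found[key] = full
    return found
```

This compares the whole lower-cased path, so directory names are matched case-insensitively too. The cost is one full traversal of everything under `root`, regardless of how many strings are being checked.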

The issue is that I run a database query and get back rows, each with
a file path (each in a different dir), and I have to check whether
that file exists. Each is a separate search with no correlation to
the others. I have the full path, so I guess I'll have to call dirname
on it, then do a listdir, then compare each item's .lower() with my
string's .lower(). It's just that the dirs have 100's and 100's of
files, so I'm really worried about efficiency.
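
A minimal sketch of that dirname / listdir / lower() approach, for illustration only. The names `exists_nocase` and `_dir_cache` are hypothetical, and the sketch assumes the directory part of each path already has the correct case, so only the final file name is matched case-insensitively.

```python
import os

# Hypothetical per-directory cache: directory -> {lower-cased name: real name}.
# It only helps when several paths share a directory.
_dir_cache = {}

def exists_nocase(path):
    """Return the on-disk path whose file name matches `path` ignoring case,
    or None if the directory has no such entry."""
    dirname, basename = os.path.split(path)
    dirname = dirname or "."
    names = _dir_cache.get(dirname)
    if names is None:
        try:
            entries = os.listdir(dirname)
        except OSError:
            # Missing or unreadable directory: treat it as empty.
            entries = []
        # If two entries differ only by case, the last one listed wins.
        names = {name.lower(): name for name in entries}
        _dir_cache[dirname] = names
    actual = names.get(basename.lower())
    if actual is None:
        return None
    return os.path.join(dirname, actual)
```

Each call costs one os.listdir() of the target directory plus a dictionary lookup. Whether that is fast enough with hundreds of entries per directory is best settled by timing it on the real data.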