Re: Checking Common File Types

From Dennis Lee Bieber <wlfraed@ix.netcom.com>
Subject Re: Checking Common File Types
Date Sun, 01 Dec 2013 18:23:22 -0500
Newsgroups comp.lang.python

On Sun, 1 Dec 2013 18:27:16 +0000, jade <jadec375@msn.com> declaimed the
following:

>Hello, 
>I'm trying to create a script that checks all the files in my 'downloaded' directory against common file types and then tells me how many of the files in that directory aren't either a GIF or a JPG file. I'm familiar with basic Python but this is the first time I've attempted anything like this and I'm looking for a little help or a point in the right direction? 
>
>file_sigs = {'\xFF\xD8\xFF':('JPEG','jpg'),  '\x47\x49\x46':('GIF','gif')}

	Apparently you presume the file extensions are inaccurate, as you are
digging into the files for signatures.

>def readFile():
>    filename = r'c:/temp/downloads'
>    fh = open(filename, 'r')
>    file_sig = fh.read(4)
>    print '[*] check_sig() File:',filename #, 'Hash Sig:', binascii.hexlify(file_sig)

	Note: if you are hardcoding forward slashes, you don't need the raw
indicator...
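
	A quick way to see this, as a minimal sketch in Python 3 syntax: the raw
prefix only changes how backslashes are read, so with forward slashes both
spellings give the same string. The backslash path is only an illustration
and is not taken from the original post.

```python
# With forward slashes there are no escapes to suppress,
# so the raw and the plain literal are the same string.
plain = 'c:/temp/downloads'
raw = r'c:/temp/downloads'
same_with_slashes = (plain == raw)

# With backslashes the prefix matters: '\t' in a plain literal is a TAB.
plain_bs = 'c:\temp'     # 'c', ':', TAB, 'e', 'm', 'p'
raw_bs = r'c:\temp'      # 'c', ':', backslash, 't', 'e', 'm', 'p'
same_with_backslashes = (plain_bs == raw_bs)

print(same_with_slashes, same_with_backslashes)
```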

	That said, what is "c:/temp/downloads"? You apparently are opening IT
as the file to be examined. Is it supposed to be a directory containing
many files, a file containing a list of files, or something else?
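
	One way to answer that question from inside the script, sketched in
Python 3 syntax: ask os.path what the name refers to before calling open()
on it. The helper name describe_path is hypothetical, and the demo runs
against a throwaway temporary directory rather than c:/temp/downloads.

```python
import os
import tempfile

def describe_path(path):
    """Hypothetical helper: report what kind of thing a path names."""
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        return "file"
    return "missing"

# Demonstrate on a temporary directory holding one small file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")
    with open(sample, "wb") as fh:
        fh.write(b"GIF89a")
    results = (
        describe_path(tmp),
        describe_path(sample),
        describe_path(os.path.join(tmp, "nope")),
    )

print(results)
```

	If the name turns out to be a directory, open() on it will fail, and the
script needs to list its contents and open each entry instead.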

	What is "check_sig" -- it looks like a function you haven't defined --
but it's inside the quotes making a string literal that will never be
called anyway.

	If you are just concerned with one directory of files, you might want
to read the help file on the glob module, along with os.path
(join/splitext/etc). Or just string methods...

>>> import glob
>>> import os.path
>>> TARGET = os.path.join(os.environ["USERPROFILE"],
... 	"documents/BW-conversion/*")
>>> files = glob.glob(TARGET)
>>> for fn in files:
... 	fp, fx = os.path.splitext(fn)
... 	print "File %s purports to be of type %s" % (fn, fx.upper())
... 
File C:\Users\Wulfraed\documents/BW-conversion\BW-1.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\BW-2.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\BW-3.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\BW-4.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\BWConv.html purports to be of type .HTML
File C:\Users\Wulfraed\documents/BW-conversion\roo_b1.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b2.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b3.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b4.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b5.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b6.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_col.jpg purports to be of type .JPG
>>> 
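
	The session above only looks at extensions. If the extensions can't be
trusted, the same glob loop can check the leading bytes of each file instead.
What follows is a rough sketch in Python 3 syntax, not a finished tool: the
signatures come from the quoted question (rewritten as bytes), the helper
names sniff and count_non_images are hypothetical, and the demo files are
fabricated in a temporary directory purely for illustration.

```python
import glob
import os
import tempfile

# Signatures from the quoted question, written as bytes for Python 3.
FILE_SIGS = {
    b'\xFF\xD8\xFF': ('JPEG', 'jpg'),
    b'\x47\x49\x46': ('GIF', 'gif'),
}

def sniff(path):
    """Hypothetical helper: return the matching (name, ext) pair, or None."""
    with open(path, 'rb') as fh:     # binary mode, so bytes are not decoded
        head = fh.read(4)
    for sig, kind in FILE_SIGS.items():
        if head.startswith(sig):
            return kind
    return None

def count_non_images(directory):
    """Hypothetical helper: count regular files that are neither GIF nor JPEG."""
    paths = glob.glob(os.path.join(directory, '*'))
    files = [p for p in paths if os.path.isfile(p)]   # skip subdirectories
    return sum(1 for p in files if sniff(p) is None)

# Demo on made-up files: one real-looking JPEG header, one GIF header,
# and one HTML page that merely carries a .jpg extension.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        'a.jpg': b'\xFF\xD8\xFF\xE0 fake jpeg',
        'b.gif': b'GIF89a fake gif',
        'c.jpg': b'<html>not really a jpeg</html>',
    }
    for name, data in samples.items():
        with open(os.path.join(tmp, name), 'wb') as fh:
            fh.write(data)
    not_images = count_non_images(tmp)

print(not_images)
```

	Note that a '*' pattern in glob does not match names starting with a dot,
so hidden files would be left out of the count.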
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
    wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/
