


Re: Checking Common File Types

Started by: Dennis Lee Bieber <wlfraed@ix.netcom.com>
First post: 2013-12-01 18:23 -0500
Last post: 2013-12-01 18:23 -0500
Articles: 1 (1 participant)


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.



#60848 — Re: Checking Common File Types

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2013-12-01 18:23 -0500
Subject: Re: Checking Common File Types
Message-ID: <mailman.3451.1385940214.18130.python-list@python.org>

On Sun, 1 Dec 2013 18:27:16 +0000, jade <jadec375@msn.com> declaimed the
following:

>Hello, 
>I'm trying to create a script that checks all the files in my 'downloaded' directory against common file types and then tells me how many of the files in that directory aren't either a GIF or a JPG file. I'm familiar with basic Python but this is the first time I've attempted anything like this and I'm looking for a little help or a point in the right direction? 
>
>file_sigs = {'\xFF\xD8\xFF':('JPEG','jpg'),  '\x47\x49\x46':('GIF','gif')}

	Apparently you presume the file extensions are inaccurate, as you are
digging into the files for signatures.
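
	A minimal sketch of such a signature check, assuming the two signatures from the quoted file_sigs dictionary are the only ones of interest. The names SIGNATURES and sniff_type are hypothetical, not from the original post. The file is opened in binary mode so the leading bytes arrive untranslated, and the keys are written as bytes literals so the comparison behaves the same under Python 2.6+ and Python 3.

```python
# Hypothetical helper, not from the original post.
# Signatures taken from the quoted file_sigs dictionary, as bytes.
SIGNATURES = {
    b'\xFF\xD8\xFF': 'JPEG',
    b'\x47\x49\x46': 'GIF',   # the ASCII letters "GIF"
}

def sniff_type(path):
    """Return 'JPEG', 'GIF', or None based on the file's leading bytes."""
    with open(path, 'rb') as fh:   # binary mode: no newline translation
        head = fh.read(4)
    for sig, name in SIGNATURES.items():
        if head.startswith(sig):
            return name
    return None
```

	Anything that returns None here is simply "not recognised"; the sketch does not validate the rest of the file.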

>def readFile():
>    filename = r'c:/temp/downloads'
>    fh = open(filename, 'r')
>    file_sig = fh.read(4)
>    print '[*] check_sig() File:',filename #, 'Hash Sig:', binascii.hexlify(file_sig)

	Note: if you are hardcoding forward slashes, you don't need the raw
indicator...
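
	A small illustration of that point, using made-up variable names: with forward slashes there are no backslash escapes for the raw prefix to protect, so both spellings give the same string. The prefix only matters once backslashes appear.

```python
# Forward slashes: raw and plain literals are identical strings.
same_with_slashes = (r'c:/temp/downloads' == 'c:/temp/downloads')

# Backslashes: in a plain literal '\t' becomes a single TAB character,
# while the raw literal keeps the backslash and the 't' separately.
plain = 'c:\temp'
raw = r'c:\temp'

print(same_with_slashes)      # True
print(len(plain), len(raw))   # 6 7
```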

	That said, what is "c:/temp/downloads"? You apparently are opening IT
as the file to be examined. Is it supposed to be a directory containing
many files, a file containing a list of files, ???

	What is "check_sig" -- it looks like a function you haven't defined --
but it's inside the quotes making a string literal that will never be
called anyway.

	If you are just concerned with one directory of files, you might want
to read the help file on the glob module, along with os.path
(join/splitext/etc). Or just string methods...

>>> import glob
>>> import os.path
>>> TARGET = os.path.join(os.environ["USERPROFILE"],
... 	"documents/BW-conversion/*")
>>> files = glob.glob(TARGET)
>>> for fn in files:
... 	fp, fx = os.path.splitext(fn)
... 	print "File %s purports to be of type %s" % (fn, fx.upper())
... 
File C:\Users\Wulfraed\documents/BW-conversion\BW-1.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\BW-2.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\BW-3.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\BW-4.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\BWConv.html purports to be of type .HTML
File C:\Users\Wulfraed\documents/BW-conversion\roo_b1.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b2.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b3.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b4.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b5.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b6.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_col.jpg purports to be of type .JPG
>>> 
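
	The session above only reports what the extension claims. If the goal is the count asked for in the quoted question (files that are neither GIF nor JPG, judged by content), one possible sketch combines glob with a check of the leading bytes. The names IMAGE_SIGNATURES and count_non_images are hypothetical, and the sketch assumes the two three-byte signatures from the quoted file_sigs dictionary are sufficient.

```python
import glob
import os.path

# Same signatures as in the quoted file_sigs dictionary, written as bytes.
IMAGE_SIGNATURES = (b'\xFF\xD8\xFF', b'\x47\x49\x46')

def count_non_images(directory):
    """Count regular files in `directory` starting with neither signature."""
    count = 0
    # Note: the '*' pattern does not match names beginning with a dot.
    for fn in glob.glob(os.path.join(directory, '*')):
        if not os.path.isfile(fn):      # skip subdirectories
            continue
        with open(fn, 'rb') as fh:      # binary mode for raw bytes
            head = fh.read(3)
        if not head.startswith(IMAGE_SIGNATURES):
            count += 1
    return count

# Possible use, with the directory from the quoted post:
# print("%d files are neither GIF nor JPG" % count_non_images('c:/temp/downloads'))
```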
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
    wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/
