| Started by | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| First post | 2013-12-01 18:23 -0500 |
| Last post | 2013-12-01 18:23 -0500 |
| Articles | 1 — 1 participant |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: Checking Common File Types Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-12-01 18:23 -0500
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2013-12-01 18:23 -0500 |
| Subject | Re: Checking Common File Types |
| Message-ID | <mailman.3451.1385940214.18130.python-list@python.org> |
On Sun, 1 Dec 2013 18:27:16 +0000, jade <jadec375@msn.com> declaimed the
following:
>Hello,
>I'm trying to create a script that checks all the files in my 'downloaded' directory against common file types and then tells me how many of the files in that directory aren't either a GIF or a JPG file. I'm familiar with basic Python but this is the first time I've attempted anything like this and I'm looking for a little help or a point in the right direction?
>
>file_sigs = {'\xFF\xD8\xFF':('JPEG','jpg'), '\x47\x49\x46':('GIF','gif')}
Apparently you presume the file extensions are inaccurate, as you are
digging into the files for signatures.
>def readFile():
>    filename = r'c:/temp/downloads'
>    fh = open(filename, 'r')
>    file_sig = fh.read(4)
>    print '[*] check_sig() File:',filename #, 'Hash Sig:', binascii.hexlify(file_sig)
Note: if you are hardcoding forward slashes, you don't need the raw-string
prefix (r'...'); there are no backslashes for it to protect.
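A quick illustration of the point about the raw prefix, as a minimal sketch in Python 3 syntax. The variable names are made up for the example; only the path string comes from the quoted code.

```python
# With forward slashes there are no escape sequences, so r'...' changes nothing.
plain = 'c:/temp/downloads'
raw = r'c:/temp/downloads'
same_with_forward_slashes = (plain == raw)

# With backslashes the prefix matters: '\t' in a normal literal is a TAB.
escaped = 'c:\temp'          # c, :, TAB, e, m, p
raw_backslash = r'c:\temp'   # c, :, \, t, e, m, p

print(same_with_forward_slashes)
print(len(escaped), len(raw_backslash))
```

So the prefix is harmless here, just unnecessary; it only earns its keep once backslashes appear in the literal.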
That said, what is "c:/temp/downloads"? You apparently are opening IT
as the file to be examined. Is it supposed to be a directory containing
many files, a file containing a list of files, or something else?
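If the path is meant to be a directory, one way to find out and to list its contents is sketched below (Python 3 syntax). The helper names describe_path and files_in are hypothetical, and the demonstration runs on a temporary directory rather than the poster's real path.

```python
import os
import tempfile

def describe_path(path):
    """Report whether path is a directory, a regular file, or neither."""
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        return "file"
    return "missing or other"

def files_in(directory):
    """Return full paths of the regular files directly inside directory."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )

# Throwaway directory with one file and one subdirectory
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "a.jpg"), "wb") as fh:
        fh.write(b"\xFF\xD8\xFF\xE0")
    os.mkdir(os.path.join(demo_dir, "subdir"))
    kind = describe_path(demo_dir)
    names = [os.path.basename(p) for p in files_in(demo_dir)]
    gone = describe_path(os.path.join(demo_dir, "nope"))

print(kind, names, gone)
```

Passing a directory straight to open() fails with an error, so a check like this (or simply iterating over the directory listing) has to come before any attempt to read signatures.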
What is "check_sig" -- it looks like a function you haven't defined --
but it's inside the quotes, so it is just text in a string literal and
will never be called anyway.
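For what it's worth, here is a guess at what a check_sig function might look like, as a sketch in Python 3 syntax. The function body, the FILE_SIGS name, and the demo file names are assumptions; only the two signatures come from the quoted post. Note that it opens the file in binary mode, since text mode may translate or decode the leading bytes.

```python
import os
import tempfile

# Signature table adapted from the quoted post, written as bytes for Python 3.
FILE_SIGS = {
    b"\xFF\xD8\xFF": ("JPEG", "jpg"),
    b"\x47\x49\x46": ("GIF", "gif"),  # ASCII "GIF"
}

def check_sig(path, table=FILE_SIGS):
    """Return (name, extension) for the first matching signature, else None."""
    longest = max(len(sig) for sig in table)
    # Read only as many leading bytes as the longest signature needs.
    with open(path, "rb") as fh:
        head = fh.read(longest)
    for sig, info in table.items():
        if head.startswith(sig):
            return info
    return None

# Demo files: a GIF with a misleading name, and plain text posing as a JPG
with tempfile.TemporaryDirectory() as demo_dir:
    gif_path = os.path.join(demo_dir, "picture.dat")
    txt_path = os.path.join(demo_dir, "notes.jpg")
    with open(gif_path, "wb") as fh:
        fh.write(b"GIF89a" + b"\x00" * 10)
    with open(txt_path, "wb") as fh:
        fh.write(b"just text")
    gif_result = check_sig(gif_path)
    txt_result = check_sig(txt_path)

print(gif_result, txt_result)
```

Returning None for an unknown file keeps the caller free to decide how to count it.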
If you are just concerned with one directory of files, you might want
to read the help file on the glob module, along with os.path
(join/splitext/etc). Or just string methods...
>>> import glob
>>> import os.path
>>> TARGET = os.path.join(os.environ["USERPROFILE"],
...                       "documents/BW-conversion/*")
>>> files = glob.glob(TARGET)
>>> for fn in files:
...     fp, fx = os.path.splitext(fn)
...     print "File %s purports to be of type %s" % (fn, fx.upper())
...
File C:\Users\Wulfraed\documents/BW-conversion\BW-1.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\BW-2.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\BW-3.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\BW-4.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\BWConv.html purports to be of type .HTML
File C:\Users\Wulfraed\documents/BW-conversion\roo_b1.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b2.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b3.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b4.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b5.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_b6.jpg purports to be of type .JPG
File C:\Users\Wulfraed\documents/BW-conversion\roo_col.jpg purports to be of type .JPG
>>>
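Taking the session above one step further, the count the original question asked for could be done on extensions alone, roughly as follows. This is a sketch in Python 3 syntax (print is a function there, unlike in the session above); the WANTED set and the count_other_extensions name are assumptions, and it trusts the extension rather than the file contents.

```python
import glob
import os
import tempfile

WANTED = {".jpg", ".jpeg", ".gif"}  # assumed set of acceptable extensions

def count_other_extensions(directory, wanted=WANTED):
    """Count regular files in directory whose extension is not in wanted."""
    others = 0
    # Note: the "*" pattern does not match names that start with a dot.
    for fn in glob.glob(os.path.join(directory, "*")):
        if not os.path.isfile(fn):
            continue  # skip subdirectories
        ext = os.path.splitext(fn)[1].lower()  # compare case-insensitively
        if ext not in wanted:
            others += 1
    return others

# Demo on empty placeholder files in a temporary directory
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("BW-1.jpg", "roo_col.JPG", "anim.gif", "BWConv.html"):
        open(os.path.join(demo_dir, name), "wb").close()
    other_count = count_other_extensions(demo_dir)

print("Files that are neither GIF nor JPG by extension:", other_count)
```

If the extensions can't be trusted, the same loop would work with a signature test in place of the splitext comparison.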
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/