Groups > comp.lang.python > #28606 > unrolled thread
| Started by | Tigerstyle <laddosingh@gmail.com> |
|---|---|
| First post | 2012-09-06 07:56 -0700 |
| Last post | 2012-09-07 15:21 -0400 |
| Articles | 10 — 5 participants |
Function for examine content of directory Tigerstyle <laddosingh@gmail.com> - 2012-09-06 07:56 -0700
Re: Function for examine content of directory Ian Foote <ian@feete.org> - 2012-09-06 16:06 +0100
Re: Function for examine content of directory MRAB <python@mrabarnett.plus.com> - 2012-09-06 16:20 +0100
Re: Function for examine content of directory Tigerstyle <laddosingh@gmail.com> - 2012-09-06 13:26 -0700
Re: Function for examine content of directory Tigerstyle <laddosingh@gmail.com> - 2012-09-06 13:26 -0700
Re: Function for examine content of directory Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-09-06 17:26 -0400
Re: Function for examine content of directory Chris Angelico <rosuav@gmail.com> - 2012-09-07 07:48 +1000
Re: Function for examine content of directory Tigerstyle <laddosingh@gmail.com> - 2012-09-07 07:23 -0700
Re: Function for examine content of directory Tigerstyle <laddosingh@gmail.com> - 2012-09-07 07:28 -0700
Re: Function for examine content of directory Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-09-07 15:21 -0400
| From | Tigerstyle <laddosingh@gmail.com> |
|---|---|
| Date | 2012-09-06 07:56 -0700 |
| Subject | Function for examine content of directory |
| Message-ID | <feae1d5a-c477-4e21-8136-c14e38672cb9@googlegroups.com> |
Hi guys,
I'm trying to write a module containing a function to examine the contents of the current working directory and print out a count of how many files have each extension (".txt", ".doc", etc.)
This is the code so far:
--
import os

path = "v:\\workspace\\Python2_Homework03\\src\\"
dirs = os.listdir( path )
filenames = {"this.txt", "that.txt", "the_other.txt","this.doc","that.doc","this.pdf","first.txt","that.pdf"}
extensions = []
for filename in filenames:
    f = open(filename, "w")
    f.write("Some text\n")
    f.close()
    name , ext = os.path.splitext(f.name)
    extensions.append(ext)

# This would print all the files and directories
for file in dirs:
    print(file)

for ext in extensions:
    print("Count for %s: " %ext, extensions.count(ext))
--
When I'm trying to get the module to print how many files each extension has, it prints the count of each ext multiple times for each extension type. Like this:
this.pdf
the_other.txt
this.doc
that.txt
this.txt
that.pdf
first.txt
that.doc
Count for .pdf: 2
Count for .txt: 4
Count for .doc: 2
Count for .txt: 4
Count for .txt: 4
Count for .pdf: 2
Count for .txt: 4
Count for .doc: 2
Any help is appreciated.
T
| From | Ian Foote <ian@feete.org> |
|---|---|
| Date | 2012-09-06 16:06 +0100 |
| Message-ID | <mailman.307.1346944006.27098.python-list@python.org> |
| In reply to | #28606 |
On 06/09/12 15:56, Tigerstyle wrote:
> Hi guys,
>
> I'm trying to write a module containing a function to examine the contents of the current working directory and print out a count of how many files have each extension (".txt", ".doc", etc.)
>
> This is the code so far:
> --
> import os
>
> path = "v:\\workspace\\Python2_Homework03\\src\\"
> dirs = os.listdir( path )
> filenames = {"this.txt", "that.txt", "the_other.txt","this.doc","that.doc","this.pdf","first.txt","that.pdf"}
> extensions = []
Try using a set here instead of a list:
extensions = set()
> for filename in filenames:
>     f = open(filename, "w")
>     f.write("Some text\n")
>     f.close()
>     name , ext = os.path.splitext(f.name)
>     extensions.append(ext)
and use:
extensions.add(ext)
This should take care of duplicates for you.
Regards,
Ian
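One caveat on the set suggestion: a set removes the duplicates, but a set has no `count` method, so the final loop of the original code would also have to change. A minimal sketch, assuming the list is kept for counting and a set is used only to decide which extensions to report (the sample `extensions` list is made up for illustration):

```python
# Hypothetical sample data standing in for the extensions collected in the loop.
extensions = [".pdf", ".txt", ".doc", ".txt", ".txt", ".pdf", ".txt", ".doc"]

# Walk over the unique extensions, but count occurrences in the full list.
counts = {ext: extensions.count(ext) for ext in set(extensions)}

# Sort the keys so the report order is stable (set order is arbitrary).
for ext in sorted(counts):
    print("Count for %s: %d" % (ext, counts[ext]))
```

Each extension is now reported once, with the count still taken from the complete list.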
| From | MRAB <python@mrabarnett.plus.com> |
|---|---|
| Date | 2012-09-06 16:20 +0100 |
| Message-ID | <mailman.309.1346944827.27098.python-list@python.org> |
| In reply to | #28606 |
On 06/09/2012 15:56, Tigerstyle wrote:
> Hi guys,
>
> I'm trying to write a module containing a function to examine the contents of the current working directory and print out a count of how many files have each extension (".txt", ".doc", etc.)
>
> This is the code so far:
> --
> import os
>
> path = "v:\\workspace\\Python2_Homework03\\src\\"
> dirs = os.listdir( path )
> filenames = {"this.txt", "that.txt", "the_other.txt","this.doc","that.doc","this.pdf","first.txt","that.pdf"}
> extensions = []
> for filename in filenames:
>     f = open(filename, "w")
>     f.write("Some text\n")
>     f.close()
>     name , ext = os.path.splitext(f.name)
>     extensions.append(ext)
>
> # This would print all the files and directories
> for file in dirs:
>     print(file)
>
> for ext in extensions:
>     print("Count for %s: " %ext, extensions.count(ext))
>
> --
>
> When I'm trying to get the module to print how many files each extension has, it prints the count of each ext multiple times for each extension type. Like this:
>
> this.pdf
> the_other.txt
> this.doc
> that.txt
> this.txt
> that.pdf
> first.txt
> that.doc
> Count for .pdf: 2
> Count for .txt: 4
> Count for .doc: 2
> Count for .txt: 4
> Count for .txt: 4
> Count for .pdf: 2
> Count for .txt: 4
> Count for .doc: 2
>
That's because each extension can occur multiple times in the list.
Try the Counter class:
from collections import Counter

for ext, count in Counter(extensions).items():
    print("Count for %s: " % ext, count)
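For readers who have not used `Counter` before, here is a small self-contained sketch of what it does with a list like the one in the question. The sample `extensions` list is an assumption standing in for the real data; `most_common()` is used here only to get a predictable, highest-first order.

```python
from collections import Counter

# Hypothetical sample data standing in for the extensions collected in the loop.
extensions = [".pdf", ".txt", ".doc", ".txt", ".txt", ".pdf", ".txt", ".doc"]

# Counter maps each distinct item to the number of times it occurs.
counts = Counter(extensions)

# most_common() yields (extension, count) pairs, highest count first.
for ext, count in counts.most_common():
    print("Count for %s: %d" % (ext, count))
```

A `Counter` returns 0 for a key it has never seen, so `counts[".exe"]` does not raise `KeyError`.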
| From | Tigerstyle <laddosingh@gmail.com> |
|---|---|
| Date | 2012-09-06 13:26 -0700 |
| Message-ID | <08938014-e4e0-4f6d-b11a-e843ed4a0da3@googlegroups.com> |
| In reply to | #28609 |
Thanks, just what I was looking for :-)
T
On Thursday, 6 September 2012 at 17:20:27 UTC+2, MRAB wrote:
> On 06/09/2012 15:56, Tigerstyle wrote:
> > Hi guys,
> >
> > I'm trying to write a module containing a function to examine the contents of the current working directory and print out a count of how many files have each extension (".txt", ".doc", etc.)
> >
> > This is the code so far:
> > --
> > import os
> >
> > path = "v:\\workspace\\Python2_Homework03\\src\\"
> > dirs = os.listdir( path )
> > filenames = {"this.txt", "that.txt", "the_other.txt","this.doc","that.doc","this.pdf","first.txt","that.pdf"}
> > extensions = []
> > for filename in filenames:
> >     f = open(filename, "w")
> >     f.write("Some text\n")
> >     f.close()
> >     name , ext = os.path.splitext(f.name)
> >     extensions.append(ext)
> >
> > # This would print all the files and directories
> > for file in dirs:
> >     print(file)
> >
> > for ext in extensions:
> >     print("Count for %s: " %ext, extensions.count(ext))
> >
> > --
> >
> > When I'm trying to get the module to print how many files each extension has, it prints the count of each ext multiple times for each extension type. Like this:
> >
> > this.pdf
> > the_other.txt
> > this.doc
> > that.txt
> > this.txt
> > that.pdf
> > first.txt
> > that.doc
> > Count for .pdf: 2
> > Count for .txt: 4
> > Count for .doc: 2
> > Count for .txt: 4
> > Count for .txt: 4
> > Count for .pdf: 2
> > Count for .txt: 4
> > Count for .doc: 2
> >
> That's because each extension can occur multiple times in the list.
>
> Try the Counter class:
>
> from collections import Counter
>
> for ext, count in Counter(extensions).items():
>     print("Count for %s: " % ext, count)
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2012-09-06 17:26 -0400 |
| Message-ID | <mailman.331.1346967004.27098.python-list@python.org> |
| In reply to | #28606 |
On Thu, 6 Sep 2012 07:56:29 -0700 (PDT), Tigerstyle
<laddosingh@gmail.com> declaimed the following in
gmane.comp.python.general:
> extensions.append(ext)
>
Don't append an ext if it is already in the list...
if ext not in extensions: extensions.append(ext)
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/
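A note on the membership check: if it is applied to the same list that is later passed to `count`, every count comes out as 1. The sketch below therefore assumes a second list, here given the hypothetical name `unique_extensions`, which keeps the extensions in first-seen order while the full list is still used for counting.

```python
# Hypothetical sample data standing in for the extensions collected in the loop.
extensions = [".pdf", ".txt", ".doc", ".txt", ".txt", ".pdf", ".txt", ".doc"]

# Build a duplicate-free list, preserving the order of first appearance.
unique_extensions = []
for ext in extensions:
    if ext not in unique_extensions:
        unique_extensions.append(ext)

# Report each extension once; count against the full list.
for ext in unique_extensions:
    print("Count for %s: %d" % (ext, extensions.count(ext)))
```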
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2012-09-07 07:48 +1000 |
| Message-ID | <mailman.333.1346968110.27098.python-list@python.org> |
| In reply to | #28606 |
On Fri, Sep 7, 2012 at 12:56 AM, Tigerstyle <laddosingh@gmail.com> wrote:
> I'm trying to write a module containing a function to examine the contents of the current working directory and print out a count of how many files have each extension (".txt", ".doc", etc.)
If you haven't already, look into the Python 'dict' type; you may find
it easier to work with for this sort of job. You can map an extension
("txt") to its count (4) directly.
ChrisA
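The dict approach can be sketched as follows. This is only an illustration under assumptions: the file names are taken from the original post, and the `counts` name is made up. Note that `os.path.splitext` keeps the leading dot, so the keys are ".txt" rather than "txt".

```python
import os

# File names from the original post, used here as plain strings.
filenames = ["this.txt", "that.txt", "the_other.txt", "this.doc",
             "that.doc", "this.pdf", "first.txt", "that.pdf"]

counts = {}  # maps extension -> number of files seen with that extension
for filename in filenames:
    ext = os.path.splitext(filename)[1]
    # get() supplies 0 the first time an extension is seen.
    counts[ext] = counts.get(ext, 0) + 1

for ext in sorted(counts):
    print("Count for %s: %d" % (ext, counts[ext]))
```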
| From | Tigerstyle <laddosingh@gmail.com> |
|---|---|
| Date | 2012-09-07 07:23 -0700 |
| Message-ID | <1eb7dbea-cff5-4038-a441-d22707a4a7de@googlegroups.com> |
| In reply to | #28606 |
On Thursday, 6 September 2012 at 16:56:29 UTC+2, Tigerstyle wrote:
> Hi guys,
>
> I'm trying to write a module containing a function to examine the contents of the current working directory and print out a count of how many files have each extension (".txt", ".doc", etc.)
>
> This is the code so far:
> --
> import os
>
> path = "v:\\workspace\\Python2_Homework03\\src\\"
> dirs = os.listdir( path )
> filenames = {"this.txt", "that.txt", "the_other.txt","this.doc","that.doc","this.pdf","first.txt","that.pdf"}
> extensions = []
> for filename in filenames:
>     f = open(filename, "w")
>     f.write("Some text\n")
>     f.close()
>     name , ext = os.path.splitext(f.name)
>     extensions.append(ext)
>
> # This would print all the files and directories
> for file in dirs:
>     print(file)
>
> for ext in extensions:
>     print("Count for %s: " %ext, extensions.count(ext))
>
> --
>
> When I'm trying to get the module to print how many files each extension has, it prints the count of each ext multiple times for each extension type. Like this:
>
> this.pdf
> the_other.txt
> this.doc
> that.txt
> this.txt
> that.pdf
> first.txt
> that.doc
> Count for .pdf: 2
> Count for .txt: 4
> Count for .doc: 2
> Count for .txt: 4
> Count for .txt: 4
> Count for .pdf: 2
> Count for .txt: 4
> Count for .doc: 2
>
> Any help is appreciated.
>
> T
| From | Tigerstyle <laddosingh@gmail.com> |
|---|---|
| Date | 2012-09-07 07:28 -0700 |
| Message-ID | <c93f63c5-e34f-4747-ab11-c18f9252e997@googlegroups.com> |
| In reply to | #28606 |
Ok I'm now totally stuck.
This is the code:
---
import os
from collections import Counter

path = ":c\\mypath\dir"
dirs = os.listdir( path )
filenames = {"this.txt", "that.txt", "the_other.txt","this.doc","that.doc","this.pdf","first.txt","that.pdf"}
extensions = []
for filename in filenames:
    f = open(filename, "w")
    f.write("Some text\n")
    f.close()
    name , ext = os.path.splitext(f.name)
    extensions.append(ext)

# This would print all the files and directories
for file in dirs:
    print(file)

for ext, count in Counter(extensions).items():
    print("Count for %s: " % ext, count)
---
I need to make this module into a function and write a separate module to verify by testing that the function gives correct results.
Help and pointers are much appreciated.
T
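One possible shape for the function and its test, offered as a sketch rather than the expected homework answer. The names `count_extensions` and `test_count_extensions` are hypothetical, the function is assumed to return a `Counter` instead of printing, and the test builds its own files in a temporary directory so it does not depend on any real path.

```python
import os
import tempfile
from collections import Counter


def count_extensions(path="."):
    """Return a Counter mapping each file extension in `path` to its file count."""
    counts = Counter()
    for name in os.listdir(path):
        # Skip subdirectories and anything else that is not a regular file.
        if os.path.isfile(os.path.join(path, name)):
            counts[os.path.splitext(name)[1]] += 1
    return counts


def test_count_extensions():
    """Create known files in a temporary directory and check the counts."""
    filenames = ["this.txt", "that.txt", "the_other.txt", "this.doc",
                 "that.doc", "this.pdf", "first.txt", "that.pdf"]
    with tempfile.TemporaryDirectory() as tmp:
        for filename in filenames:
            with open(os.path.join(tmp, filename), "w") as f:
                f.write("Some text\n")
        # A directory with a file-like name must not be counted.
        os.mkdir(os.path.join(tmp, "subdir.txt"))
        result = count_extensions(tmp)
    assert dict(result) == {".txt": 4, ".doc": 2, ".pdf": 2}
```

In practice the test function would live in its own module, as the assignment asks, and import `count_extensions` from the module under test; both are shown together here only to keep the sketch self-contained.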
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2012-09-07 15:21 -0400 |
| Message-ID | <mailman.367.1347045722.27098.python-list@python.org> |
| In reply to | #28690 |
On Fri, 7 Sep 2012 07:28:03 -0700 (PDT), Tigerstyle
<laddosingh@gmail.com> declaimed the following in
gmane.comp.python.general:
> Ok I'm now totally stuck.
>
> This is the code:
>
This code is full of errors...
> ---
> import os
> from collections import Counter
>
> path = ":c\\mypath\dir"
Not a valid Windows path. The format should be "c:\mypath\dir"
(actually, to use \ you should probably declare it a raw string -- much
simpler, since all the python/OS functions don't care, is to use / -- as
in "c:/mypath/dir")
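The three spellings mentioned above can be compared directly. This sketch uses the standard `ntpath` module, which applies Windows path rules on any platform, purely to show that the forward-slash form normalizes to the same path; the path itself is the example one from the post.

```python
import ntpath  # Windows path handling, importable on any platform

raw = r"c:\mypath\dir"        # raw string: backslashes are kept literally
escaped = "c:\\mypath\\dir"   # ordinary string: every backslash is doubled
forward = "c:/mypath/dir"     # forward slashes instead of backslashes

# The raw and escaped spellings are the same string.
print(raw == escaped)

# Normalizing the forward-slash form under Windows rules gives the same path.
print(ntpath.normpath(forward) == raw)
```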
> dirs = os.listdir( path )
Warning, this will also list items that are not files (like
subdirectories). (hence "dirs" is a misleading name)
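A minimal sketch of filtering the listing down to regular files, assuming a hypothetical helper named `list_files`. `os.listdir` returns bare names, so each one has to be joined back onto the directory before `os.path.isfile` can check it.

```python
import os
import tempfile


def list_files(path):
    """Return the sorted names of regular files (not subdirectories) in `path`."""
    return sorted(
        name for name in os.listdir(path)
        if os.path.isfile(os.path.join(path, name))
    )


# Demonstration in a throwaway directory holding one file and one subdirectory.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "this.txt"), "w").close()
    os.mkdir(os.path.join(tmp, "src"))
    entries = sorted(os.listdir(tmp))  # includes the subdirectory
    files = list_files(tmp)            # regular files only

print(entries)
print(files)
```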
> filenames = {"this.txt", "that.txt", "the_other.txt","this.doc","that.doc","this.pdf","first.txt","that.pdf"}
> extensions = []
> for filename in filenames:
>     f = open(filename, "w")
>     f.write("Some text\n")
>     f.close()
>     name , ext = os.path.splitext(f.name)
>     extensions.append(ext)
>
> # This would print all the files and directories
> for file in dirs:
>     print(file)
This prints the file/directory /name/
NOTE: you grabbed the list of names BEFORE you created your test
data files, so...
>
>
>
> for ext, count in Counter(extensions).items():
>     print("Count for %s: " % ext, count)
>
... this is not really a count of files grouped by extension IN the
directory -- this is only the count based on the file names you defined
to be created.
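The ordering problem can be shown in a few lines: `os.listdir` returns a snapshot, so a listing taken before the files are written does not include them. This sketch assumes a temporary directory and two made-up file names. It also writes the files into the directory being listed, whereas the quoted code opens bare file names, which places them in the current working directory instead.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    before = os.listdir(tmp)  # snapshot taken before any file exists

    for filename in ("this.txt", "that.doc"):
        # Join with the directory so the file lands where we will look for it.
        with open(os.path.join(tmp, filename), "w") as f:
            f.write("Some text\n")

    after = sorted(os.listdir(tmp))  # snapshot taken after the files exist

print(before)
print(after)
```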
I'm not going to create test files, nor a test suite, and what I
have done is still too much... but...
-=-=-=-=-
import os
import collections
PATH = "e:/userdata/wulfraed/my documents/python progs"
fids = os.listdir(PATH)
fids.sort()
nmlen = max([len(f) for f in fids])
format = "%%%ss %%10s" % nmlen
cntr = collections.Counter()
for fid in fids:
    prefix, ext = os.path.splitext(fid)
    print format % (prefix, ext)
    cntr.update([ext])
print "\n\n"
for ext, cnt in cntr.items():
    print "%10s %10s" % (ext, cnt)
-=-=-=-=-
.project
.pydevproject
.settings
ABA .py
ADC .py
BookList .zip
CGIServer
DGen .py
DiskCatalog .py
DiskCatalog .pyc
Dload .py
Firearms .csv
GWhist .py
HTML .py
Hanoi .py
Hanoi .pyc
HierHead .py
Intervals .py
MBX_Split .py
MySQLTest .py
MySQLTest .pyc
MySQLdb .html
MySQLdb_files
NIM1 .py
NumberPrinter .py
PhotoFrame .py
Probability .py
ProgressBar .py
ProgressBar2 .py
RandomScores .py
SQL .py
SQLiteTest .py
SampleData .txt
SampleFormat .tsv
Script1 .py
Script2 .py
Script3 .py
Script3 .pyc
Sociable_Chain .py
Sociable_Chain .pyc
Stereo .py
TAGS .py
azel_interp .py
binadd .py
binadd2 .py
bsddb-test .py
cgiform .py
chessclock .py
counter .py
counterthread .py
cp .py
data .txt
databasetest .py
databasetest2 .py
dbfail .py
dbg .py
dbg .pyc
dbtst .py
dirwalk .py
execsub .py
extractor .py
filecnt .py
filter .py
fulldicttest .py
h2b .py
h2b .pyc
headers .py
highScore .py
htmlparse .py
i2b .py
i2b .pyc
infile1 .tsv
infile2 .tsv
infile3 .tsv
int2wrd .py
int2wrd .pyc
int2wrd2 .py
int2wrd2 .pyc
intervalfile .txt
invoice .csv
junk .py
justify .py
linkedlist .py
llist .py
main .py
make_ou_class .py
make_ou_class .pyc
mileage .py
minmax .py
mofn .py
mofn.py .zip
movefiles .py
moving .py
mptest1 .py
myhtmlparser .py
myhtmlparser .pyc
mytest .py
mytest .pyc
node .py
node .pyc
pcdtojpeg .py
pst .py
queens1 .py
queens2 .py
queens2.py .zip
query .py
railroad .py
rpg .py
run .py
s .txt
sample .tsv
scramble .py
scratch .db
script1 .html
script1 .sql
script2 .html
setuptools-0.6c6-py2.4 .egg
sgml .py
spam .py
sqltest .py
sqrot .py
src
sub .py
sub_p1 .py
sub_p3 .py
sudoku .py
sudoku.py .bak
sudoku .pyc
summup_dict1
summup_dict2
summup_dict2b
summup_dict3
summup_list
t .dat
t .py
tabspace .py
tabspace .pyc
tdriver .py
test .csd
test .db
test .sql
test .txt
testABA .py
testABA .pyc
tgsetup .py
thread .py
threadsample .py
threadswap .py
timetest .py
timing .py
trips .dat
update_log
ut_00 .py
wordprob .py
                    12
      .pyc         17
      .bak          1
      .sql          2
      .tsv          5
      .csv          2
       .db          2
      .dat          2
       .py         98
      .txt          5
     .html          3
      .csd          1
      .egg          1
      .zip          3
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/