| Started by | Xaxa Urtiz <urtizvereaxaxa@gmail.com> |
|---|---|
| First post | 2014-01-16 08:49 -0800 |
| Last post | 2014-01-16 18:45 +0000 |
| Articles | 7 — 4 participants |
Python glob and raw string Xaxa Urtiz <urtizvereaxaxa@gmail.com> - 2014-01-16 08:49 -0800
Re: Python glob and raw string Xaxa Urtiz <urtizvereaxaxa@gmail.com> - 2014-01-16 10:03 -0800
Re: Python glob and raw string Neil Cerutti <neilc@norwich.edu> - 2014-01-16 18:14 +0000
Re: Python glob and raw string Xaxa Urtiz <urtizvereaxaxa@gmail.com> - 2014-01-17 08:45 -0800
Re: Python glob and raw string Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-01-17 17:38 +0000
Re: Python glob and raw string Chris Angelico <rosuav@gmail.com> - 2014-01-17 05:19 +1100
Re: Python glob and raw string Neil Cerutti <neilc@norwich.edu> - 2014-01-16 18:45 +0000
| From | Xaxa Urtiz <urtizvereaxaxa@gmail.com> |
|---|---|
| Date | 2014-01-16 08:49 -0800 |
| Subject | Python glob and raw string |
| Message-ID | <6d3dd7d7-6836-4b55-9d1c-9e70e18b66fd@googlegroups.com> |
Hello everybody, I have a small problem. I wrote a script that looks for files in certain directories. My folders are typically organized like this:
[share]
folder1
->20131201
-->file1.xml
-->file2.txt
->20131202
-->file9696009.tmp
-->file421378932.xml
etc....
So basically, the share contains some folders (folder1, folder2, ...), each of which contains folders named by date (20131201, 20131202, 20131203, etc.), and inside those I want to find all the XML files.
What I've done is iterate over each folder1/2/3 that I want and look for the XML files in each one, like this:
for f in glob.glob(dir + r"\20140115\*.xml"):
    yield f
Here dir is folder1/2/3. That works fine, but I want to do something like this:
for i in range(10, 16):
    for f in glob.glob(dir + r"\201401{0}\*.xml".format(i)):
        yield f
but glob does not find any files (and of course there are some XML files there, and the old way found them).
Any help would be appreciated :)
| From | Xaxa Urtiz <urtizvereaxaxa@gmail.com> |
|---|---|
| Date | 2014-01-16 10:03 -0800 |
| Message-ID | <1d76e596-fa6e-44c2-aefd-391788a4fd0d@googlegroups.com> |
| In reply to | #64090 |
I feel stupid; my mistake. This works:
for i in range(1, 16):
    for f in glob.glob(dir + r"\201401{0:02}\*.xml".format(i)):
        yield f
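The visible difference in the working version is the `{0:02}` format spec, which zero-pads the day to two digits so that day 1 becomes `01` and matches a folder such as `20140101`. A minimal sketch of that effect follows; the helper name `day_patterns` and the sample base folder are assumptions for illustration, not part of the original script.

```python
# Without a width, single-digit days produce a name that cannot match
# an eight-digit date folder.
unpadded = "201401{0}".format(1)     # "2014011"
padded = "201401{0:02}".format(1)    # "20140101"


def day_patterns(base, first_day, last_day):
    """Build one glob pattern per day of January 2014 (hypothetical helper)."""
    return [base + r"\201401{0:02}\*.xml".format(day)
            for day in range(first_day, last_day + 1)]


if __name__ == "__main__":
    for pattern in day_patterns("folder1", 9, 10):
        print(pattern)
```

For two-digit days (10 and up) both format strings give the same text, so the padding only matters once the range includes days 1 through 9.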
| From | Neil Cerutti <neilc@norwich.edu> |
|---|---|
| Date | 2014-01-16 18:14 +0000 |
| Message-ID | <mailman.5596.1389896098.18130.python-list@python.org> |
| In reply to | #64090 |
I've done this two different ways. The simple way is very similar
to what you are now doing. It sucks because I have to manually
maintain the list of subdirectories to traverse every time I
create a new subdir.
Here's the other way, using glob and isdir from os.path, adapted
from actual production code.
import glob
import os

class Miner:
    def __init__(self, archive):
        # setup goes here; prepare to acquire the data
        self.descend(os.path.join(archive, '*'))

    def descend(self, path):
        # Recurse into directories; hand every other path to process().
        for fname in glob.glob(os.path.join(path, '*')):
            if os.path.isdir(fname):
                self.descend(fname)
            else:
                self.process(fname)

    def process(self, path):
        # Do what I want done with an actual file path.
        # This is where I add to the data.
        pass
In your case you might not want to process unless the path also
looks like an xml file.
mine = Miner('myxmldir')
Hmmm... I might be doing too much in __init__. ;)
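The suggestion above, to process a path only when it looks like an XML file, could be sketched as a plain generator. This swaps the recursive glob for `os.walk` plus `fnmatch.filter`, which is a different technique from the Miner class; the function name `iter_xml_files` is an assumption for illustration.

```python
import fnmatch
import os


def iter_xml_files(root):
    """Yield the path of every *.xml file at any depth under root.

    Hypothetical helper: os.walk does the descent, fnmatch.filter
    keeps only the names that match the pattern.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, "*.xml"):
            yield os.path.join(dirpath, name)


if __name__ == "__main__":
    for path in iter_xml_files("myxmldir"):
        print(path)
```

Unlike the class version, this also visits files that sit directly in the root directory.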
--
Neil Cerutti
| From | Xaxa Urtiz <urtizvereaxaxa@gmail.com> |
|---|---|
| Date | 2014-01-17 08:45 -0800 |
| Message-ID | <78b3f119-4b8b-4c4c-b54e-c9078736869c@googlegroups.com> |
| In reply to | #64096 |
I only have one level of subdirectories. The loop is just for the case where I don't want to process all the dates; otherwise I run a glob on '/*/*', so there is no need for any recursion.
Thanks for the answer!
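The one-level case described above can be sketched with a wildcard in the directory position of the pattern. The function name `xml_files_one_level` is an assumption for illustration; only `glob.glob` and `os.path.join` come from the standard library.

```python
import glob
import os


def xml_files_one_level(root):
    """Return the *.xml files exactly one directory below root.

    Hypothetical helper: the first '*' matches every date folder,
    the second part matches the XML files inside it.
    """
    return sorted(glob.glob(os.path.join(root, "*", "*.xml")))


if __name__ == "__main__":
    for path in xml_files_one_level("folder1"):
        print(path)
```

Files directly in root and files nested more deeply are not matched, which is why no recursion is needed here.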
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2014-01-17 17:38 +0000 |
| Message-ID | <mailman.5652.1389980335.18130.python-list@python.org> |
| In reply to | #64178 |
On 17/01/2014 16:45, Xaxa Urtiz wrote:
[masses of double spaced lines snipped]
Would you please read and action this
https://wiki.python.org/moin/GoogleGroupsPython
to prevent us seeing the double line spacing in your posts, thanks.
--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.
Mark Lawrence
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-01-17 05:19 +1100 |
| Message-ID | <mailman.5597.1389896390.18130.python-list@python.org> |
| In reply to | #64090 |
On Fri, Jan 17, 2014 at 5:14 AM, Neil Cerutti <neilc@norwich.edu> wrote:
> class Miner:
>     def __init__(self, archive):
>         # setup goes here; prepare to acquire the data
>         self.descend(os.path.join(archive, '*'))
>
>     def descend(self, path):
>         for fname in glob.glob(os.path.join(path, '*')):
>             if os.path.isdir(fname):
>                 self.descend(fname)
>             else:
>                 self.process(fname)
>
>     def process(self, path):
>         # Do what I want done with an actual file path.
>         # This is where I add to the data.
>
> In your case you might not want to process unless the path also
> looks like an xml file.
>
> mine = Miner('myxmldir')
>
> Hmmm... I might be doing too much in __init__. ;)
Hmm, why is it even a class? :) I guess you elided all the stuff that
makes it impractical to just use a non-class function.
ChrisA
| From | Neil Cerutti <neilc@norwich.edu> |
|---|---|
| Date | 2014-01-16 18:45 +0000 |
| Message-ID | <mailman.5598.1389897976.18130.python-list@python.org> |
| In reply to | #64090 |
On 2014-01-16, Chris Angelico <rosuav@gmail.com> wrote:
>> Hmmm... I might be doing too much in __init__. ;)
>
> Hmm, why is it even a class? :) I guess you elided all the
> stuff that makes it impractical to just use a non-class
> function.
I didn't remove anything that makes it obviously class-worthy, just timestamp checking, and several dicts and sets to store data.
The original version of that code is just a set of three functions, but the return result of that version was a single dict. Once the return value got complicated enough to require building up a class instance, it became a convenient place to hang the functions.
--
Neil Cerutti