| Started by | Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> |
|---|---|
| First post | 2015-03-29 04:32 -0700 |
| Last post | 2015-04-03 19:16 -0700 |
| Articles | 20 on this page of 24 — 9 participants |
Strategy/ Advice for How to Best Attack this Problem? Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> - 2015-03-29 04:32 -0700
Addendum to Strategy/ Advice for How to Best Attack this Problem? Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> - 2015-03-29 04:37 -0700
Re: Addendum to Strategy/ Advice for How to Best Attack this Problem? Peter Otten <__peter__@web.de> - 2015-03-29 14:33 +0200
Re: Addendum to Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-03-29 05:41 -0700
Re: Addendum to Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-04-01 07:17 -0700
Re: Addendum to Strategy/ Advice for How to Best Attack this Problem? Dave Angel <davea@davea.name> - 2015-03-29 09:27 -0400
Re: Strategy/ Advice for How to Best Attack this Problem? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-03-29 19:00 -0400
Re: Strategy/ Advice for How to Best Attack this Problem? Paul Rubin <no.email@nospam.invalid> - 2015-03-29 18:08 -0700
Re: Strategy/ Advice for How to Best Attack this Problem? Chris Angelico <rosuav@gmail.com> - 2015-03-30 13:04 +1100
Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-03-30 09:45 -0700
Re: Strategy/ Advice for How to Best Attack this Problem? Dave Angel <davea@davea.name> - 2015-03-30 14:35 -0400
Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-03-31 04:00 -0700
Re: Strategy/ Advice for How to Best Attack this Problem? Dave Angel <davea@davea.name> - 2015-03-31 09:19 -0400
Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-04-01 06:43 -0700
Re: Strategy/ Advice for How to Best Attack this Problem? Dave Angel <davea@davea.name> - 2015-04-01 19:51 -0400
Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-04-02 06:06 -0700
Re: Strategy/ Advice for How to Best Attack this Problem? Dave Angel <davea@davea.name> - 2015-04-02 17:10 -0400
Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-04-02 16:43 -0700
Re: Strategy/ Advice for How to Best Attack this Problem? Peter Otten <__peter__@web.de> - 2015-04-03 12:45 +0200
Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-04-03 06:40 -0700
Re: Strategy/ Advice for How to Best Attack this Problem? Dave Angel <d@davea.name> - 2015-04-03 07:59 -0400
Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-04-03 05:50 -0700
Re: Strategy/ Advice for How to Best Attack this Problem? Dave Angel <davea@davea.name> - 2015-04-03 16:21 -0400
Re: Strategy/ Advice for How to Best Attack this Problem? Rustom Mody <rustompmody@gmail.com> - 2015-04-03 19:16 -0700
| From | Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-03-29 04:32 -0700 |
| Subject | Strategy/ Advice for How to Best Attack this Problem? |
| Message-ID | <b803183b-6da5-4907-825d-9386b2bc2798@googlegroups.com> |
Below are the function's requirements. I am torn between using the OS module or some other quick and dirty module. In addition, my ideal assumption is that this could be cross-platform. "Records" refers to contents in a file. What are some suggestions from the Pythonistas?
* Monitors a folder for files that are dropped throughout the day
* When a file is dropped in the folder the program should scan the file
o IF all the records in the file have the same length
o THEN the file should be moved to a "success" folder and a text file written indicating the total number of records processed
o IF the file is empty OR the records are not all of the same length
o THEN the file should be moved to a "failure" folder and a text file written indicating the cause for failure (for example: Empty file or line 100 was not the same length as the rest).
| From | Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-03-29 04:37 -0700 |
| Subject | Addendum to Strategy/ Advice for How to Best Attack this Problem? |
| Message-ID | <c933b8de-15b6-45b2-9fa7-2011a3615921@googlegroups.com> |
| In reply to | #88262 |
On Sunday, March 29, 2015 at 7:33:04 AM UTC-4, Saran Ahluwalia wrote:
> Below are the function's requirements. I am torn between using the OS module or some other quick and dirty module. In addition, my ideal assumption that this could be cross-platform. "Records" refers to contents in a file. What are some suggestions from the Pythonistas?
>
> * Monitors a folder for files that are dropped throughout the day
>
> * When a file is dropped in the folder the program should scan the file
>
> o IF all the records in the file have the same length
>
> o THEN the file should be moved to a "success" folder and a text file written indicating the total number of records processed
>
> o IF the file is empty OR the records are not all of the same length
>
> o THEN the file should be moved to a "failure" folder and a text file written indicating the cause for failure (for example: Empty file or line 100 was not the same length as the rest).
Below are some functions that I have been playing around with. I am not sure how to create a functional program from each of these constituent parts. I could use decorators or simply pass a function within another function.
[code]
import time
import fnmatch
import os
import shutil

#If you want to write to a file, and if it doesn't exist, do this:
if not os.path.exists(filepath):
    f = open(filepath, 'w')

#If you want to read a file, and if it exists, do the following:
try:
    f = open(filepath)
except IOError:
    print 'I will be moving this to the '

#Changing a directory to "/home/newdir"
os.chdir("/home/newdir")

def move(src, dest):
    shutil.move(src, dest)

def fileinfo(file):
    filename = os.path.basename(file)
    rootdir = os.path.dirname(file)
    lastmod = time.ctime(os.path.getmtime(file))
    creation = time.ctime(os.path.getctime(file))
    filesize = os.path.getsize(file)

    print "%s**\t%s\t%s\t%s\t%s" % (rootdir, filename, lastmod, creation, filesize)

searchdir = r'D:\Your\Directory\Root'
matches = []

def search
    for root, dirnames, filenames in os.walk(searchdir):
        ## for filename in fnmatch.filter(filenames, '*.c'):
        for filename in filenames:
            ## matches.append(os.path.join(root, filename))
            ##print matches
            fileinfo(os.path.join(root, filename))

def get_files(src_dir):
    # traverse root directory, and list directories as dirs and files as files
    for root, dirs, files in os.walk(src_dir):
        path = root.split('/')
        for file in files:
            process(os.path.join(root, file))
            os.remove(os.path.join(root, file))

def del_dirs(src_dir):
    for dirpath, _, _ in os.walk(src_dir, topdown=False): # Listing the files
        if dirpath == src_dir:
            break
        try:
            os.rmdir(dirpath)
        except OSError as ex:
            print(ex)

def main():
    get_files(src_dir)
    del_dirs(src_dir)

if __name__ == "__main__":
    main()
[/code]
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2015-03-29 14:33 +0200 |
| Subject | Re: Addendum to Strategy/ Advice for How to Best Attack this Problem? |
| Message-ID | <mailman.307.1427632400.10327.python-list@python.org> |
| In reply to | #88263 |
Saran Ahluwalia wrote:
> On Sunday, March 29, 2015 at 7:33:04 AM UTC-4, Saran Ahluwalia wrote:
>> Below are the function's requirements. I am torn between using the OS
>> module or some other quick and dirty module. In addition, my ideal
>> assumption that this could be cross-platform. "Records" refers to
>> contents in a file. What are some suggestions from the Pythonistas?
>>
>> * Monitors a folder for files that are dropped throughout the day
>>
>> * When a file is dropped in the folder the program should scan the file
>>
>> o IF all the records in the file have the same length
>>
>> o THEN the file should be moved to a "success" folder and a text file
>> written indicating the total number of records processed
>>
>> o IF the file is empty OR the records are not all of the same length
>>
>> o THEN the file should be moved to a "failure" folder and a text file
>> written indicating the cause for failure (for example: Empty file or line
>> 100 was not the same length as the rest).
>
> Below are some functions that I have been playing around with. I am not
> sure how to create a functional program from each of these constituent
> parts. I could use decorators or simply pass a function within another
> function.
Throwing arbitrary code at a task in the hope that something sticks is not a
good approach. You already have given a clear description of the problem, so
start with that and try to "pythonize" it. Example:
def main():
    while True:
        files_to_check = get_files_in_monitored_folder()
        for file in files_to_check:
            if is_good(file):
                move_to_success_folder(file)
            else:
                move_to_failure_folder(file)
        wait_a_minute()

if __name__ == "__main__":
    main()
Then write bogus implementations for the building blocks:
def get_files_in_monitored_folder():
    return ["/foo/bar/ham", "/foo/bar/spam"]

def is_good(file):
    return file.endswith("/ham")

def move_to_failure_folder(file):
    print("failure", file)

def move_to_success_folder(file):
    print("success", file)

def wait_a_minute():
    raise SystemExit("bye") # we don't want to enter the loop while developing
Now successively replace the dummy functions with functions that do the right
thing. Test them individually (preferably using unit tests) so that when
you have completed them all and your program does not work like it should
you can be sure that there is a flaw in the main function.
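To make the "replace one dummy and test it" step concrete, here is a minimal sketch of what a real is_good() and its unit test might look like. It assumes a "record" is one line of text and that the trailing newline does not count towards its length; the thread never pins either of those down, and the test class and helper names are made up for illustration.

```python
import os
import tempfile
import unittest


def is_good(path):
    """Return True if the file is non-empty and every line has the same length.

    Assumption: one record per line, newline excluded from the length.
    """
    with open(path) as f:
        lengths = {len(line.rstrip("\n")) for line in f}
    # An empty file gives an empty set; mixed lengths give more than one value.
    return len(lengths) == 1


class IsGoodTest(unittest.TestCase):
    def check(self, text, expected):
        # Write the sample text to a temporary file and run is_good() on it.
        fd, path = tempfile.mkstemp()
        try:
            with os.fdopen(fd, "w") as f:
                f.write(text)
            self.assertEqual(is_good(path), expected)
        finally:
            os.remove(path)

    def test_equal_lengths(self):
        self.check("abc\ndef\n", True)

    def test_empty_file(self):
        self.check("", False)

    def test_mixed_lengths(self):
        self.check("abc\nde\n", False)
```

Saved as a module, the tests can be run with `python -m unittest <module name>`; the other dummies can be replaced and tested the same way, one at a time.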
> [code]
> import time
> import fnmatch
> import os
> import shutil
>
>
> #If you want to write to a file, and if it doesn't exist, do this:
Hm, are these your personal notes or is it an actual script?
> if not os.path.exists(filepath):
>     f = open(filepath, 'w')
> #If you want to read a file, and if it exists, do the following:
>
> try:
>     f = open(filepath)
> except IOError:
>     print 'I will be moving this to the '
>
>
> #Changing a directory to "/home/newdir"
> os.chdir("/home/newdir")
Never using os.chdir() is a good habit to get into.
> def move(src, dest):
>     shutil.move(src, dest)
>
> def fileinfo(file):
>     filename = os.path.basename(file)
>     rootdir = os.path.dirname(file)
>     lastmod = time.ctime(os.path.getmtime(file))
>     creation = time.ctime(os.path.getctime(file))
>     filesize = os.path.getsize(file)
>
>     print "%s**\t%s\t%s\t%s\t%s" % (rootdir, filename, lastmod, creation, filesize)
>
> searchdir = r'D:\Your\Directory\Root'
> matches = []
>
> def search
Every time you post code that doesn't even compile you lose some goodwill.
In a few lines of code meant to demonstrate a problem a typo may be
acceptable, but for something that you probably composed in an editor you
should take the time to run it and fix at least the syntax errors.
>     for root, dirnames, filenames in os.walk(searchdir):
>         ## for filename in fnmatch.filter(filenames, '*.c'):
>         for filename in filenames:
>             ## matches.append(os.path.join(root, filename))
>             ##print matches
>             fileinfo(os.path.join(root, filename))
>
>
> def get_files(src_dir):
>     # traverse root directory, and list directories as dirs and files as files
>     for root, dirs, files in os.walk(src_dir):
>         path = root.split('/')
>         for file in files:
>             process(os.path.join(root, file))
>             os.remove(os.path.join(root, file))
>
> def del_dirs(src_dir):
>     for dirpath, _, _ in os.walk(src_dir, topdown=False): # Listing the files
>         if dirpath == src_dir:
>             break
>         try:
>             os.rmdir(dirpath)
>         except OSError as ex:
>             print(ex)
>
>
> def main():
>     get_files(src_dir)
>     del_dirs(src_dir)
>
>
> if __name__ == "__main__":
>     main()
>
>
> [/code]
| From | Saran A <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-03-29 05:41 -0700 |
| Subject | Re: Addendum to Strategy/ Advice for How to Best Attack this Problem? |
| Message-ID | <03e661d2-db49-4d2b-8524-3f5b3a07d1b8@googlegroups.com> |
| In reply to | #88267 |
Thank you for the feedback - I have only been programming for 8 months now and am fairly new to best practice and what is considered acceptable in public forums. I appreciate the feedback on how to best address this problem.
| From | Saran A <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-04-01 07:17 -0700 |
| Subject | Re: Addendum to Strategy/ Advice for How to Best Attack this Problem? |
| Message-ID | <fc09b6ca-3207-4324-a821-eedd03feecc5@googlegroups.com> |
| In reply to | #88267 |
@PeterOtten
Thank you for the feedback - since your last correspondence I have come up with this (my most recent commit):
https://github.com/ahlusar1989/WGProjects/blob/master/P1version2.0withassumptions_mods.py
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2015-03-29 09:27 -0400 |
| Subject | Re: Addendum to Strategy/ Advice for How to Best Attack this Problem? |
| Message-ID | <mailman.308.1427635674.10327.python-list@python.org> |
| In reply to | #88263 |
On 03/29/2015 07:37 AM, Saran Ahluwalia wrote:
> On Sunday, March 29, 2015 at 7:33:04 AM UTC-4, Saran Ahluwalia wrote:
>> Below are the function's requirements. I am torn between using the OS module or some other quick and dirty module. In addition, my ideal assumption that this could be cross-platform. "Records" refers to contents in a file. What are some suggestions from the Pythonistas?
>>
>> * Monitors a folder for files that are dropped throughout the day
>>
>> * When a file is dropped in the folder the program should scan the file
>>
>> o IF all the records in the file have the same length
>>
>> o THEN the file should be moved to a "success" folder and a text file written indicating the total number of records processed
>>
>> o IF the file is empty OR the records are not all of the same length
>>
>> o THEN the file should be moved to a "failure" folder and a text file written indicating the cause for failure (for example: Empty file or line 100 was not the same length as the rest).
>
> Below are some functions that I have been playing around with. I am not sure how to create a functional program from each of these constituent parts. I could use decorators or simply pass a function within another function.
Your problem isn't complicated enough to either need function objects or
decorators. You might want to write a generator function, but even that
seems overkill for the problem as stated. Just write the code,
top-down, with dummy bodies containing stub code. Then fill it in from
bottom up, with unit tests for each completed function.
More complex problems can justify a different approach, but you don't
need to use every trick in the arsenal.
>
> [code]
> import time
> import fnmatch
> import os
> import shutil
>
If you have code fragments that aren't going to be used, don't write
them as top-level code. Either move them to another file, or at least
enclose them in a function with a name like dummy_do_not_use().
My own convention for that is to suffix the function name with a bunch
of uppercase ZZZ's. That way the name jumps out at me so I'll recognize
it, and I can be sure I'll never actually call it.
>
> #If you want to write to a file, and if it doesn't exist, do this:
>
> if not os.path.exists(filepath):
>     f = open(filepath, 'w')
>
> #If you want to read a file, and if it exists, do the following:
>
> try:
>     f = open(filepath)
> except IOError:
>     print 'I will be moving this to the '
>
>
> #Changing a directory to "/home/newdir"
> os.chdir("/home/newdir")
As Peter said, chdir can be very troublesome. Avoid at almost all
costs. As you've done elsewhere, use os.path.join() to combine
directory paths with relative filenames.
>
> def move(src, dest):
>     shutil.move(src, dest)
>
> def fileinfo(file):
>     filename = os.path.basename(file)
>     rootdir = os.path.dirname(file)
>     lastmod = time.ctime(os.path.getmtime(file))
>     creation = time.ctime(os.path.getctime(file))
>     filesize = os.path.getsize(file)
>
>     print "%s**\t%s\t%s\t%s\t%s" % (rootdir, filename, lastmod, creation, filesize)
>
> searchdir = r'D:\Your\Directory\Root'
> matches = []
>
> def search
>     for root, dirnames, filenames in os.walk(searchdir):
Why are you using a directory tree when your "spec" said the files would
be in a specific directory?
>         ## for filename in fnmatch.filter(filenames, '*.c'):
>         for filename in filenames:
>             ## matches.append(os.path.join(root, filename))
>             ##print matches
>             fileinfo(os.path.join(root, filename))
>
>
> def get_files(src_dir):
>     # traverse root directory, and list directories as dirs and files as files
>     for root, dirs, files in os.walk(src_dir):
>         path = root.split('/')
>         for file in files:
>             process(os.path.join(root, file))
>             os.remove(os.path.join(root, file))
Probably you shouldn't have os.remove in the code till the stuff around
it has been carefully tested. Besides, nothing in the spec says you're
going to remove any files.
>
> def del_dirs(src_dir):
>     for dirpath, _, _ in os.walk(src_dir, topdown=False): # Listing the files
>         if dirpath == src_dir:
>             break
>         try:
>             os.rmdir(dirpath)
>         except OSError as ex:
>             print(ex)
>
>
> def main():
>     get_files(src_dir)
>     del_dirs(src_dir)
>
Your description says "monitor". That implies to me an ongoing process,
or a loop. You probably want something like:
def main():
    while True:
        process_files(directory_name)
        sleep(10000)
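A slightly fuller sketch of such a loop, under some assumptions: process_files is passed in as a callable (it is not defined anywhere in the thread), the pause uses time.sleep(), which takes seconds (so 10000 would be almost three hours between passes), and max_passes is a made-up parameter that lets the loop be exercised in a test without running forever.

```python
import time


def monitor(directory_name, process_files, interval_seconds=60, max_passes=None):
    """Call process_files(directory_name) repeatedly, sleeping between passes.

    max_passes is a hypothetical testing aid; None means loop forever.
    Returns the number of passes completed.
    """
    passes = 0
    while max_passes is None or passes < max_passes:
        process_files(directory_name)
        passes += 1
        # Skip the final sleep when a pass limit has been reached.
        if max_passes is None or passes < max_passes:
            time.sleep(interval_seconds)
    return passes
```

In production one would call monitor("some/folder", process_files) with no pass limit; during development a small max_passes and interval_seconds=0 keep the feedback loop short.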
>
> if __name__ == "__main__":
> main()
>
>
> [/code]
>
--
DaveA
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2015-03-29 19:00 -0400 |
| Message-ID | <mailman.321.1427670045.10327.python-list@python.org> |
| In reply to | #88262 |
On Sun, 29 Mar 2015 04:32:48 -0700 (PDT), Saran Ahluwalia
<ahlusar.ahluwalia@gmail.com> declaimed the following:
>Below are the function's requirements. I am torn between using the OS module or some other quick and dirty module. In addition, my ideal assumption that this could be cross-platform. "Records" refers to contents in a file. What are some suggestions from the Pythonistas?
>
>* Monitors a folder for files that are dropped throughout the day
>
Other than spinning over a time.sleep() call, this is likely going to
be OS specific:
OS may provide a call-back/notification of when a directory is changed
OS may provide a scheduled task capability that would let your task
start at regular intervals -- you'd have to maintain some file identifying
the last processed time stamp (maybe). Linux/UNIX cron jobs, Windows task
scheduler
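For the scheduled-task variant, a rough sketch of the "file identifying the last processed time stamp" idea might look like the following. The function names and the stamp-file format (a single float) are assumptions for illustration, and the stamp file should live outside the monitored folder. Note that comparing modification times can miss a file that was copied in with an older, preserved mtime.

```python
import os


def read_stamp(stamp_path):
    """Return the saved timestamp, or 0.0 if no stamp file exists yet."""
    if not os.path.exists(stamp_path):
        return 0.0
    with open(stamp_path) as f:
        return float(f.read())


def write_stamp(stamp_path, timestamp):
    """Save the time of the current run for the next scheduled run."""
    with open(stamp_path, "w") as f:
        f.write(repr(timestamp))


def modified_since(directory, last_run):
    """List files in directory whose modification time is later than last_run."""
    result = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getmtime(path) > last_run:
            result.append(path)
    return result
```

Each scheduled run would read the stamp, process modified_since(folder, stamp), and then write a new stamp.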
>* When a file is dropped in the folder the program should scan the file
>
What happens if the file is still being written? Do you record the new
file in one pass, then delay and process those files the next pass
(recording new files for later processing)?
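One way to sketch that two-pass idea: remember each file's size on one pass and only hand it on for processing if the size is unchanged on the next pass. This is a heuristic, not a guarantee (a writer could simply be stalled), and the function name and return shape below are assumptions for illustration.

```python
import os


def stable_files(directory, previous_sizes):
    """Return (ready, current_sizes).

    ready lists paths whose size is unchanged since the previous pass;
    current_sizes maps file name to size and is passed back in next time.
    """
    current_sizes = {}
    ready = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        size = os.path.getsize(path)
        current_sizes[name] = size
        # A file seen for the first time is never ready on that pass.
        if previous_sizes.get(name) == size:
            ready.append(path)
    return ready, current_sizes
```

The caller starts with an empty dict and feeds the returned sizes back in on every pass, so a file is processed no earlier than its second sighting.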
>o IF all the records in the file have the same length
>
>o THEN the file should be moved to a "success" folder and a text file written indicating the total number of records processed
>
Text file? or maybe using the logging system?
>o IF the file is empty OR the records are not all of the same length
>
>o THEN the file should be moved to a "failure" folder and a text file written indicating the cause for failure (for example: Empty file or line 100 was not the same length as the rest).
How do you intend such a report to be determined? That is, what defines
the "correct length" of the lines? The first record read? If that one is
short, you're now reporting ALL OTHER lines as erroneous. Read all lines
building a histogram of lengths and taking the mode, then reprocessing to
flag lines that do not meet the mode?
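The histogram-and-mode approach can be sketched with collections.Counter. This assumes one record per line with the newline excluded from the length, and the function name is made up; ties between equally common lengths are not treated specially here.

```python
from collections import Counter


def find_odd_lines(lines):
    """Return (expected_length, line_numbers_that_differ).

    expected_length is the most common line length (the mode), or None
    for empty input. Line numbers start at 1.
    """
    lengths = [len(line.rstrip("\n")) for line in lines]
    if not lengths:
        return None, []
    # most_common(1) yields the single most frequent length and its count.
    expected, _count = Counter(lengths).most_common(1)[0]
    odd = [number for number, length in enumerate(lengths, start=1)
           if length != expected]
    return expected, odd
```

With this, a short first record is reported as the odd one out instead of causing every later line to be flagged.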
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/
| From | Paul Rubin <no.email@nospam.invalid> |
|---|---|
| Date | 2015-03-29 18:08 -0700 |
| Message-ID | <87a8yvs34u.fsf@jester.gateway.pace.com> |
| In reply to | #88262 |
Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> writes:
> cross-platform...
> * Monitors a folder for files that are dropped throughout the day
I don't see a cross-platform way to do that other than by waking up and scanning the folder every so often (once a minute, say). The Linux way is with inotify and there's a Python module for it (search terms: python inotify). There might be comparable but non-identical interfaces for other platforms.
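The portable "wake up and scan" approach boils down to comparing the folder's current listing with the names already seen. A minimal sketch using only os.listdir() follows; the function name and the idea of keeping the seen names in a set are assumptions for illustration.

```python
import os


def new_files(directory, already_seen):
    """Return the names in directory that are not in the already_seen set."""
    current = set(os.listdir(directory))
    return sorted(current - already_seen)
```

A caller would run this once a minute or so, handle whatever it returns, and then add those names to already_seen before the next scan.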
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-03-30 13:04 +1100 |
| Message-ID | <mailman.323.1427681070.10327.python-list@python.org> |
| In reply to | #88299 |
On Mon, Mar 30, 2015 at 12:08 PM, Paul Rubin <no.email@nospam.invalid> wrote:
> Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> writes:
>> cross-platform...
>> * Monitors a folder for files that are dropped throughout the day
>
> I don't see a cross-platform way to do that other than by waking up and
> scanning the folder every so often (once a minute, say). The Linux way
> is with inotify and there's a Python module for it (search terms: python
> inotify). There might be comparable but non-identical interfaces for
> other platforms.
All too often, "cross-platform" means probing for one option, then another, then another, and using whichever one you can. On Windows, there's FindFirstChangeNotification and ReadDirectoryChanges, which Tim Golden wrote about, and which I coded up into a teleporter for getting files out of a VM automatically:
http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
https://github.com/Rosuav/shed/blob/master/senddir.py
ChrisA
| From | Saran A <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-03-30 09:45 -0700 |
| Message-ID | <bc19cb8e-9e4f-48a7-a9f5-4b39db09a7ce@googlegroups.com> |
| In reply to | #88302 |
On Sunday, March 29, 2015 at 10:04:45 PM UTC-4, Chris Angelico wrote:
> On Mon, Mar 30, 2015 at 12:08 PM, Paul Rubin <no.email@nospam.invalid> wrote:
> > Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> writes:
> >> cross-platform...
> >> * Monitors a folder for files that are dropped throughout the day
> >
> > I don't see a cross-platform way to do that other than by waking up and
> > scanning the folder every so often (once a minute, say). The Linux way
> > is with inotify and there's a Python module for it (search terms: python
> > inotify). There might be comparable but non-identical interfaces for
> > other platforms.
>
> All too often, "cross-platform" means probing for one option, then
> another, then another, and using whichever one you can. On Windows,
> there's FindFirstChangeNotification and ReadDirectoryChanges, which
> Tim Golden wrote about, and which I coded up into a teleporter for
> getting files out of a VM automatically:
>
> http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
> https://github.com/Rosuav/shed/blob/master/senddir.py
>
> ChrisA
@Dave, Chris, Paul and Dennis: Thank you for resources and the notes regarding what I should keep in mind. I have an initial commit:
https://github.com/ahlusar1989/IntroToPython/blob/master/Project1WG_with_assumptions_and_comments.py
I welcome your thoughts on this
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2015-03-30 14:35 -0400 |
| Message-ID | <mailman.344.1427740538.10327.python-list@python.org> |
| In reply to | #88334 |
On 03/30/2015 12:45 PM, Saran A wrote:
> On Sunday, March 29, 2015 at 10:04:45 PM UTC-4, Chris Angelico wrote:
>> On Mon, Mar 30, 2015 at 12:08 PM, Paul Rubin <no.email@nospam.invalid> wrote:
>>> Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> writes:
>>>> cross-platform...
>>>> * Monitors a folder for files that are dropped throughout the day
>>>
>>> I don't see a cross-platform way to do that other than by waking up and
>>> scanning the folder every so often (once a minute, say). The Linux way
>>> is with inotify and there's a Python module for it (search terms: python
>>> inotify). There might be comparable but non-identical interfaces for
>>> other platforms.
>>
>> All too often, "cross-platform" means probing for one option, then
>> another, then another, and using whichever one you can. On Windows,
>> there's FindFirstChangeNotification and ReadDirectoryChanges, which
>> Tim Golden wrote about, and which I coded up into a teleporter for
>> getting files out of a VM automatically:
>>
>> http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
>> https://github.com/Rosuav/shed/blob/master/senddir.py
>>
>> ChrisA
>
> @Dave, Chris, Paul and Dennis: Thank you for resources and the notes regarding what I should keep in mind. I have an initial commit: https://github.com/ahlusar1989/IntroToPython/blob/master/Project1WG_with_assumptions_and_comments.py
>
> I welcome your thoughts on this
>

It's missing a number of your requirements. But it's a start.

If it were my file, I'd have a TODO comment at the bottom stating known
changes that are needed. In it, I'd mention:

1) your present code is assuming all filenames come directly from the
commandline. No searching of a directory.

2) your present code does not move any files to success or failure
directories

3) your present code doesn't calculate or write to a text file any
statistics.

4) your present code runs once through the names, and terminates. It
doesn't "monitor" anything.

5) your present code doesn't check for zero-length files

I'd also wonder why you bother checking whether the
os.path.getsize(file) function returns the same value as the os.SEEK_END
and ftell() code does. Is it that you don't trust the library? Or that
you have to run on Windows, where the line-ending logic can change the
apparent file size?

I notice you're not specifying a file mode on the open. So in Python 3,
your sizes are going to be specified in unicode characters after
decoding. Is that what the spec says? It's probably safer to
explicitly specify the mode (and the file encoding if you're in text).

I see you call strip() before comparing the length. Could there ever be
leading or trailing whitespace that's significant? Is that the actual
specification of line size?

--
DaveA
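The point about file modes can be made concrete. Below is a minimal sketch, assuming Python 3 and a UTF-8 encoded file, of how the byte count from os.path.getsize() can differ from the number of characters read in text mode. The helper name `sizes` is hypothetical and is not taken from the code under review.

```python
import os


def sizes(path, encoding="utf-8"):
    """Return (size in bytes on disk, size in characters after decoding)."""
    byte_size = os.path.getsize(path)
    # newline="" turns off newline translation, so any difference between
    # the two numbers comes from decoding alone.
    with open(path, "r", encoding=encoding, newline="") as handle:
        char_size = len(handle.read())
    return byte_size, char_size
```

For a file that contains only ASCII text the two numbers agree; any multi-byte character makes the byte count larger, which is why the spec's definition of "size" matters.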
| From | Saran A <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-03-31 04:00 -0700 |
| Message-ID | <0623f75c-bf93-4a0f-9d87-86986185cdc3@googlegroups.com> |
| In reply to | #88339 |
On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote:
> On 03/30/2015 12:45 PM, Saran A wrote:
> > On Sunday, March 29, 2015 at 10:04:45 PM UTC-4, Chris Angelico wrote:
> >> On Mon, Mar 30, 2015 at 12:08 PM, Paul Rubin <no.email@nospam.invalid> wrote:
> >>> Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> writes:
> >>>> cross-platform...
> >>>> * Monitors a folder for files that are dropped throughout the day
> >>>
> >>> I don't see a cross-platform way to do that other than by waking up and
> >>> scanning the folder every so often (once a minute, say). The Linux way
> >>> is with inotify and there's a Python module for it (search terms: python
> >>> inotify). There might be comparable but non-identical interfaces for
> >>> other platforms.
> >>
> >> All too often, "cross-platform" means probing for one option, then
> >> another, then another, and using whichever one you can. On Windows,
> >> there's FindFirstChangeNotification and ReadDirectoryChanges, which
> >> Tim Golden wrote about, and which I coded up into a teleporter for
> >> getting files out of a VM automatically:
> >>
> >> http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
> >> https://github.com/Rosuav/shed/blob/master/senddir.py
> >>
> >> ChrisA
> >
> > @Dave, Chris, Paul and Dennis: Thank you for resources and the notes regarding what I should keep in mind. I have an initial commit: https://github.com/ahlusar1989/IntroToPython/blob/master/Project1WG_with_assumptions_and_comments.py
> >
> > I welcome your thoughts on this
> >
>
> It's missing a number of your requirements. But it's a start.
>
> If it were my file, I'd have a TODO comment at the bottom stating known
> changes that are needed. In it, I'd mention:
>
> 1) your present code is assuming all filenames come directly from the
> commandline. No searching of a directory.
>
> 2) your present code does not move any files to success or failure
> directories
>
> 3) your present code doesn't calculate or write to a text file any
> statistics.
>
> 4) your present code runs once through the names, and terminates. It
> doesn't "monitor" anything.
>
> 5) your present code doesn't check for zero-length files
>
> I'd also wonder why you bother checking whether the
> os.path.getsize(file) function returns the same value as the os.SEEK_END
> and ftell() code does. Is it that you don't trust the library? Or that
> you have to run on Windows, where the line-ending logic can change the
> apparent file size?
>
> I notice you're not specifying a file mode on the open. So in Python 3,
> your sizes are going to be specified in unicode characters after
> decoding. Is that what the spec says? It's probably safer to
> explicitly specify the mode (and the file encoding if you're in text).
>
> I see you call strip() before comparing the length. Could there ever be
> leading or trailing whitespace that's significant? Is that the actual
> specification of line size?
>
> --
> DaveA

@DaveA: This is a homework assignment. I am not familiar with best practice for writing to files, appending to folders and searching a directory (I have only been programming for 8 months). Is it possible that you could provide me with some snippets or guidance on where to place your suggestions (for your TO DOs 2,3,4,5)?

I ask this because I have been searching fruitlessly for some time, and there are so many permutations that I am bamboozled as to which is considered best practice.

Moreover, as to the other comments, those are too specific. The scope of the assignment is very limited, but I am learning what I need to look out for, and what questions to ask about specs, in the future.
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2015-03-31 09:19 -0400 |
| Message-ID | <mailman.371.1427807964.10327.python-list@python.org> |
| In reply to | #88372 |
On 03/31/2015 07:00 AM, Saran A wrote:
> @DaveA: This is a homework assignment. .... Is it possible that you
> could provide me with some snippets or guidance on where to place your
> suggestions (for your TO DOs 2,3,4,5)?
>
> On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote:
>>
>> It's missing a number of your requirements. But it's a start.
>>
>> If it were my file, I'd have a TODO comment at the bottom stating known
>> changes that are needed. In it, I'd mention:
>>
>> 1) your present code is assuming all filenames come directly from the
>> commandline. No searching of a directory.
>>
>> 2) your present code does not move any files to success or failure
>> directories
>>
In function validate_files()
Just after the line
print('success with %s on %d reco...
you could move the file, using shutil. Likewise after the failure print.
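
A minimal sketch of that move, assuming Python 3. The helper name `move_to` and the idea of creating the destination directory on demand are illustrative assumptions, not part of the assignment or of the code under review.

```python
import os
import shutil


def move_to(path, dest_dir):
    """Move the file at path into dest_dir and return its new path."""
    # Create the destination directory if it does not exist yet.
    os.makedirs(dest_dir, exist_ok=True)
    target = os.path.join(dest_dir, os.path.basename(path))
    shutil.move(path, target)
    return target
```

In validate_files() this could be called as `move_to(name, "success")` after the success print and `move_to(name, "failure")` after the failure print, where both directory names are placeholders.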
>> 3) your present code doesn't calculate or write to a text file any
>> statistics.
You successfully print to sys.stderr. So you could print to some other
file in the exact same way.
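
For instance, print() accepts a file= argument, so the same call works for sys.stderr or for any open text file. The sketch below assumes Python 3; the function name `report` and the message format are hypothetical.

```python
import sys


def report(filename, record_count, ok, out=sys.stderr):
    """Print one status line to out, which may be any writable text file."""
    status = "success" if ok else "failure"
    print("%s with %s on %d records" % (status, filename, record_count), file=out)
```

Passing `out=open("stats.txt", "a")` (or better, a file opened in a with block) sends the same line to a statistics file instead of the console.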
>>
>> 4) your present code runs once through the names, and terminates. It
>> doesn't "monitor" anything.
Make a new function, perhaps called main(), with a loop that calls
validate_files(), with a sleep after each pass. Of course, unless you
fix TODO#1, that'll keep looking for the same files. No harm in that if
that's the spec, since you moved the earlier versions of the files.
But if you want to "monitor" the directory, let the directory name be
the argument to main, and let main do a dirlist each time through the
loop, and pass the corresponding list to validate_files.
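
A minimal sketch of such a polling loop, assuming Python 3. The names `list_files` and `monitor` are hypothetical, `process` stands in for validate_files(), and the `max_passes` parameter is an addition that lets the loop be stopped for testing.

```python
import os
import time


def list_files(directory):
    """Return the regular files currently in directory, as full paths."""
    names = sorted(os.listdir(directory))
    paths = [os.path.join(directory, name) for name in names]
    return [path for path in paths if os.path.isfile(path)]


def monitor(directory, process, delay=5.0, max_passes=None):
    """Call process(paths) once per pass; run forever if max_passes is None."""
    passes = 0
    while max_passes is None or passes < max_passes:
        # Take a fresh directory listing on every pass, so new files are seen.
        process(list_files(directory))
        passes += 1
        time.sleep(delay)
```

Because processed files are moved out of the watched directory, each pass only sees files that arrived since the previous one.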
>>
>> 5) your present code doesn't check for zero-length files
>>
In validate_and_process_data(), instead of checking filesize against
ftell, check it against zero.
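
As a sketch, that check is a single comparison. The helper name `is_empty` is hypothetical.

```python
import os


def is_empty(path):
    """Return True if the file at path has a size of zero bytes."""
    return os.path.getsize(path) == 0
```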
>> I'd also wonder why you bother checking whether the
>> os.path.getsize(file) function returns the same value as the os.SEEK_END
>> and ftell() code does. Is it that you don't trust the library? Or that
>> you have to run on Windows, where the line-ending logic can change the
>> apparent file size?
>>
>> I notice you're not specifying a file mode on the open. So in Python 3,
>> your sizes are going to be specified in unicode characters after
>> decoding. Is that what the spec says? It's probably safer to
>> explicitly specify the mode (and the file encoding if you're in text).
>>
>> I see you call strip() before comparing the length. Could there ever be
>> leading or trailing whitespace that's significant? Is that the actual
>> specification of line size?
>>
>> --
>> DaveA
>
>
> I ask this because I have been searching fruitlessly through for some time and there are so many permutations that I am bamboozled by which is considered best practice.
>
> Moreover, as to the other comments, those are too specific. The scope of the assignment is very limited, but I am learning what I need to look out or ask questions regarding specs - in the future.
>
--
DaveA
| From | Saran A <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-04-01 06:43 -0700 |
| Message-ID | <fcd8bbbc-2d81-4797-aa0e-090683e72f3f@googlegroups.com> |
| In reply to | #88381 |
On Tuesday, March 31, 2015 at 9:19:37 AM UTC-4, Dave Angel wrote:
> On 03/31/2015 07:00 AM, Saran A wrote:
>
> > @DaveA: This is a homework assignment. .... Is it possible that you
> could provide me with some snippets or guidance on where to place your
> suggestions (for your TO DOs 2,3,4,5)?
> >
>
>
> > On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote:
>
> >>
> >> It's missing a number of your requirements. But it's a start.
> >>
> >> If it were my file, I'd have a TODO comment at the bottom stating known
> >> changes that are needed. In it, I'd mention:
> >>
> >> 1) your present code is assuming all filenames come directly from the
> >> commandline. No searching of a directory.
> >>
> >> 2) your present code does not move any files to success or failure
> >> directories
> >>
>
> In function validate_files()
> Just after the line
> print('success with %s on %d reco...
> you could move the file, using shutil. Likewise after the failure print.
>
> >> 3) your present code doesn't calculate or write to a text file any
> >> statistics.
>
> You successfully print to sys.stderr. So you could print to some other
> file in the exact same way.
>
> >>
> >> 4) your present code runs once through the names, and terminates. It
> >> doesn't "monitor" anything.
>
> Make a new function, perhaps called main(), with a loop that calls
> validate_files(), with a sleep after each pass. Of course, unless you
> fix TODO#1, that'll keep looking for the same files. No harm in that if
> that's the spec, since you moved the earlier versions of the files.
>
> But if you want to "monitor" the directory, let the directory name be
> the argument to main, and let main do a dirlist each time through the
> loop, and pass the corresponding list to validate_files.
>
> >>
> >> 5) your present code doesn't check for zero-length files
> >>
>
> In validate_and_process_data(), instead of checking filesize against
> ftell, check it against zero.
>
> >> I'd also wonder why you bother checking whether the
> >> os.path.getsize(file) function returns the same value as the os.SEEK_END
> >> and ftell() code does. Is it that you don't trust the library? Or that
> >> you have to run on Windows, where the line-ending logic can change the
> >> apparent file size?
> >>
> >> I notice you're not specifying a file mode on the open. So in Python 3,
> >> your sizes are going to be specified in unicode characters after
> >> decoding. Is that what the spec says? It's probably safer to
> >> explicitly specify the mode (and the file encoding if you're in text).
> >>
> >> I see you call strip() before comparing the length. Could there ever be
> >> leading or trailing whitespace that's significant? Is that the actual
> >> specification of line size?
> >>
> >> --
> >> DaveA
> >
> >
>
> > I ask this because I have been searching fruitlessly through for some time and there are so many permutations that I am bamboozled by which is considered best practice.
> >
> > Moreover, as to the other comments, those are too specific. The scope of the assignment is very limited, but I am learning what I need to look out or ask questions regarding specs - in the future.
> >
>
>
> --
> DaveA
@DaveA
My most recent commit (https://github.com/ahlusar1989/WGProjects/blob/master/P1version2.0withassumptions_mods.py) has more annotations and comments for each file.
I have attempted to address the functional requirements that you brought up:
1) Previously, my code assumed that all filenames came directly from the command line, with no searching of a directory. I think that I have addressed this.
2) My present code does not move any files to success or failure directories (requirements for this assignment). I am still wondering if and how I should use shutil like you advised me to. I keep receiving a syntax error when I add it below the print statement.
3) You correctly reminded me that my present code doesn't calculate any statistics, or write them (or the cause of any error) to a text file. (Should I use the copy or copy2 method in order to provide metadata? If so, should I wrap it in try/except logic?)
4) Previously, my code ran once through the names and terminated. It didn't "monitor" anything. I think I have addressed this with the main function - correct?
5) Previously, my code didn't check for zero-length files. I have added a comment there in case that is needed.
I really appreciate your invaluable feedback. I have grown a lot with this assignment!
Sincerely,
Saran
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2015-04-01 19:51 -0400 |
| Message-ID | <mailman.16.1427932335.12925.python-list@python.org> |
| In reply to | #88423 |
On 04/01/2015 09:43 AM, Saran A wrote:
> On Tuesday, March 31, 2015 at 9:19:37 AM UTC-4, Dave Angel wrote:
>> On 03/31/2015 07:00 AM, Saran A wrote:
>>
>> > @DaveA: This is a homework assignment. .... Is it possible that you
>> could provide me with some snippets or guidance on where to place your
>> suggestions (for your TO DOs 2,3,4,5)?
>> >
>>
>>
>>> On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote:
>>
>>>>
>>>> It's missing a number of your requirements. But it's a start.
>>>>
>>>> If it were my file, I'd have a TODO comment at the bottom stating known
>>>> changes that are needed. In it, I'd mention:
>>>>
>>>> 1) your present code is assuming all filenames come directly from the
>>>> commandline. No searching of a directory.
>>>>
>>>> 2) your present code does not move any files to success or failure
>>>> directories
>>>>
>>
>> In function validate_files()
>> Just after the line
>> print('success with %s on %d reco...
>> you could move the file, using shutil. Likewise after the failure print.
>>
>>>> 3) your present code doesn't calculate or write to a text file any
>>>> statistics.
>>
>> You successfully print to sys.stderr. So you could print to some other
>> file in the exact same way.
>>
>>>>
>>>> 4) your present code runs once through the names, and terminates. It
>>>> doesn't "monitor" anything.
>>
>> Make a new function, perhaps called main(), with a loop that calls
>> validate_files(), with a sleep after each pass. Of course, unless you
>> fix TODO#1, that'll keep looking for the same files. No harm in that if
>> that's the spec, since you moved the earlier versions of the files.
>>
>> But if you want to "monitor" the directory, let the directory name be
>> the argument to main, and let main do a dirlist each time through the
>> loop, and pass the corresponding list to validate_files.
>>
>>>>
>>>> 5) your present code doesn't check for zero-length files
>>>>
>>
>> In validate_and_process_data(), instead of checking filesize against
>> ftell, check it against zero.
>>
>>>> I'd also wonder why you bother checking whether the
>>>> os.path.getsize(file) function returns the same value as the os.SEEK_END
>>>> and ftell() code does. Is it that you don't trust the library? Or that
>>>> you have to run on Windows, where the line-ending logic can change the
>>>> apparent file size?
>>>>
>>>> I notice you're not specifying a file mode on the open. So in Python 3,
>>>> your sizes are going to be specified in unicode characters after
>>>> decoding. Is that what the spec says? It's probably safer to
>>>> explicitly specify the mode (and the file encoding if you're in text).
>>>>
>>>> I see you call strip() before comparing the length. Could there ever be
>>>> leading or trailing whitespace that's significant? Is that the actual
>>>> specification of line size?
>>>>
>>>> --
>>>> DaveA
>>>
>>>
>>
>>> I ask this because I have been searching fruitlessly through for some time and there are so many permutations that I am bamboozled by which is considered best practice.
>>>
>>> Moreover, as to the other comments, those are too specific. The scope of the assignment is very limited, but I am learning what I need to look out or ask questions regarding specs - in the future.
>>>
>>
>>
>> --
>> DaveA
>
> @DaveA
>
> My most recent commit (https://github.com/ahlusar1989/WGProjects/blob/master/P1version2.0withassumptions_mods.py) has more annotations and comments for each file.
Perhaps you don't realize how github works. The whole point is it
preserves the history of your code, and you use the same filename for
each revision.
Or possibly it's I that doesn't understand it. I use git, but haven't
actually used github for my own code.
>
> I have attempted to address the functional requirements that you brought up:
>
> 1) Before, my present code was assuming all filenames come directly from the commandline. No searching of a directory. I think that I have addressed this.
>
Have you even tried to run the code? It quits immediately with an
exception since your call to main() doesn't pass any arguments, and main
requires one.
> def main(dirslist):
>     while True:
>         for file in dirslist:
>             return validate_files(file)
>         time.sleep(5)
In addition, you aren't actually doing anything to find what the files
in the directory are. I tried to refer to dirlist, as a hint. A
stronger hint: look up os.listdir()
And that list of files has to change each time through the while loop,
that's the whole meaning of scanning. You don't just grab the names
once, you look to see what's there.
The next thing is that you're using a variable called 'file', while
that's a built-in type in Python 2, so your loop variable shadows it.
You really want to use a different name.
Next, you have a loop through the magical dirslist to get individual
filenames, but then you call the validate_files() function with a single
file, but that function is expecting a list of filenames. One or the
other has to change.
Next, you return from main after validating the first file, so no others
will get processed.
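
The effect of that early return can be seen in a small sketch. Both function names are hypothetical and `check` stands in for the per-file validation.

```python
def first_only(names, check):
    """Buggy pattern: the return ends the function during the first iteration."""
    for name in names:
        return check(name)


def check_all(names, check):
    """Fixed pattern: collect a result for every name, then return once."""
    results = {}
    for name in names:
        results[name] = check(name)
    return results
```

With two names, `first_only` only ever calls `check` on the first one, while `check_all` visits both.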
> 2) My present code does not move any files to success or failure directories (requirements for this assignment1). I am still wondering if and how I should use shututil() like you advised me to. I keep receiving a syntax error when declaring this below the print statement.
You had a function to do that in your very first post to this thread.
Have you forgotten it already? As for syntax errors, you can't expect
any help on that when you don't show any code nor the syntax error
traceback. Remember that when a syntax error is shown, it's frequently
the previous line or two that actually was wrong.
>
> 3) You correctly reminded me that my present code doesn't calculate or write to a text file any statistics or errors for the cause of the error.
> #I wrote an error class in case the I needed such a specification for a notifcation - this is not in the requirements
> class ErrorHandler:
>
>     def __init__(self):
>         pass
>
>     def write(self, string):
>
>         # write error to file
>         fname = " Error report (" + time.strftime("%Y-%m-%d %I-%M%p") + ").txt"
>         handler = open(fname, "w")
>         handler.write(string)
>         handler.close()
This looks like Java code. With no data attributes, and only one
method(function), what's the reason you don't just use a function?
original_stderr = sys.stderr # stored in case want to revert
sys.stderr = ErrorHandler()
This is just bogus. There are rare times when a programmer might want
to replace stderr, but that's just unnecessarily confusing in a program
like this one. When you want to write printable data to another file,
just use that file's handle as the argument to the file= argument of
print(). You could use an instance of ErrorHandler as a file handle.
Or do it one of a dozen other straightforward ways.
While we're at it, why do you keep creating new files for your error
report? And overwriting all the old messages with whatever new ones
come in the same minute?
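
One of those straightforward ways, sketched under the assumption of Python 3: a plain function that appends to a single log file, so earlier messages are kept. The name `log_error`, the default file name and the timestamp format are all hypothetical.

```python
import time


def log_error(message, log_path="errors.log"):
    """Append one timestamped message to the error log."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    # Mode "a" appends, so messages written in the same minute do not
    # overwrite each other the way mode "w" would.
    with open(log_path, "a", encoding="utf-8") as log:
        print("%s %s" % (stamp, message), file=log)
```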
> (Should I use the copy or copy2 method in order provide metadata? If so, should I wrap it into a try and except logic?)
No idea what you mean here. What metadata? And what do copy and copy2
have to do with anything here?
>
> 4) Before, my present code runs once through the names, and terminates. It doesn't "monitor" anything. I think I have addressed this with the main function - correct?
>
Not even close.
> 5) Before, my present code doesn't check for zero-length files - I have added a comment there in case that is needed)
>
> I realize appreciate your invaluable feedback. I have grown a lot with this assignment!
>
> Sincerely,
>
> Saran
>
--
DaveA
| From | Saran A <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-04-02 06:06 -0700 |
| Message-ID | <ba29c06d-3cdc-412e-90a0-b1119a360570@googlegroups.com> |
| In reply to | #88437 |
On Wednesday, April 1, 2015 at 7:52:27 PM UTC-4, Dave Angel wrote:
> On 04/01/2015 09:43 AM, Saran A wrote:
> > On Tuesday, March 31, 2015 at 9:19:37 AM UTC-4, Dave Angel wrote:
> >> On 03/31/2015 07:00 AM, Saran A wrote:
> >>
> >> > @DaveA: This is a homework assignment. .... Is it possible that you
> >> could provide me with some snippets or guidance on where to place your
> >> suggestions (for your TO DOs 2,3,4,5)?
> >> >
> >>
> >>
> >>> On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote:
> >>
> >>>>
> >>>> It's missing a number of your requirements. But it's a start.
> >>>>
> >>>> If it were my file, I'd have a TODO comment at the bottom stating known
> >>>> changes that are needed. In it, I'd mention:
> >>>>
> >>>> 1) your present code is assuming all filenames come directly from the
> >>>> commandline. No searching of a directory.
> >>>>
> >>>> 2) your present code does not move any files to success or failure
> >>>> directories
> >>>>
> >>
> >> In function validate_files()
> >> Just after the line
> >> print('success with %s on %d reco...
> >> you could move the file, using shutil. Likewise after the failure print.
> >>
> >>>> 3) your present code doesn't calculate or write to a text file any
> >>>> statistics.
> >>
> >> You successfully print to sys.stderr. So you could print to some other
> >> file in the exact same way.
> >>
> >>>>
> >>>> 4) your present code runs once through the names, and terminates. It
> >>>> doesn't "monitor" anything.
> >>
> >> Make a new function, perhaps called main(), with a loop that calls
> >> validate_files(), with a sleep after each pass. Of course, unless you
> >> fix TODO#1, that'll keep looking for the same files. No harm in that if
> >> that's the spec, since you moved the earlier versions of the files.
> >>
> >> But if you want to "monitor" the directory, let the directory name be
> >> the argument to main, and let main do a dirlist each time through the
> >> loop, and pass the corresponding list to validate_files.
> >>
> >>>>
> >>>> 5) your present code doesn't check for zero-length files
> >>>>
> >>
> >> In validate_and_process_data(), instead of checking filesize against
> >> ftell, check it against zero.
> >>
> >>>> I'd also wonder why you bother checking whether the
> >>>> os.path.getsize(file) function returns the same value as the os.SEEK_END
> >>>> and ftell() code does. Is it that you don't trust the library? Or that
> >>>> you have to run on Windows, where the line-ending logic can change the
> >>>> apparent file size?
> >>>>
> >>>> I notice you're not specifying a file mode on the open. So in Python 3,
> >>>> your sizes are going to be specified in unicode characters after
> >>>> decoding. Is that what the spec says? It's probably safer to
> >>>> explicitly specify the mode (and the file encoding if you're in text).
> >>>>
> >>>> I see you call strip() before comparing the length. Could there ever be
> >>>> leading or trailing whitespace that's significant? Is that the actual
> >>>> specification of line size?
> >>>>
> >>>> --
> >>>> DaveA
> >>>
> >>>
> >>
> >>> I ask this because I have been searching fruitlessly through for some time and there are so many permutations that I am bamboozled by which is considered best practice.
> >>>
> >>> Moreover, as to the other comments, those are too specific. The scope of the assignment is very limited, but I am learning what I need to look out or ask questions regarding specs - in the future.
> >>>
> >>
> >>
> >> --
> >> DaveA
> >
> > @DaveA
> >
> > My most recent commit (https://github.com/ahlusar1989/WGProjects/blob/master/P1version2.0withassumptions_mods.py) has more annotations and comments for each file.
>
> Perhaps you don't realize how github works. The whole point is it
> preserves the history of your code, and you use the same filename for
> each revision.
>
> Or possibly it's I that doesn't understand it. I use git, but haven't
> actually used github for my own code.
>
> >
> > I have attempted to address the functional requirements that you brought up:
> >
> > 1) Before, my present code was assuming all filenames come directly from the commandline. No searching of a directory. I think that I have addressed this.
> >
>
> Have you even tried to run the code? It quits immediately with an
> exception since your call to main() doesn't pass any arguments, and main
> requires one.
>
> > def main(dirslist):
> > while True:
> > for file in dirslist:
> > return validate_files(file)
> > time.sleep(5)
>
> In addition, you aren't actually doing anything to find what the files
> in the directory are. I tried to refer to dirlist, as a hint. A
> stronger hint: look up os.listdir()
>
> And that list of files has to change each time through the while loop,
> that's the whole meaning of scanning. You don't just grab the names
> once, you look to see what's there.
>
> The next thing is that you're using a variable called 'file', while
> that's a built-in type in Python. So you really want to use a different
> name.
>
> Next, you have a loop through the magical dirslist to get individual
> filenames, but then you call the validate_files() function with a single
> file, but that function is expecting a list of filenames. One or the
> other has to change.
>
> Next, you return from main after validating the first file, so no others
> will get processed.
>
> > 2) My present code does not move any files to success or failure directories (requirements for this assignment1). I am still wondering if and how I should use shututil() like you advised me to. I keep receiving a syntax error when declaring this below the print statement.
>
> You had a function to do that in your very first post to this thread.
> Have you forgotten it already? As for syntax errors, you can't expect
> any help on that when you don't show any code nor the syntax error
> traceback. Remember that when a syntax error is shown, it's frequently
> the previous line or two that actually was wrong.
>
> >
> > 3) You correctly reminded me that my present code doesn't calculate or write to a text file any statistics or errors for the cause of the error.
>
> > #I wrote an error class in case I needed such a specification for a notification - this is not in the requirements
> > class ErrorHandler:
> >
> > def __init__(self):
> > pass
> >
> > def write(self, string):
> >
> > # write error to file
> > fname = " Error report (" + time.strftime("%Y-%m-%d %I-%M%p") + ").txt"
> > handler = open(fname, "w")
> > handler.write(string)
> > handler.close()
>
>
>
>
> This looks like Java code. With no data attributes, and only one
> method(function), what's the reason you don't just use a function?
>
>
> > original_stderr = sys.stderr # stored in case want to revert
> > sys.stderr = ErrorHandler()
>
> This is just bogus. There are rare times when a programmer might want
> to replace stderr, but that's just unnecessarily confusing in a program
> like this one. When you want to write printable data to another file,
> just use that file's handle as the argument to the file= argument of
> print(). You could use an instance of ErrorHandler as a file handle.
> Or do it one of a dozen other straightforward ways.
>
> While we're at it, why do you keep creating new files for your error
> report? And overwriting all the old messages with whatever new ones
> come in the same minute?
>
>
> > (Should I use the copy or copy2 method in order provide metadata? If so, should I wrap it into a try and except logic?)
>
> No idea what you mean here. What metadata? And what do copy and copy2
> have to do with anything here?
>
> >
> > 4) Before, my present code runs once through the names, and terminates. It doesn't "monitor" anything. I think I have addressed this with the main function - correct?
> >
> Not even close.
>
> > 5) Before, my present code doesn't check for zero-length files - I have added a comment there in case that is needed.
> >
> > I really appreciate your invaluable feedback. I have grown a lot with this assignment!
> >
> > Sincerely,
> >
> > Saran
> >
>
>
> --
> DaveA
@DaveA
Thanks for your help on this homework assignment. I started from scratch last night. I have added some comments that will perhaps help clarify my intentions and my thought process. Thanks again.
from __future__ import print_function
import os
import time
import glob
import sys
def initialize_logger(output_dir):
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
# create console handler and set level to info
handler = logging.StreamHandler()
handler.setLevel(logging.INFO)
formatter = logging.Formatter("%(levelname)s - %(message)s")
handler.setFormatter(formatter)
logger.addHandler(handler)
# create error file handler and set level to error
handler = logging.FileHandler(os.path.join(output_dir, "error.log"),"w", encoding=None, delay="true")
handler.setLevel(logging.ERROR)
formatter = logging.Formatter("%(levelname)s - %(message)s")
handler.setFormatter(formatter)
logger.addHandler(handler)
# create debug file handler and set level to debug
handler = logging.FileHandler(os.path.join(output_dir, "all.log"),"w")
handler.setLevel(logging.DEBUG)
formatter = logging.Formatter("%(levelname)s - %(message)s")
handler.setFormatter(formatter)
logger.addHandler(handler)
#Helper Functions for the Success and Failure Folder Outcomes, respectively
def file_len(filename):
with open(filename) as f:
for i, l in enumerate(f):
pass
return i + 1
def copy_and_move_File(src, dest):
try:
shutil.rename(src, dest)
# eg. src and dest are the same file
except shutil.Error as e:
print('Error: %s' % e)
# eg. source or destination doesn't exist
except IOError as e:
print('Error: %s' % e.strerror)
# Call main(), with a loop that calls # validate_files(), with a sleep after each pass. Before, my present #code was assuming all filenames come directly from the commandline. There was no actual searching #of a directory.
# I am assuming that this is appropriate since I moved the earlier versions of the files.
# I let the directory name be the argument to main, and let main do a dirlist each time through the loop,
# and pass the corresponding list to validate_files.
path = "/some/sample/path/"
dirlist = os.listdir(path)
before = dict([(f, None) for f in dirlist)
#####Syntax Error? before = dict([(f, None) for f in dirlist)
^
SyntaxError: invalid syntax
def main(dirlist):
while True:
time.sleep(10) #time between update check
after = dict([(f, None) for f in dirlist)
added = [f for f in after if not f in before]
if added:
print('Sucessfully added new file - ready to validate')
####add return statement here to pass to validate_files
if __name__ == "__main__":
main()
#check for record time and record length - logic to be written to either pass to Failure or Success folder respectively
def validate_files():
creation = time.ctime(os.path.getctime(added))
lastmod = time.ctime(os.path.getmtime(added))
#Potential Additions/Substitutions - what are the implications/consequences for this
def move_to_failure_folder_and_return_error_file():
os.mkdir('Failure')
copy_and_move_File(filename, 'Failure')
initialize_logger('rootdir/Failure')
logging.error("Either this file is empty or there are no lines")
def move_to_success_folder_and_read(f):
os.mkdir('Success')
copy_and_move_File(filename, 'Success')
print("Success", f)
return file_len()
#This simply checks the file information by name------> is this needed anymore?
def fileinfo(file):
filename = os.path.basename(f)
rootdir = os.path.dirname(f)
filesize = os.path.getsize(f)
return filename, rootdir, filesize
if __name__ == '__main__':
import sys
validate_files(sys.argv[1:])
# -- end of file
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2015-04-02 17:10 -0400 |
| Message-ID | <mailman.23.1428009062.12925.python-list@python.org> |
| In reply to | #88448 |
On 04/02/2015 09:06 AM, Saran A wrote:
>
> Thanks for your help on this homework assignment. I started from scratch last night. I have added some comments that will perhaps help clarify my intentions and my thought process. Thanks again.
>
> from __future__ import print_function
I'll just randomly comment on some things I see here. You've started
several threads, on two different forums, so it's impractical to figure
out what's really up.
<snip some code I'm not commenting on>
>
> #Helper Functions for the Success and Failure Folder Outcomes, respectively
>
> def file_len(filename):
This is an indentation error, as you forgot to start at the left margin
> with open(filename) as f:
> for i, l in enumerate(f):
> pass
> return i + 1
>
>
> def copy_and_move_File(src, dest):
ditto
> try:
> shutil.rename(src, dest)
Is there a reason you don't use the move function? rename won't work if
the two directories aren't on the same file system.
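To illustrate the move-versus-rename point: a minimal sketch, assuming the destination is a directory that may not exist yet. The helper name move_file is hypothetical and not part of the posted script. Note also that the shutil module has no rename function; the rename call lives in os.

```python
import os
import shutil


def move_file(src, dest_dir):
    """Move src into dest_dir and return the new path.

    shutil.move falls back to copy-and-delete when a plain rename is not
    possible, for example when src and dest_dir are on different file systems.
    """
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    # Build the full target path so the file keeps its own name.
    dest = os.path.join(dest_dir, os.path.basename(src))
    shutil.move(src, dest)
    return dest
```

Returning the new path lets the caller log or inspect the file after it has been moved.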
> # eg. src and dest are the same file
> except shutil.Error as e:
> print('Error: %s' % e)
> # eg. source or destination doesn't exist
> except IOError as e:
> print('Error: %s' % e.strerror)
>
>
> # Call main(), with a loop that calls # validate_files(), with a sleep after each pass. Before, my present #code was assuming all filenames come directly from the commandline. There was no actual searching #of a directory.
>
> # I am assuming that this is appropriate since I moved the earlier versions of the files.
> # I let the directory name be the argument to main, and let main do a dirlist each time through the loop,
> # and pass the corresponding list to validate_files.
>
>
> path = "/some/sample/path/"
> dirlist = os.listdir(path)
> before = dict([(f, None) for f in dirlist)
>
> #####Syntax Error? before = dict([(f, None) for f in dirlist)
> ^
> SyntaxError: invalid syntax
Look at the line in question. There's an unmatched set of brackets. Not
that it matters, since you don't need these 2 lines for anything. See
my comments on some other forum.
>
> def main(dirlist):
bad name for a directory path variable.
> while True:
> time.sleep(10) #time between update check
Somewhere inside this loop, you want to obtain a list of files in the
specified directory. And you want to do something with that list. You
don't have to worry about what the files were last time, because
presumably those are gone. Unless in an unwritten part of the spec,
you're supposed to abort if any filename is repeated over time.
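The loop described here could be sketched roughly as follows. This is only an illustration under assumptions: the names list_files and watch, the handle callback, and the max_passes parameter (added so the loop can be stopped in a test) are all hypothetical and do not come from the assignment.

```python
import os
import time


def list_files(path):
    """Return full paths of the regular files currently in path."""
    return [
        os.path.join(path, name)
        for name in sorted(os.listdir(path))
        if os.path.isfile(os.path.join(path, name))
    ]


def watch(path, handle, interval=10, max_passes=None):
    """Rescan path every interval seconds and call handle(filename) per file.

    handle is expected to move the file away, so nothing has to be
    remembered between passes. max_passes=None means run forever.
    """
    passes = 0
    while max_passes is None or passes < max_passes:
        for filename in list_files(path):  # fresh listing on every pass
            handle(filename)
        passes += 1
        time.sleep(interval)
```

The directory path would be passed in by the caller, for instance from sys.argv[1], which also takes care of main() being called without its argument.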
> after = dict([(f, None) for f in dirlist)
> added = [f for f in after if not f in before]
> if added:
> print('Sucessfully added new file - ready to validate')
> ####add return statement here to pass to validate_files
> if __name__ == "__main__":
> main()
You'll need an argument to call main()
>
>
> #check for record time and record length - logic to be written to either pass to Failure or Success folder respectively
>
> def validate_files():
Where are all the parameters to this function?
> creation = time.ctime(os.path.getctime(added))
> lastmod = time.ctime(os.path.getmtime(added))
>
>
>
> #Potential Additions/Substitutions - what are the implications/consequences for this
>
> def move_to_failure_folder_and_return_error_file():
> os.mkdir('Failure')
> copy_and_move_File(filename, 'Failure')
> initialize_logger('rootdir/Failure')
> logging.error("Either this file is empty or there are no lines")
>
>
> def move_to_success_folder_and_read(f):
> os.mkdir('Success')
> copy_and_move_File(filename, 'Success')
> print("Success", f)
> return file_len()
>
> #This simply checks the file information by name------> is this needed anymore?
>
> def fileinfo(file):
> filename = os.path.basename(f)
> rootdir = os.path.dirname(f)
> filesize = os.path.getsize(f)
> return filename, rootdir, filesize
>
> if __name__ == '__main__':
> import sys
> validate_files(sys.argv[1:])
>
> # -- end of file
>
--
DaveA
| From | Saran A <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-04-02 16:43 -0700 |
| Message-ID | <4d3a3336-5dc1-4fe8-8ef6-1616c6629d01@googlegroups.com> |
| In reply to | #88460 |
@DaveA
I debugged and rewrote everything. Here is the full version. Feel free to tear this apart. The homework assignment is not due until tomorrow, so I am currently also experimenting with pyinotify as well. I do have questions regarding how to make this function compatible with the ProcessEvent Class. I will create another post for this.
What would you advise with regard to renaming the inaptly named dirlist?
# Without data to examine here, I can only guess based on this requirement's language that
# fixed records are in the input. If so, here's a slight revision to the helper functions that I wrote earlier which
# takes the function fileinfo as a starting point and demonstrates calling a function from within a function.
# I tested this little sample on a small set of files created with MD5 checksums. I wrote the Python in such a way that it
# would work with Python 2.x or 3.x (note the __future__ at the top).
# There are so many wonderful ways of failure, so, from a development standpoint, I would probably spend a bit
# more time trying to determine which failure(s) I would want to report to the user, and how (perhaps creating my own Exceptions).
# The only other comments I would make are about safe file handling.
# Question 1: After a user has created a file that has failed (in
# processing), can the user create a file with the same name?
# If so, then you will probably want to look at some sort
# of file-naming strategy to avoid overwriting evidence of
# earlier failures.
# File naming is a tricky thing. I referenced the tempfile module [1] and the Maildir naming scheme to see two different
# types of solutions to the problem of choosing a unique filename.
# I am assuming that all of my files are going to be specified in unicode.
# Utilized Spyder's Scientific Computing IDE to debug, check for indentation errors and test function suite.
from __future__ import print_function
import os.path
import time
import logging

def initialize_logger(output_dir):
    logger = logging.getLogger()
    logger.setLevel(logging.DEBUG)
    # create console handler and set level to info
    handler = logging.StreamHandler()
    handler.setLevel(logging.INFO)
    formatter = logging.Formatter("%(levelname)s - %(message)s")
    handler.setFormatter(formatter)
    logger.addHandler(handler)
    # create error file handler and set level to error
    handler = logging.FileHandler(os.path.join(output_dir, "error.log"),"w", encoding=None, delay="true")
    handler.setLevel(logging.ERROR)
    formatter = logging.Formatter("%(levelname)s - %(message)s")
    handler.setFormatter(formatter)
    logger.addHandler(handler)
    # create debug file handler and set level to debug
    handler = logging.FileHandler(os.path.join(output_dir, "all.log"),"w")
    handler.setLevel(logging.DEBUG)
    formatter = logging.Formatter("%(levelname)s - %(message)s")
    handler.setFormatter(formatter)
    logger.addHandler(handler)

#Returns filename, rootdir and filesize
def fileinfo(f):
    filename = os.path.basename(f)
    rootdir = os.path.dirname(f)
    filesize = os.path.getsize(f)
    return filename, rootdir, filesize

#returns length of file
def file_len(f):
    with open(f) as f:
        for i, l in enumerate(f):
            pass
    return i + 1

#attempts to copy file and move file to it's directory
def copy_and_move_file(src, dest):
    try:
        os.rename(src, dest)
    # eg. src and dest are the same file
    except IOError as e:
        print('Error: %s' % e.strerror)

path = "."
dirlist = os.listdir(path)

def main(dirlist):
    before = dict([(f, 0) for f in dirlist])
    while True:
        time.sleep(1) #time between update check
        after = dict([(f, None) for f in dirlist])
        added = [f for f in after if not f in before]
        if added:
            f = ''.join(added)
            print('Sucessfully added %s file - ready to validate') %()
            return validate_files(f)
        else:
            return move_to_failure_folder_and_return_error_file(f)

def validate_files(f):
    creation = time.ctime(os.path.getctime(f))
    lastmod = time.ctime(os.path.getmtime(f))
    if creation == lastmod and file_len(f) > 0:
        return move_to_success_folder_and_read(f)
    if file_len < 0 and creation != lastmod:
        return move_to_success_folder_and_read(f)
    else:
        return move_to_failure_folder_and_return_error_file(f)

#Potential Additions/Substitutions
def move_to_failure_folder_and_return_error_file():
    filename, rootdir, lastmod, creation, filesize = fileinfo(file)
    os.mkdir('Failure')
    copy_and_move_file( 'Failure')
    initialize_logger('rootdir/Failure')
    logging.error("Either this file is empty or there are no lines")

def move_to_success_folder_and_read():
    filename, rootdir, lastmod, creation, filesize = fileinfo(file)
    os.mkdir('Success')
    copy_and_move_file(rootdir, 'Success') #file name
    print("Success", file)
    return file_len(file)

if __name__ == '__main__':
    main(dirlist)
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2015-04-03 12:45 +0200 |
| Message-ID | <mailman.29.1428057967.12925.python-list@python.org> |
| In reply to | #88463 |
Saran A wrote:
> I debugged and rewrote everything. Here is the full version. Feel free to
> tear this apart. The homework assignment is not due until tomorrow, so I
> am currently also experimenting with pyinotify as well.
Saran, try to make a realistic assessment of your capability. Your
"debugged" code still has this gem:
> while True:
> time.sleep(1) #time between update check
and you are likely to miss your deadline anyway.
While the inotify approach may be the "right way" to attack this problem
from the point of view of an experienced developer like Chris, as a newbie
you should stick to the simplest thing that can possibly work.
In a previous post I mentioned unit tests, but even an ad-hoc test in the
interactive interpreter will show that your file_len() function doesn't work:
> #returns length of file
> def file_len(f):
>     with open(f) as f:
>         for i, l in enumerate(f):
>             pass
>     return i + 1
>
$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> #returns length of file
... def file_len(f):
...     with open(f) as f:
...         for i, l in enumerate(f):
...             pass
...     return i + 1
...
>>> with open("tmp.txt", "w") as f: f.write("abcde")
...
>>> file_len("tmp.txt")
1
It reports 1 here because it counts lines, not characters, so every single-line file yields 1.
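A small sketch of the two things "length of a file" could mean, since the posted function mixes them up. The function names count_lines and size_in_bytes are hypothetical. Note that the posted loop also fails on an empty file, because i is never assigned before the return.

```python
import os


def count_lines(path):
    """Return the number of lines in the file; 0 for an empty file."""
    with open(path) as f:
        return sum(1 for _ in f)


def size_in_bytes(path):
    """Return the file size in bytes as reported by the file system."""
    return os.path.getsize(path)
```

For the tmp.txt written above, which holds "abcde" without a newline, count_lines gives 1 and size_in_bytes gives 5.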
A simple unit test would look like this (assuming watchscript.py is the name
of the script to be tested):
import unittest

from watchscript import file_len

class FileLen(unittest.TestCase):
    def test_five(self):
        LEN = 5
        FILENAME = "tmp.txt"
        with open(FILENAME, "w") as f:
            f.write("*" * LEN)

        self.assertEqual(file_len(FILENAME), 5)

if __name__ == "__main__":
    unittest.main()
Typically, for every tested function you add a new TestCase subclass, and for
every test you perform for that function you add a test_...() method to that
class. unittest.main() will collect these methods, run them, and generate
a report. For the above example it will complain:
$ python test_watchscript.py
F
======================================================================
FAIL: test_five (__main__.FileLen)
----------------------------------------------------------------------
Traceback (most recent call last):
File "test_watchscript.py", line 12, in test_five
self.assertEqual(file_len(FILENAME), 5)
AssertionError: 1 != 5
----------------------------------------------------------------------
Ran 1 test in 0.001s
FAILED (failures=1)
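As a variation on the test above, here is a sketch with one TestCase subclass and two test_...() methods, using a scratch directory so that no tmp.txt is left behind. The function char_count is a hypothetical stand-in for whatever the script's function is supposed to do; it is defined inline only so the example is self-contained.

```python
import os
import shutil
import tempfile
import unittest


def char_count(path):
    """Hypothetical function under test: number of characters in a text file."""
    with open(path) as f:
        return len(f.read())


class CharCountTest(unittest.TestCase):
    def setUp(self):
        # Each test gets its own scratch directory.
        self.tmpdir = tempfile.mkdtemp()

    def tearDown(self):
        # Remove the scratch directory and everything in it.
        shutil.rmtree(self.tmpdir)

    def write(self, name, text):
        """Create a file in the scratch directory and return its path."""
        path = os.path.join(self.tmpdir, name)
        with open(path, "w") as f:
            f.write(text)
        return path

    def test_five(self):
        self.assertEqual(char_count(self.write("five.txt", "*" * 5)), 5)

    def test_empty(self):
        self.assertEqual(char_count(self.write("empty.txt", "")), 0)
```

Saved as a module, the tests can be run with python -m unittest followed by the module name.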
The file_len() function is but an example of the many problems with your
code. There is no shortcut, you have to decide for every function what
exactly it should do and then verify that it actually does what it is meant
to do. Try to negotiate an extra week or two with your supervisor and then
start small:
On day one make a script that moves all files from one directory to another.
Make sure that the source and destination directory are only stored in one
place in your script so that they can easily be replaced. Run the script
after every change and devise tests that demonstrate that it works
correctly.
On day two make a script that checks all files in a directory. Put the
checks into a function that takes a file path as its only parameter.
Write a few correct and a few broken files and test that only the correct
ones are recognized as correct, and only the broken ones are flagged as
broken, and that all files in the directory are checked.
On day three make a script that combines the above and only moves the
correct files into another directory. Devise tests to verify that both
directories contain the expected files before and after your script is
executed.
On day four make a script that also moves the bad files into another
directory. Modify your tests from the previous day to check the contents of
all three directories.
On day five wrap your previous efforts in a loop that runs forever.
It seems to work? You are not done. Write tests that demonstrate that it does
work. Try to think of the corner cases: what if there are no new files on
one iteration? What if there is a new file with the same name as a previous
one? Don't handwave, describe the reaction of your script in plain English
with all the gory details and then translate into Python.
Eureka!
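One pass of the combined script from days one to four might look like the sketch below. Everything in it is an assumption made for illustration: the directory names, the is_valid placeholder (which only checks for a non-empty file), and the numeric-suffix scheme in unique_destination for the "same name as a previous one" corner case. The existence check is not safe if several processes write to the same directory at once.

```python
import os
import shutil

# Directory names live in one place so they are easy to replace.
INCOMING = "incoming"
SUCCESS = "Success"
FAILURE = "Failure"


def is_valid(path):
    """Placeholder check: a file counts as valid when it is not empty."""
    return os.path.getsize(path) > 0


def unique_destination(dest_dir, name):
    """Return a path in dest_dir that does not exist yet.

    A numeric suffix is added when name is already taken, so an earlier
    file with the same name is not overwritten.
    """
    candidate = os.path.join(dest_dir, name)
    stem, ext = os.path.splitext(name)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(dest_dir, "%s.%d%s" % (stem, counter, ext))
        counter += 1
    return candidate


def process_once(incoming=INCOMING, success=SUCCESS, failure=FAILURE):
    """Sort every file in incoming into success or failure; return the moves."""
    moved = []
    for dest_dir in (success, failure):
        if not os.path.isdir(dest_dir):
            os.makedirs(dest_dir)
    for name in sorted(os.listdir(incoming)):
        src = os.path.join(incoming, name)
        if not os.path.isfile(src):
            continue  # skip subdirectories
        dest_dir = success if is_valid(src) else failure
        dest = unique_destination(dest_dir, name)
        shutil.move(src, dest)
        moved.append((src, dest))
    return moved
```

Because process_once returns the list of moves, a test can compare it with the directory contents before and after, and the day-five loop only has to call it and sleep.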
Note that a "day" in the above outline can be 15 minutes or a week. You're
done when you're done. Also note that the amount of code you have written
is no indication of how close you are to your goal. An experienced
programmer will end up with less code than you to fulfill the same spec.
That shouldn't bother you. On the other hand, you should immediately remove
code that doesn't contribute to your script's goal. If you keep failed
attempts and sidetracks, your code will become harder and harder to
maintain, and that's the last thing you need when it's already hard to get
the necessary parts right.
Good luck!
| From | Saran A <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-04-03 06:40 -0700 |
| Message-ID | <149a7683-15a0-4fa6-882f-4ff8982e9e0c@googlegroups.com> |
| In reply to | #88476 |
@Peter: Thank you for emphasizing unit testing. I will be sure to be more mindful of this moving forward. FYI, I addressed those "gems" :) It's amazing what some sleep can do.