


Strategy/ Advice for How to Best Attack this Problem?

Started by: Saran Ahluwalia <ahlusar.ahluwalia@gmail.com>
First post: 2015-03-29 04:32 -0700
Last post: 2015-04-03 19:16 -0700
Articles 20 on this page of 24 — 9 participants



Contents

  Strategy/ Advice for How to Best Attack this Problem? Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> - 2015-03-29 04:32 -0700
    Addendum to Strategy/ Advice for How to Best Attack this Problem? Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> - 2015-03-29 04:37 -0700
      Re: Addendum to Strategy/ Advice for How to Best Attack this Problem? Peter Otten <__peter__@web.de> - 2015-03-29 14:33 +0200
        Re: Addendum to Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-03-29 05:41 -0700
        Re: Addendum to Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-04-01 07:17 -0700
      Re: Addendum to Strategy/ Advice for How to Best Attack this Problem? Dave Angel <davea@davea.name> - 2015-03-29 09:27 -0400
    Re: Strategy/ Advice for How to Best Attack this Problem? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-03-29 19:00 -0400
    Re: Strategy/ Advice for How to Best Attack this Problem? Paul Rubin <no.email@nospam.invalid> - 2015-03-29 18:08 -0700
      Re: Strategy/ Advice for How to Best Attack this Problem? Chris Angelico <rosuav@gmail.com> - 2015-03-30 13:04 +1100
        Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-03-30 09:45 -0700
          Re: Strategy/ Advice for How to Best Attack this Problem? Dave Angel <davea@davea.name> - 2015-03-30 14:35 -0400
            Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-03-31 04:00 -0700
              Re: Strategy/ Advice for How to Best Attack this Problem? Dave Angel <davea@davea.name> - 2015-03-31 09:19 -0400
                Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-04-01 06:43 -0700
                  Re: Strategy/ Advice for How to Best Attack this Problem? Dave Angel <davea@davea.name> - 2015-04-01 19:51 -0400
                    Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-04-02 06:06 -0700
                      Re: Strategy/ Advice for How to Best Attack this Problem? Dave Angel <davea@davea.name> - 2015-04-02 17:10 -0400
                        Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-04-02 16:43 -0700
                          Re: Strategy/ Advice for How to Best Attack this Problem? Peter Otten <__peter__@web.de> - 2015-04-03 12:45 +0200
                            Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-04-03 06:40 -0700
                          Re: Strategy/ Advice for How to Best Attack this Problem? Dave Angel <d@davea.name> - 2015-04-03 07:59 -0400
                            Re: Strategy/ Advice for How to Best Attack this Problem? Saran A <ahlusar.ahluwalia@gmail.com> - 2015-04-03 05:50 -0700
                              Re: Strategy/ Advice for How to Best Attack this Problem? Dave Angel <davea@davea.name> - 2015-04-03 16:21 -0400
                                Re: Strategy/ Advice for How to Best Attack this Problem? Rustom Mody <rustompmody@gmail.com> - 2015-04-03 19:16 -0700



#88262 — Strategy/ Advice for How to Best Attack this Problem?

From: Saran Ahluwalia <ahlusar.ahluwalia@gmail.com>
Date: 2015-03-29 04:32 -0700
Subject: Strategy/ Advice for How to Best Attack this Problem?
Message-ID: <b803183b-6da5-4907-825d-9386b2bc2798@googlegroups.com>

Below are the function's requirements. I am torn between using the os module or some other quick and dirty module. Ideally, the solution would also be cross-platform. "Records" refers to contents in a file. What are some suggestions from the Pythonistas?

* Monitors a folder for files that are dropped throughout the day

* When a file is dropped in the folder the program should scan the file

o IF all the records in the file have the same length

o THEN the file should be moved to a "success" folder and a text file written indicating the total number of records processed

o IF the file is empty OR the records are not all of the same length

o THEN the file should be moved to a "failure" folder and a text file written indicating the cause for failure (for example: Empty file or line 100 was not the same length as the rest).



#88263 — Addendum to Strategy/ Advice for How to Best Attack this Problem?

From: Saran Ahluwalia <ahlusar.ahluwalia@gmail.com>
Date: 2015-03-29 04:37 -0700
Subject: Addendum to Strategy/ Advice for How to Best Attack this Problem?
Message-ID: <c933b8de-15b6-45b2-9fa7-2011a3615921@googlegroups.com>
In reply to: #88262

Below are some functions that I have been playing around with. I am not sure how to create a functional program from each of these constituent parts. I could use decorators or simply pass a function within another function. 

[code]
import time
import fnmatch
import os
import shutil


#If you want to write to a file, and if it doesn't exist, do this:

if not os.path.exists(filepath):
    f = open(filepath, 'w')

#If you want to read a file, and if it exists, do the following:

try:
    f = open(filepath)
except IOError:
    print 'I will be moving this to the '


#Changing a directory to "/home/newdir"
os.chdir("/home/newdir")
 
def move(src, dest):
    shutil.move(src, dest)

def fileinfo(file):
    filename = os.path.basename(file)
    rootdir = os.path.dirname(file)
    lastmod = time.ctime(os.path.getmtime(file))
    creation = time.ctime(os.path.getctime(file))
    filesize = os.path.getsize(file)

    print "%s**\t%s\t%s\t%s\t%s" % (rootdir, filename, lastmod, creation, filesize)

searchdir = r'D:\Your\Directory\Root'
matches = []

def search
for root, dirnames, filenames in os.walk(searchdir):
    ##  for filename in fnmatch.filter(filenames, '*.c'):
    for filename in filenames:
        ##      matches.append(os.path.join(root, filename))
        ##print matches
        fileinfo(os.path.join(root, filename))


def get_files(src_dir):
# traverse root directory, and list directories as dirs and files as files
    for root, dirs, files in os.walk(src_dir):
        path = root.split('/')
        for file in files:
            process(os.path.join(root, file))
                    os.remove(os.path.join(root, file))

def del_dirs(src_dir):
    for dirpath, _, _ in os.walk(src_dir, topdown=False):  # Listing the files
        if dirpath == src_dir:
            break
        try:
            os.rmdir(dirpath)
        except OSError as ex:
            print(ex)


def main():
    get_files(src_dir)
    del_dirs(src_dir)


if __name__ == "__main__":
    main()


[/code]



#88267 — Re: Addendum to Strategy/ Advice for How to Best Attack this Problem?

From: Peter Otten <__peter__@web.de>
Date: 2015-03-29 14:33 +0200
Subject: Re: Addendum to Strategy/ Advice for How to Best Attack this Problem?
Message-ID: <mailman.307.1427632400.10327.python-list@python.org>
In reply to: #88263

Saran Ahluwalia wrote:

> On Sunday, March 29, 2015 at 7:33:04 AM UTC-4, Saran Ahluwalia wrote:
>> Below are the function's requirements. I am torn between using the OS
>> module or some other quick and dirty module. In addition, my ideal
>> assumption that this could be cross-platform. "Records" refers to
>> contents in a file. What are some suggestions from the Pythonistas?
>> 
>> * Monitors a folder for files that are dropped throughout the day
>> 
>> * When a file is dropped in the folder the program should scan the file
>> 
>> o IF all the records in the file have the same length
>> 
>> o THEN the file should be moved to a "success" folder and a text file
>> written indicating the total number of records processed
>> 
>> o IF the file is empty OR the records are not all of the same length
>> 
>> o THEN the file should be moved to a "failure" folder and a text file
>> written indicating the cause for failure (for example: Empty file or line
>> 100 was not the same length as the rest).
> 
> Below are some functions that I have been playing around with. I am not
> sure how to create a functional program from each of these constituent
> parts. I could use decorators or simply pass a function within another
> function.

Throwing arbitrary code at a task in the hope that something sticks is not a 
good approach. You already have given a clear description of the problem, so 
start with that and try to "pythonize" it. Example:

def main():
    while True:
        files_to_check = get_files_in_monitored_folder()
        for file in files_to_check:
            if is_good(file):
                move_to_success_folder(file)
            else:
                move_to_failure_folder(file)
        wait_a_minute()

if __name__ == "__main__":
    main()

Then write bogus implementations for the building blocks:

def get_files_in_monitored_folder():
    return ["/foo/bar/ham", "/foo/bar/spam"]

def is_good(file):
    return file.endswith("/ham")

def move_to_failure_folder(file):
    print("failure", file)

def move_to_success_folder(file):
    print("success", file)

def wait_a_minute():
    raise SystemExit("bye")  # we don't want to enter the loop while developing

Now successively replace the dummy functions with functions that do the right 
thing. Test them individually (preferably using unit tests) so that when 
you have completed them all and your program does not work like it should 
you can be sure that there is a flaw in the main function.
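
As an illustration of replacing one dummy, here is a minimal sketch of a real is_good(). It assumes that a "record" is one line of text, that line endings do not count toward the length, and that the first line sets the expected length; the helper name check_records is hypothetical and not from the original posts.

```python
def check_records(lines):
    """Return (ok, message) for an iterable of text lines.

    Assumption: the first line defines the expected record length.
    """
    expected = None
    count = 0
    for count, line in enumerate(lines, start=1):
        length = len(line.rstrip("\r\n"))  # ignore the line terminator
        if expected is None:
            expected = length
        elif length != expected:
            return False, "line %d was not the same length as the rest" % count
    if count == 0:
        return False, "Empty file"
    return True, "%d records processed" % count


def is_good(file):
    # Thin wrapper so the name matches the stub used in main().
    with open(file) as f:
        ok, _message = check_records(f)
    return ok
```

Because check_records() takes any iterable of lines, it can be unit-tested with plain lists before any real file is involved.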

> [code]
> import time
> import fnmatch
> import os
> import shutil
> 
> 
> #If you want to write to a file, and if it doesn't exist, do this:

Hm, are these your personal notes or is it an actual script?
 
> if not os.path.exists(filepath):
>     f = open(filepath, 'w')
> #If you want to read a file, and if it exists, do the following:
> 
> try:
>     f = open(filepath)
> except IOError:
>     print 'I will be moving this to the '
> 
> 
> #Changing a directory to "/home/newdir"
> os.chdir("/home/newdir")

Never using os.chdir() is a good habit to get into.

> def move(src, dest):
>     shutil.move(src, dest)
> 
> def fileinfo(file):
>     filename = os.path.basename(file)
>     rootdir = os.path.dirname(file)
>     lastmod = time.ctime(os.path.getmtime(file))
>     creation = time.ctime(os.path.getctime(file))
>     filesize = os.path.getsize(file)
> 
>     print "%s**\t%s\t%s\t%s\t%s" % (rootdir, filename, lastmod, creation,
>     filesize)
> 
> searchdir = r'D:\Your\Directory\Root'
> matches = []
> 
> def search

Every time you post code that doesn't even compile you lose some goodwill.
In a few lines of code meant to demonstrate a problem a typo may be 
acceptable, but for something that you probably composed in an editor you 
should take the time to run it and fix at least the syntax errors.

> for root, dirnames, filenames in os.walk(searchdir):
>     ##  for filename in fnmatch.filter(filenames, '*.c'):
>     for filename in filenames:
>         ##      matches.append(os.path.join(root, filename))
>         ##print matches
>         fileinfo(os.path.join(root, filename))
> 
> 
> def get_files(src_dir):
> # traverse root directory, and list directories as dirs and files as files
>     for root, dirs, files in os.walk(src_dir):
>         path = root.split('/')
>         for file in files:
>             process(os.path.join(root, file))
>                     os.remove(os.path.join(root, file))
> 
> def del_dirs(src_dir):
>     for dirpath, _, _ in os.walk(src_dir, topdown=False):  # Listing the
>     files
>         if dirpath == src_dir:
>             break
>         try:
>             os.rmdir(dirpath)
>         except OSError as ex:
>             print(ex)
> 
> 
> def main():
>     get_files(src_dir)
>     del_dirs(src_dir)
> 
> 
> if __name__ == "__main__":
>     main()
> 
> 
> [/code]



#88268 — Re: Addendum to Strategy/ Advice for How to Best Attack this Problem?

From: Saran A <ahlusar.ahluwalia@gmail.com>
Date: 2015-03-29 05:41 -0700
Subject: Re: Addendum to Strategy/ Advice for How to Best Attack this Problem?
Message-ID: <03e661d2-db49-4d2b-8524-3f5b3a07d1b8@googlegroups.com>
In reply to: #88267

Thank you for the feedback - I have only been programming for 8 months now and am fairly new to best practice and what is considered acceptable in public forums. I appreciate the feedback on how to best address this problem. 




#88425 — Re: Addendum to Strategy/ Advice for How to Best Attack this Problem?

From: Saran A <ahlusar.ahluwalia@gmail.com>
Date: 2015-04-01 07:17 -0700
Subject: Re: Addendum to Strategy/ Advice for How to Best Attack this Problem?
Message-ID: <fc09b6ca-3207-4324-a821-eedd03feecc5@googlegroups.com>
In reply to: #88267

@PeterOtten

Thank you for the feedback. Since your last correspondence I have come up with this (my most recent commit):

https://github.com/ahlusar1989/WGProjects/blob/master/P1version2.0withassumptions_mods.py



#88269 — Re: Addendum to Strategy/ Advice for How to Best Attack this Problem?

From: Dave Angel <davea@davea.name>
Date: 2015-03-29 09:27 -0400
Subject: Re: Addendum to Strategy/ Advice for How to Best Attack this Problem?
Message-ID: <mailman.308.1427635674.10327.python-list@python.org>
In reply to: #88263

On 03/29/2015 07:37 AM, Saran Ahluwalia wrote:
> On Sunday, March 29, 2015 at 7:33:04 AM UTC-4, Saran Ahluwalia wrote:
>> Below are the function's requirements. I am torn between using the OS module or some other quick and dirty module. In addition, my ideal assumption that this could be cross-platform. "Records" refers to contents in a file. What are some suggestions from the Pythonistas?
>>
>> * Monitors a folder for files that are dropped throughout the day
>>
>> * When a file is dropped in the folder the program should scan the file
>>
>> o IF all the records in the file have the same length
>>
>> o THEN the file should be moved to a "success" folder and a text file written indicating the total number of records processed
>>
>> o IF the file is empty OR the records are not all of the same length
>>
>> o THEN the file should be moved to a "failure" folder and a text file written indicating the cause for failure (for example: Empty file or line 100 was not the same length as the rest).
>
> Below are some functions that I have been playing around with. I am not sure how to create a functional program from each of these constituent parts. I could use decorators or simply pass a function within another function.

Your problem isn't complicated enough to need either function objects or 
decorators.  You might want to write a generator function, but even that 
seems overkill for the problem as stated.  Just write the code, 
top-down, with dummy bodies containing stub code.  Then fill it in from 
the bottom up, with unit tests for each completed function.

More complex problems can justify a different approach, but you don't 
need to use every trick in the arsenal.

>
> [code]
> import time
> import fnmatch
> import os
> import shutil
>

If you have code fragments that aren't going to be used, don't write 
them as top-level code.  Either move them to another file, or at least 
enclose them in a function with a name like   dummy_do_not_use()

My own convention for that is to suffix the function name with a bunch 
of uppercase ZZZ's.  That way the name jumps out at me so I'll recognize 
it, and I can be sure I'll never actually call it.

>
> #If you want to write to a file, and if it doesn't exist, do this:
>
> if not os.path.exists(filepath):
>      f = open(filepath, 'w')
>
> #If you want to read a file, and if it exists, do the following:
>
> try:
>      f = open(filepath)
> except IOError:
>      print 'I will be moving this to the '
>
>
> #Changing a directory to "/home/newdir"
> os.chdir("/home/newdir")

As Peter said, chdir can be very troublesome.  Avoid at almost all 
costs.  As you've done elsewhere, use os.path.join() to combine 
directory paths with relative filenames.
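
A small sketch of that advice, assuming hypothetical folder names such as "incoming" and "success": the destination path is built from the target directory and the file's base name, so the process never needs to change its working directory.

```python
import os


def destination_for(path, target_dir):
    """Build the path a file would have inside target_dir, without chdir."""
    return os.path.join(target_dir, os.path.basename(path))


# Example: where would incoming/data.txt land in the success folder?
source = os.path.join("incoming", "data.txt")  # hypothetical names
print(destination_for(source, "success"))
```

The same joined path can then be handed to open() or shutil.move() directly.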

>
> def move(src, dest):
>      shutil.move(src, dest)
>
> def fileinfo(file):
>      filename = os.path.basename(file)
>      rootdir = os.path.dirname(file)
>      lastmod = time.ctime(os.path.getmtime(file))
>      creation = time.ctime(os.path.getctime(file))
>      filesize = os.path.getsize(file)
>
>      print "%s**\t%s\t%s\t%s\t%s" % (rootdir, filename, lastmod, creation, filesize)
>
> searchdir = r'D:\Your\Directory\Root'
> matches = []
>
> def search
> for root, dirnames, filenames in os.walk(searchdir):

Why are you using a directory tree when your "spec" said the files would 
be in a specific directory?

>      ##  for filename in fnmatch.filter(filenames, '*.c'):
>      for filename in filenames:
>          ##      matches.append(os.path.join(root, filename))
>          ##print matches
>          fileinfo(os.path.join(root, filename))
>
>
> def get_files(src_dir):
> # traverse root directory, and list directories as dirs and files as files
>      for root, dirs, files in os.walk(src_dir):
>          path = root.split('/')
>          for file in files:
>              process(os.path.join(root, file))
>                      os.remove(os.path.join(root, file))

Probably you shouldn't have os.remove in the code till the stuff around 
it has been carefully tested.  Besides, nothing in the spec says you're 
going to remove any files.

>
> def del_dirs(src_dir):
>      for dirpath, _, _ in os.walk(src_dir, topdown=False):  # Listing the files
>          if dirpath == src_dir:
>              break
>          try:
>              os.rmdir(dirpath)
>          except OSError as ex:
>              print(ex)
>
>
> def main():
>      get_files(src_dir)
>      del_dirs(src_dir)
>

Your description says "monitor".  That implies to me an ongoing process, 
or a loop.   You probably want something like:

def main():
     while True:
         process_files(directory_name)
         sleep(10000)
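
One possible shape for that loop, as a sketch only. The extra parameters (is_good, success_dir, failure_dir, interval, max_passes) are assumptions added for illustration and are not part of the outline above; note also that time.sleep() takes seconds, so the interval here defaults to one minute.

```python
import os
import shutil
import time


def process_files(directory_name, is_good, success_dir, failure_dir):
    """Move every regular file in directory_name to one of two folders."""
    moved = []
    for name in sorted(os.listdir(directory_name)):
        path = os.path.join(directory_name, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories, e.g. the result folders
        target = success_dir if is_good(path) else failure_dir
        shutil.move(path, os.path.join(target, name))
        moved.append((name, target))
    return moved


def run_monitor(directory_name, is_good, success_dir, failure_dir,
                interval=60, max_passes=None):
    """Poll the folder forever, or max_passes times when testing."""
    passes = 0
    while max_passes is None or passes < max_passes:
        process_files(directory_name, is_good, success_dir, failure_dir)
        passes += 1
        time.sleep(interval)  # seconds, not milliseconds
```

The max_passes argument exists only so the loop can be exercised in a test without running forever.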

>
> if __name__ == "__main__":
>      main()
>
>
> [/code]
>


-- 
DaveA



#88295

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2015-03-29 19:00 -0400
Message-ID: <mailman.321.1427670045.10327.python-list@python.org>
In reply to: #88262

On Sun, 29 Mar 2015 04:32:48 -0700 (PDT), Saran Ahluwalia
<ahlusar.ahluwalia@gmail.com> declaimed the following:

>Below are the function's requirements. I am torn between using the OS module or some other quick and dirty module. In addition, my ideal assumption that this could be cross-platform. "Records" refers to contents in a file. What are some suggestions from the Pythonistas?
>
>* Monitors a folder for files that are dropped throughout the day
>
	Other than spinning over a time.sleep() call, this is likely going to
be OS specific:

	OS may provide a call-back/notification of when a directory is changed
	OS may provide a scheduled task capability that would let your task
start at regular intervals -- you'd have to maintain some file identifying
the last processed time stamp (maybe). Linux/UNIX cron jobs, Windows task
scheduler

>* When a file is dropped in the folder the program should scan the file
>
	What happens if the file is still being written? Do you record the new
file in one pass, then delay and process those files the next pass
(recording new files for later processing)?
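
A sketch of that two-pass idea, under the assumption that a file whose size and modification time are unchanged between two consecutive scans has finished being written. This is a heuristic, not a guarantee, and the function name stable_files is hypothetical.

```python
import os


def stable_files(folder, previous):
    """Return (ready, current).

    ready   -- names whose (size, mtime) match the previous pass
    current -- the snapshot to pass in as `previous` next time
    """
    current = {}
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            st = os.stat(path)
            current[name] = (st.st_size, st.st_mtime)
    ready = sorted(name for name, signature in current.items()
                   if previous.get(name) == signature)
    return ready, current
```

A file seen for the first time is only recorded; it becomes eligible for processing on the following pass if nothing about it has changed.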

>o IF all the records in the file have the same length
>
>o THEN the file should be moved to a "success" folder and a text file written indicating the total number of records processed
>
	Text file? or maybe using the logging system?
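
For comparison, a minimal sketch of the logging route using the standard logging module. The logger name "folder_monitor" and the message wording are assumptions; a logging.FileHandler could be swapped in for the stream handler to get a report file on disk.

```python
import io
import logging


def make_report_logger(stream):
    """Return a logger that writes one formatted line per report entry."""
    logger = logging.getLogger("folder_monitor")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = make_report_logger(buffer)
log.info("%s: %d records processed", "a.txt", 3)
log.error("%s: %s", "b.txt", "Empty file")
print(buffer.getvalue(), end="")
```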

>o IF the file is empty OR the records are not all of the same length
>
>o THEN the file should be moved to a "failure" folder and a text file written indicating the cause for failure (for example: Empty file or line 100 was not the same length as the rest).

	How do you intend such a report to be determined? That is, what defines
the "correct length" of the lines? The first record read? If that one is
short, you're now reporting ALL OTHER lines as erroneous. Read all lines
building a histogram of lengths and taking the mode, then reprocessing to
flag lines that do not meet the mode?
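
The histogram-and-mode approach could be sketched like this, assuming line terminators are excluded from the length. The function name odd_lines is hypothetical, and when two lengths are equally common the choice between them is arbitrary.

```python
from collections import Counter


def odd_lines(lines):
    """Return (mode_length, bad_line_numbers) for an iterable of lines."""
    lengths = [len(line.rstrip("\r\n")) for line in lines]
    if not lengths:
        return None, []  # empty input: no mode, nothing to flag
    # most_common(1) gives [(length, count)] for the most frequent length
    mode = Counter(lengths).most_common(1)[0][0]
    bad = [number for number, length in enumerate(lengths, start=1)
           if length != mode]
    return mode, bad
```

With this variant a short first record is reported as the one bad line, rather than causing every other line to be flagged.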
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
    wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/


#88299

From: Paul Rubin <no.email@nospam.invalid>
Date: 2015-03-29 18:08 -0700
Message-ID: <87a8yvs34u.fsf@jester.gateway.pace.com>
In reply to: #88262

Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> writes:
> cross-platform...
> * Monitors a folder for files that are dropped throughout the day

I don't see a cross-platform way to do that other than by waking up and
scanning the folder every so often (once a minute, say).  The Linux way
is with inotify and there's a Python module for it (search terms: python
inotify).  There might be comparable but non-identical interfaces for
other platforms.
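
The portable "wake up and scan" approach comes down to comparing the current directory listing with the previous one. A minimal sketch using only the standard library; the function name and the `seen` set are hypothetical:

```python
import os


def scan_for_new(directory, seen):
    """Return names in `directory` that are not in `seen`, and update `seen`."""
    current = set(os.listdir(directory))
    new_names = sorted(current - seen)
    seen.clear()
    seen.update(current)  # forget names that have since been removed
    return new_names
```

The caller keeps one `seen` set alive between scans and sleeps for the chosen interval (a minute, say) after each call.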


#88302

From: Chris Angelico <rosuav@gmail.com>
Date: 2015-03-30 13:04 +1100
Message-ID: <mailman.323.1427681070.10327.python-list@python.org>
In reply to: #88299

On Mon, Mar 30, 2015 at 12:08 PM, Paul Rubin <no.email@nospam.invalid> wrote:
> Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> writes:
>> cross-platform...
>> * Monitors a folder for files that are dropped throughout the day
>
> I don't see a cross-platform way to do that other than by waking up and
> scanning the folder every so often (once a minute, say).  The Linux way
> is with inotify and there's a Python module for it (search terms: python
> inotify).  There might be comparable but non-identical interfaces for
> other platforms.

All too often, "cross-platform" means probing for one option, then
another, then another, and using whichever one you can. On Windows,
there's FindFirstChangeNotification and ReadDirectoryChanges, which
Tim Golden wrote about, and which I coded up into a teleporter for
getting files out of a VM automatically:

http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
https://github.com/Rosuav/shed/blob/master/senddir.py
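
A rough sketch of the "probe for one option, then another" pattern. It only checks which modules can be imported and does not call into them; the module names in the candidate list are assumptions about what might be installed, and the polling fallback stands for the plain scan-and-sleep approach.

```python
import importlib.util
import sys

# Candidate notification backends, best first. The module names are
# assumptions about what might be installed, not requirements.
CANDIDATES = [
    ("linux", "inotify"),    # a third-party inotify binding
    ("win32", "win32file"),  # shipped with pywin32
]


def choose_backend(platform=None):
    """Return the first importable backend module name, else 'polling'."""
    platform = sys.platform if platform is None else platform
    for prefix, module_name in CANDIDATES:
        if platform.startswith(prefix) and importlib.util.find_spec(module_name):
            return module_name
    return "polling"
```

The rest of the program would branch on the returned name and keep the platform-specific code behind one common interface.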

ChrisA


#88334

From: Saran A <ahlusar.ahluwalia@gmail.com>
Date: 2015-03-30 09:45 -0700
Message-ID: <bc19cb8e-9e4f-48a7-a9f5-4b39db09a7ce@googlegroups.com>
In reply to: #88302

On Sunday, March 29, 2015 at 10:04:45 PM UTC-4, Chris Angelico wrote:
> On Mon, Mar 30, 2015 at 12:08 PM, Paul Rubin <no.email@nospam.invalid> wrote:
> > Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> writes:
> >> cross-platform...
> >> * Monitors a folder for files that are dropped throughout the day
> >
> > I don't see a cross-platform way to do that other than by waking up and
> > scanning the folder every so often (once a minute, say).  The Linux way
> > is with inotify and there's a Python module for it (search terms: python
> > inotify).  There might be comparable but non-identical interfaces for
> > other platforms.
> 
> All too often, "cross-platform" means probing for one option, then
> another, then another, and using whichever one you can. On Windows,
> there's FindFirstChangeNotification and ReadDirectoryChanges, which
> Tim Golden wrote about, and which I coded up into a teleporter for
> getting files out of a VM automatically:
> 
> http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
> https://github.com/Rosuav/shed/blob/master/senddir.py
> 
> ChrisA

@Dave, Chris, Paul and Dennis: Thank you for the resources and the notes on what I should keep in mind. I have an initial commit: https://github.com/ahlusar1989/IntroToPython/blob/master/Project1WG_with_assumptions_and_comments.py

I welcome your thoughts on this.


#88339

From: Dave Angel <davea@davea.name>
Date: 2015-03-30 14:35 -0400
Message-ID: <mailman.344.1427740538.10327.python-list@python.org>
In reply to: #88334

On 03/30/2015 12:45 PM, Saran A wrote:
> On Sunday, March 29, 2015 at 10:04:45 PM UTC-4, Chris Angelico wrote:
>> On Mon, Mar 30, 2015 at 12:08 PM, Paul Rubin <no.email@nospam.invalid> wrote:
>>> Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> writes:
>>>> cross-platform...
>>>> * Monitors a folder for files that are dropped throughout the day
>>>
>>> I don't see a cross-platform way to do that other than by waking up and
>>> scanning the folder every so often (once a minute, say).  The Linux way
>>> is with inotify and there's a Python module for it (search terms: python
>>> inotify).  There might be comparable but non-identical interfaces for
>>> other platforms.
>>
>> All too often, "cross-platform" means probing for one option, then
>> another, then another, and using whichever one you can. On Windows,
>> there's FindFirstChangeNotification and ReadDirectoryChanges, which
>> Tim Golden wrote about, and which I coded up into a teleporter for
>> getting files out of a VM automatically:
>>
>> http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
>> https://github.com/Rosuav/shed/blob/master/senddir.py
>>
>> ChrisA
>
> @Dave, Chris, Paul and Dennis: Thank you for resources and the notes regarding what I should keep in mind. I have an initial commit: https://github.com/ahlusar1989/IntroToPython/blob/master/Project1WG_with_assumptions_and_comments.py
>
> I welcome your thoughts on this
>

It's missing a number of your requirements.  But it's a start.

If it were my file, I'd have a TODO comment at the bottom stating known 
changes that are needed.  In it, I'd mention:

1) your present code is assuming all filenames come directly from the 
commandline.  No searching of a directory.

2) your present code does not move any files to success or failure 
directories

3) your present code doesn't calculate or write to a text file any 
statistics.

4) your present code runs once through the names, and terminates.  It 
doesn't "monitor" anything.

5) your present code doesn't check for zero-length files

I'd also wonder why you bother checking whether the 
os.path.getsize(file) function returns the same value as the os.SEEK_END 
and ftell() code does.  Is it that you don't trust the library?  Or that 
you have to run on Windows, where the line-ending logic can change the 
apparent file size?

I notice you're not specifying a file mode on the open.  So in Python 3, 
your sizes are going to be specified in unicode characters after 
decoding.  Is that what the spec says?  It's probably safer to 
explicitly specify the mode (and the file encoding if you're in text).
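
To make the difference concrete, here is a small sketch that measures the same lines in binary and in text mode. The helper name is hypothetical; `open()` and its `encoding` argument are standard.

```python
def line_lengths(path, mode, **kwargs):
    """Return len() of every line, read with the given open() mode."""
    with open(path, mode, **kwargs) as fh:
        return [len(line) for line in fh]
```

For a file holding the word "naïve" encoded as UTF-8 and terminated by CRLF, `line_lengths(path, "rb")` gives `[8]` (bytes, terminator included), while `line_lengths(path, "r", encoding="utf-8")` gives `[6]` (five characters plus the translated newline). Which of the two counts as the record length is a question for the spec.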

I see you call strip() before comparing the length.  Could there ever be 
leading or trailing whitespace that's significant?  Is that the actual 
specification of line size?
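
A short sketch of the two possible definitions, assuming the only thing that should never count toward the length is the line terminator. The function and parameter names are hypothetical.

```python
def record_length(line, keep_whitespace=True):
    """Length of one record, excluding only the line terminator by default."""
    if keep_whitespace:
        return len(line.rstrip("\r\n"))  # leading/trailing blanks still count
    return len(line.strip())             # blanks on both ends are discarded
```

For the line `"  ab  \n"` the first definition gives 6 and the second gives 2, so the choice can decide whether a file passes or fails.
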

-- 
DaveA


#88372

From: Saran A <ahlusar.ahluwalia@gmail.com>
Date: 2015-03-31 04:00 -0700
Message-ID: <0623f75c-bf93-4a0f-9d87-86986185cdc3@googlegroups.com>
In reply to: #88339

On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote:
> On 03/30/2015 12:45 PM, Saran A wrote:
> > On Sunday, March 29, 2015 at 10:04:45 PM UTC-4, Chris Angelico wrote:
> >> On Mon, Mar 30, 2015 at 12:08 PM, Paul Rubin <no.email@nospam.invalid> wrote:
> >>> Saran Ahluwalia <ahlusar.ahluwalia@gmail.com> writes:
> >>>> cross-platform...
> >>>> * Monitors a folder for files that are dropped throughout the day
> >>>
> >>> I don't see a cross-platform way to do that other than by waking up and
> >>> scanning the folder every so often (once a minute, say).  The Linux way
> >>> is with inotify and there's a Python module for it (search terms: python
> >>> inotify).  There might be comparable but non-identical interfaces for
> >>> other platforms.
> >>
> >> All too often, "cross-platform" means probing for one option, then
> >> another, then another, and using whichever one you can. On Windows,
> >> there's FindFirstChangeNotification and ReadDirectoryChanges, which
> >> Tim Golden wrote about, and which I coded up into a teleporter for
> >> getting files out of a VM automatically:
> >>
> >> http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
> >> https://github.com/Rosuav/shed/blob/master/senddir.py
> >>
> >> ChrisA
> >
> > @Dave, Chris, Paul and Dennis: Thank you for resources and the notes regarding what I should keep in mind. I have an initial commit: https://github.com/ahlusar1989/IntroToPython/blob/master/Project1WG_with_assumptions_and_comments.py
> >
> > I welcome your thoughts on this
> >
> 
> It's missing a number of your requirements.  But it's a start.
> 
> If it were my file, I'd have a TODO comment at the bottom stating known 
> changes that are needed.  In it, I'd mention:
> 
> 1) your present code is assuming all filenames come directly from the 
> commandline.  No searching of a directory.
> 
> 2) your present code does not move any files to success or failure 
> directories
> 
> 3) your present code doesn't calculate or write to a text file any 
> statistics.
> 
> 4) your present code runs once through the names, and terminates.  It 
> doesn't "monitor" anything.
> 
> 5) your present code doesn't check for zero-length files
> 
> I'd also wonder why you bother checking whether the 
> os.path.getsize(file) function returns the same value as the os.SEEK_END 
> and ftell() code does.  Is it that you don't trust the library?  Or that 
> you have to run on Windows, where the line-ending logic can change the 
> apparent file size?
> 
> I notice you're not specifying a file mode on the open.  So in Python 3, 
> your sizes are going to be specified in unicode characters after 
> decoding.  Is that what the spec says?  It's probably safer to 
> explicitly specify the mode (and the file encoding if you're in text).
> 
> I see you call strip() before comparing the length.  Could there ever be 
> leading or trailing whitespace that's significant?  Is that the actual 
> specification of line size?
> 
> -- 
> DaveA


@DaveA: This is a homework assignment. I am not familiar (I have only been programming for 8 months) with best practices for writing to files, adding files to folders, and searching a directory. Could you provide me with some snippets or guidance on where to place your suggestions (for your TODOs 2, 3, 4 and 5)?

I ask this because I have been searching fruitlessly for some time, and there are so many permutations that I am bamboozled as to which is considered best practice.

Moreover, as to the other comments, those are too specific. The scope of the assignment is very limited, but I am learning what I need to look out for, and what questions to ask about specs, in the future.


#88381

From: Dave Angel <davea@davea.name>
Date: 2015-03-31 09:19 -0400
Message-ID: <mailman.371.1427807964.10327.python-list@python.org>
In reply to: #88372

On 03/31/2015 07:00 AM, Saran A wrote:

> @DaveA: This is a homework assignment. .... Is it possible that you
> could provide me with some snippets or guidance on where to place your
> suggestions (for your TO DOs 2,3,4,5)?
>


> On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote:

>>
>> It's missing a number of your requirements.  But it's a start.
>>
>> If it were my file, I'd have a TODO comment at the bottom stating known
>> changes that are needed.  In it, I'd mention:
>>
>> 1) your present code is assuming all filenames come directly from the
>> commandline.  No searching of a directory.
>>
>> 2) your present code does not move any files to success or failure
>> directories
>>

In function validate_files()
Just after the line
                 print('success with %s on %d reco...
you could move the file, using shutil.  Likewise after the failure print.
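
A minimal sketch of the move itself, assuming the destination folders sit wherever the caller says and should be created on demand. The helper name is hypothetical; `shutil.move` and `os.makedirs` are standard.

```python
import os
import shutil


def move_to(path, dest_dir):
    """Move `path` into `dest_dir`, creating the directory if needed."""
    os.makedirs(dest_dir, exist_ok=True)
    destination = os.path.join(dest_dir, os.path.basename(path))
    return shutil.move(path, destination)  # returns the new location
```

What should happen when a file of the same name already exists in the destination is not covered here and would need a decision of its own.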

>> 3) your present code doesn't calculate or write to a text file any
>> statistics.

You successfully print to sys.stderr.  So you could print to some other 
file in the exact same way.
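
In other words, the same print() call works for any open text stream. A small sketch, with a hypothetical helper name:

```python
import sys


def report(message, out=sys.stderr):
    """Print `message` to any writable text stream (stderr by default)."""
    print(message, file=out)
```

Writing to a report file is then `with open("report.txt", "a") as fh: report("...", out=fh)`, where the file name is only an example.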

>>
>> 4) your present code runs once through the names, and terminates.  It
>> doesn't "monitor" anything.

Make a new function, perhaps called main(), with a loop that calls 
validate_files(), with a sleep after each pass.  Of course, unless you 
fix TODO#1, that'll keep looking for the same files.  No harm in that if 
that's the spec, since you moved the earlier versions of the files.

But if you want to "monitor" the directory, let the directory name be 
the argument to main, and let main do a dirlist each time through the 
loop, and pass the corresponding list to validate_files.
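
The loop described above might be sketched as follows. The `handle` argument stands in for validate_files(), and `max_passes` is a hypothetical knob added only so the loop can be stopped when trying it out; neither name comes from the code under discussion.

```python
import os
import time


def main(directory, handle, interval=5.0, max_passes=None):
    """Rescan `directory` every `interval` seconds and hand the paths on.

    max_passes=None means run until interrupted.
    """
    passes = 0
    while max_passes is None or passes < max_passes:
        # Take a fresh listing on every pass; that is what makes it "monitoring".
        paths = [os.path.join(directory, name)
                 for name in sorted(os.listdir(directory))]
        handle(paths)
        passes += 1
        time.sleep(interval)
```

Because the handler is expected to move each file out of the folder once it has been judged, later passes only see files that arrived in the meantime.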

>>
>> 5) your present code doesn't check for zero-length files
>>

In validate_and_process_data(), instead of checking filesize against 
ftell, check it against zero.
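
As a sketch, the zero-length test needs nothing beyond os.path.getsize(); the function name is hypothetical.

```python
import os


def is_empty_file(path):
    """True when the file at `path` contains no bytes at all."""
    return os.path.getsize(path) == 0
```

A file that contains only blank lines is not empty by this test, so whether it should count as "empty" is a separate question for the spec.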

>> I'd also wonder why you bother checking whether the
>> os.path.getsize(file) function returns the same value as the os.SEEK_END
>> and ftell() code does.  Is it that you don't trust the library?  Or that
>> you have to run on Windows, where the line-ending logic can change the
>> apparent file size?
>>
>> I notice you're not specifying a file mode on the open.  So in Python 3,
>> your sizes are going to be specified in unicode characters after
>> decoding.  Is that what the spec says?  It's probably safer to
>> explicitly specify the mode (and the file encoding if you're in text).
>>
>> I see you call strip() before comparing the length.  Could there ever be
>> leading or trailing whitespace that's significant?  Is that the actual
>> specification of line size?
>>
>> --
>> DaveA
>
>

> I ask this because I have been searching fruitlessly through for some time and there are so many permutations that I am bamboozled by which is considered best practice.
>
> Moreover, as to the other comments, those are too specific. The scope of the assignment is very limited, but I am learning what I need to look out or ask questions regarding specs - in the future.
>


-- 
DaveA


#88423

From: Saran A <ahlusar.ahluwalia@gmail.com>
Date: 2015-04-01 06:43 -0700
Message-ID: <fcd8bbbc-2d81-4797-aa0e-090683e72f3f@googlegroups.com>
In reply to: #88381

On Tuesday, March 31, 2015 at 9:19:37 AM UTC-4, Dave Angel wrote:
> On 03/31/2015 07:00 AM, Saran A wrote:
> 
>  > @DaveA: This is a homework assignment. .... Is it possible that you 
> could provide me with some snippets or guidance on where to place your 
> suggestions (for your TO DOs 2,3,4,5)?
>  >
> 
> 
> > On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote:
> 
> >>
> >> It's missing a number of your requirements.  But it's a start.
> >>
> >> If it were my file, I'd have a TODO comment at the bottom stating known
> >> changes that are needed.  In it, I'd mention:
> >>
> >> 1) your present code is assuming all filenames come directly from the
> >> commandline.  No searching of a directory.
> >>
> >> 2) your present code does not move any files to success or failure
> >> directories
> >>
> 
> In function validate_files()
> Just after the line
>                  print('success with %s on %d reco...
> you could move the file, using shutil.  Likewise after the failure print.
> 
> >> 3) your present code doesn't calculate or write to a text file any
> >> statistics.
> 
> You successfully print to sys.stderr.  So you could print to some other 
> file in the exact same way.
> 
> >>
> >> 4) your present code runs once through the names, and terminates.  It
> >> doesn't "monitor" anything.
> 
> Make a new function, perhaps called main(), with a loop that calls 
> validate_files(), with a sleep after each pass.  Of course, unless you 
> fix TODO#1, that'll keep looking for the same files.  No harm in that if 
> that's the spec, since you moved the earlier versions of the files.
> 
> But if you want to "monitor" the directory, let the directory name be 
> the argument to main, and let main do a dirlist each time through the 
> loop, and pass the corresponding list to validate_files.
> 
> >>
> >> 5) your present code doesn't check for zero-length files
> >>
> 
> In validate_and_process_data(), instead of checking filesize against 
> ftell, check it against zero.
> 
> >> I'd also wonder why you bother checking whether the
> >> os.path.getsize(file) function returns the same value as the os.SEEK_END
> >> and ftell() code does.  Is it that you don't trust the library?  Or that
> >> you have to run on Windows, where the line-ending logic can change the
> >> apparent file size?
> >>
> >> I notice you're not specifying a file mode on the open.  So in Python 3,
> >> your sizes are going to be specified in unicode characters after
> >> decoding.  Is that what the spec says?  It's probably safer to
> >> explicitly specify the mode (and the file encoding if you're in text).
> >>
> >> I see you call strip() before comparing the length.  Could there ever be
> >> leading or trailing whitespace that's significant?  Is that the actual
> >> specification of line size?
> >>
> >> --
> >> DaveA
> >
> >
> 
> > I ask this because I have been searching fruitlessly through for some time and there are so many permutations that I am bamboozled by which is considered best practice.
> >
> > Moreover, as to the other comments, those are too specific. The scope of the assignment is very limited, but I am learning what I need to look out or ask questions regarding specs - in the future.
> >
> 
> 
> -- 
> DaveA

@DaveA

My most recent commit (https://github.com/ahlusar1989/WGProjects/blob/master/P1version2.0withassumptions_mods.py) has more annotations and comments for each file.

I have attempted to address the functional requirements that you brought up:

1) Previously, my code assumed that all filenames came directly from the command line, with no searching of a directory. I think that I have addressed this.

2) My present code does not move any files to success or failure directories (a requirement for this assignment). I am still wondering if and how I should use shutil, as you advised. I keep receiving a syntax error when I add it below the print statement.

3) You correctly reminded me that my present code doesn't calculate statistics or write them, or the cause of a failure, to a text file. (Should I use the copy or copy2 method in order to provide metadata? If so, should I wrap it in try/except logic?)

4) Previously, my code ran once through the names and terminated. It didn't "monitor" anything. I think I have addressed this with the main function - correct?

5) Previously, my code didn't check for zero-length files. (I have added a comment there in case that is needed.)

I really appreciate your invaluable feedback. I have grown a lot with this assignment!

Sincerely,

Saran


#88437

From: Dave Angel <davea@davea.name>
Date: 2015-04-01 19:51 -0400
Message-ID: <mailman.16.1427932335.12925.python-list@python.org>
In reply to: #88423

On 04/01/2015 09:43 AM, Saran A wrote:
> On Tuesday, March 31, 2015 at 9:19:37 AM UTC-4, Dave Angel wrote:
>> On 03/31/2015 07:00 AM, Saran A wrote:
>>
>>   > @DaveA: This is a homework assignment. .... Is it possible that you
>> could provide me with some snippets or guidance on where to place your
>> suggestions (for your TO DOs 2,3,4,5)?
>>   >
>>
>>
>>> On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote:
>>
>>>>
>>>> It's missing a number of your requirements.  But it's a start.
>>>>
>>>> If it were my file, I'd have a TODO comment at the bottom stating known
>>>> changes that are needed.  In it, I'd mention:
>>>>
>>>> 1) your present code is assuming all filenames come directly from the
>>>> commandline.  No searching of a directory.
>>>>
>>>> 2) your present code does not move any files to success or failure
>>>> directories
>>>>
>>
>> In function validate_files()
>> Just after the line
>>                   print('success with %s on %d reco...
>> you could move the file, using shutil.  Likewise after the failure print.
>>
>>>> 3) your present code doesn't calculate or write to a text file any
>>>> statistics.
>>
>> You successfully print to sys.stderr.  So you could print to some other
>> file in the exact same way.
>>
>>>>
>>>> 4) your present code runs once through the names, and terminates.  It
>>>> doesn't "monitor" anything.
>>
>> Make a new function, perhaps called main(), with a loop that calls
>> validate_files(), with a sleep after each pass.  Of course, unless you
>> fix TODO#1, that'll keep looking for the same files.  No harm in that if
>> that's the spec, since you moved the earlier versions of the files.
>>
>> But if you want to "monitor" the directory, let the directory name be
>> the argument to main, and let main do a dirlist each time through the
>> loop, and pass the corresponding list to validate_files.
>>
>>>>
>>>> 5) your present code doesn't check for zero-length files
>>>>
>>
>> In validate_and_process_data(), instead of checking filesize against
>> ftell, check it against zero.
>>
>>>> I'd also wonder why you bother checking whether the
>>>> os.path.getsize(file) function returns the same value as the os.SEEK_END
>>>> and ftell() code does.  Is it that you don't trust the library?  Or that
>>>> you have to run on Windows, where the line-ending logic can change the
>>>> apparent file size?
>>>>
>>>> I notice you're not specifying a file mode on the open.  So in Python 3,
>>>> your sizes are going to be specified in unicode characters after
>>>> decoding.  Is that what the spec says?  It's probably safer to
>>>> explicitly specify the mode (and the file encoding if you're in text).
>>>>
>>>> I see you call strip() before comparing the length.  Could there ever be
>>>> leading or trailing whitespace that's significant?  Is that the actual
>>>> specification of line size?
>>>>
>>>> --
>>>> DaveA
>>>
>>>
>>
>>> I ask this because I have been searching fruitlessly through for some time and there are so many permutations that I am bamboozled by which is considered best practice.
>>>
>>> Moreover, as to the other comments, those are too specific. The scope of the assignment is very limited, but I am learning what I need to look out or ask questions regarding specs - in the future.
>>>
>>
>>
>> --
>> DaveA
>
> @DaveA
>
> My most recent commit (https://github.com/ahlusar1989/WGProjects/blob/master/P1version2.0withassumptions_mods.py) has more annotations and comments for each file.

Perhaps you don't realize how github works.  The whole point is it 
preserves the history of your code, and you use the same filename for 
each revision.

Or possibly it's I that doesn't understand it.  I use git, but haven't 
actually used github for my own code.

>
> I have attempted to address the functional requirements that you brought up:
>
> 1) Before, my present code was assuming all filenames come directly from the commandline.  No searching of a directory. I think that I have addressed this.
>

Have you even tried to run the code?  It quits immediately with an 
exception since your call to main() doesn't pass any arguments, and main 
requires one.

 > def main(dirslist):
 >     while True:
 >         for file in dirslist:
 >         	return validate_files(file)
 >         	time.sleep(5)

In addition, you aren't actually doing anything to find what the files 
in the directory are.  I tried to refer to dirlist, as a hint.  A 
stronger hint:  look up  os.listdir()

And that list of files has to change each time through the while loop, 
that's the whole meaning of scanning.  You don't just grab the names 
once, you look to see what's there.

The next thing is that you're using a variable called 'file', which is 
the name of a built-in type in Python 2.  So you really want to use a 
different name.

Next, you have a loop through the magical dirslist to get individual 
filenames, but then you call the validate_files() function with a single 
file, but that function is expecting a list of filenames.  One or the 
other has to change.

Next, you return from main after validating the first file, so no others 
will get processed.
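
The effect of a return inside the for loop can be seen in isolation with two toy functions (hypothetical names, unrelated to the assignment's own code):

```python
def first_only(items):
    """Buggy pattern: the return ends the function on the first iteration."""
    for item in items:
        return item.upper()


def all_items(items):
    """Collect every result and return once, after the loop has finished."""
    results = []
    for item in items:
        results.append(item.upper())
    return results
```

`first_only(["a", "b"])` gives `"A"` and never looks at `"b"`, while `all_items(["a", "b"])` gives `["A", "B"]`. Any statement placed after the return inside the loop body, such as a sleep, is never reached.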

> 2) My present code does not move any files to success or failure directories (requirements for this assignment1). I am still wondering if and how I should use shututil() like you advised me to. I keep receiving a syntax error when declaring this below the print statement.

You had a function to do that in your very first post to this thread. 
Have you forgotten it already?  As for syntax errors, you can't expect 
any help on that when you don't show any code nor the syntax error 
traceback.  Remember that when a syntax error is shown, it's frequently 
the previous line or two that actually was wrong.

>
> 3) You correctly reminded me that my present code doesn't calculate or write to a text file any statistics or errors for the cause of the error.

 > # I wrote an error class in case I needed such a specification for a notification - this is not in the requirements
 > class ErrorHandler:
 >
 >     def __init__(self):
 >         pass
 >
 >     def write(self, string):
 >
 >         # write error to file
 >         fname = " Error report (" + time.strftime("%Y-%m-%d %I-%M%p") + ").txt"
 >         handler = open(fname, "w")
 >         handler.write(string)
 >         handler.close()




This looks like Java code.  With no data attributes, and only one 
method(function), what's the reason you don't just use a function?


 > original_stderr = sys.stderr # stored in case want to revert
 > sys.stderr = ErrorHandler()

This is just bogus.  There are rare times when a programmer might want 
to replace stderr, but that's just unnecessarily confusing in a program 
like this one.  When you want to write printable data to another file, 
just pass that file's handle as the file= argument of print().  You 
could use an instance of ErrorHandler as a file handle.  Or do it one 
of a dozen other straightforward ways.

While we're at it, why do you keep creating new files for your error 
report?  And overwriting all the old messages with whatever new ones 
come in the same minute?
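
One of those straightforward ways, sketched under the assumption that a single long-lived report file opened in append mode is acceptable. The function name and the default file name are hypothetical.

```python
import time


def log_error(message, log_path="error_report.txt"):
    """Append one timestamped line to a single, long-lived report file."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    # Mode "a" adds to the end, so earlier messages are never overwritten.
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write("%s %s\n" % (stamp, message))
```

A plain function like this replaces the one-method class, and the timestamp moves from the file name into each line.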


> (Should I use the copy or copy2 method in order provide metadata? If so, should I wrap it into a try and except logic?)

No idea what you mean here.  What metadata?  And what do copy and copy2 
have to do with anything here?

>
> 4) Before, my present code runs once through the names, and terminates.  It doesn't "monitor" anything. I think I have addressed this with the  main function - correct?
>
Not even close.

> 5) Before, my present code doesn't check for zero-length files  - I have added a comment there in case that is needed)
>
> I realize appreciate your invaluable feedback. I have grown a lot with this assignment!
>
> Sincerely,
>
> Saran
>


-- 
DaveA


#88448

From: Saran A <ahlusar.ahluwalia@gmail.com>
Date: 2015-04-02 06:06 -0700
Message-ID: <ba29c06d-3cdc-412e-90a0-b1119a360570@googlegroups.com>
In reply to: #88437

On Wednesday, April 1, 2015 at 7:52:27 PM UTC-4, Dave Angel wrote:
> On 04/01/2015 09:43 AM, Saran A wrote:
> > On Tuesday, March 31, 2015 at 9:19:37 AM UTC-4, Dave Angel wrote:
> >> On 03/31/2015 07:00 AM, Saran A wrote:
> >>
> >>   > @DaveA: This is a homework assignment. .... Is it possible that you
> >> could provide me with some snippets or guidance on where to place your
> >> suggestions (for your TO DOs 2,3,4,5)?
> >>   >
> >>
> >>
> >>> On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote:
> >>
> >>>>
> >>>> It's missing a number of your requirements.  But it's a start.
> >>>>
> >>>> If it were my file, I'd have a TODO comment at the bottom stating known
> >>>> changes that are needed.  In it, I'd mention:
> >>>>
> >>>> 1) your present code is assuming all filenames come directly from the
> >>>> commandline.  No searching of a directory.
> >>>>
> >>>> 2) your present code does not move any files to success or failure
> >>>> directories
> >>>>
> >>
> >> In function validate_files()
> >> Just after the line
> >>                   print('success with %s on %d reco...
> >> you could move the file, using shutil.  Likewise after the failure print.
> >>
> >>>> 3) your present code doesn't calculate or write to a text file any
> >>>> statistics.
> >>
> >> You successfully print to sys.stderr.  So you could print to some other
> >> file in the exact same way.
> >>
> >>>>
> >>>> 4) your present code runs once through the names, and terminates.  It
> >>>> doesn't "monitor" anything.
> >>
> >> Make a new function, perhaps called main(), with a loop that calls
> >> validate_files(), with a sleep after each pass.  Of course, unless you
> >> fix TODO#1, that'll keep looking for the same files.  No harm in that if
> >> that's the spec, since you moved the earlier versions of the files.
> >>
> >> But if you want to "monitor" the directory, let the directory name be
> >> the argument to main, and let main do a dirlist each time through the
> >> loop, and pass the corresponding list to validate_files.
> >>
> >>>>
> >>>> 5) your present code doesn't check for zero-length files
> >>>>
> >>
> >> In validate_and_process_data(), instead of checking filesize against
> >> ftell, check it against zero.
> >>
> >>>> I'd also wonder why you bother checking whether the
> >>>> os.path.getsize(file) function returns the same value as the os.SEEK_END
> >>>> and ftell() code does.  Is it that you don't trust the library?  Or that
> >>>> you have to run on Windows, where the line-ending logic can change the
> >>>> apparent file size?
> >>>>
> >>>> I notice you're not specifying a file mode on the open.  So in Python 3,
> >>>> your sizes are going to be specified in unicode characters after
> >>>> decoding.  Is that what the spec says?  It's probably safer to
> >>>> explicitly specify the mode (and the file encoding if you're in text).
> >>>>
> >>>> I see you call strip() before comparing the length.  Could there ever be
> >>>> leading or trailing whitespace that's significant?  Is that the actual
> >>>> specification of line size?
> >>>>
> >>>> --
> >>>> DaveA
> >>>
> >>>
> >>
> >>> I ask this because I have been searching fruitlessly for some time, and there are so many permutations that I am bamboozled as to which is considered best practice.
> >>>
> >>> Moreover, as to the other comments, those are too specific. The scope of the assignment is very limited, but I am learning what I need to look out for, or ask questions about, regarding specs in the future.
> >>>
> >>
> >>
> >> --
> >> DaveA
> >
> > @DaveA
> >
> > My most recent commit (https://github.com/ahlusar1989/WGProjects/blob/master/P1version2.0withassumptions_mods.py) has more annotations and comments for each file.
> 
> Perhaps you don't realize how github works.  The whole point is it 
> preserves the history of your code, and you use the same filename for 
> each revision.
> 
> Or possibly it's I that doesn't understand it.  I use git, but haven't 
> actually used github for my own code.
> 
> >
> > I have attempted to address the functional requirements that you brought up:
> >
> > 1) Before, my present code was assuming all filenames come directly from the commandline.  No searching of a directory. I think that I have addressed this.
> >
> 
> Have you even tried to run the code?  It quits immediately with an 
> exception since your call to main() doesn't pass any arguments, and main 
> requires one.
> 
>  > def main(dirslist):
>  >     while True:
>  >         for file in dirslist:
>  >         	return validate_files(file)
>  >         	time.sleep(5)
> 
> In addition, you aren't actually doing anything to find what the files 
> in the directory are.  I tried to refer to dirlist, as a hint.  A 
> stronger hint:  look up  os.listdir()
> 
> And that list of files has to change each time through the while loop, 
> that's the whole meaning of scanning.  You don't just grab the names 
> once, you look to see what's there.
> 
> The next thing is that you're using a variable called 'file', while 
> that's a built-in type in Python.  So you really want to use a different 
> name.
> 
> Next, you have a loop through the magical dirslist to get individual 
> filenames, but then you call the validate_files() function with a single 
> file, but that function is expecting a list of filenames.  One or the 
> other has to change.
> 
> Next, you return from main after validating the first file, so no others 
> will get processed.
> 
> > 2) My present code does not move any files to success or failure directories (requirements for this assignment1). I am still wondering if and how I should use shutil like you advised me to. I keep receiving a syntax error when declaring this below the print statement.
> 
> You had a function to do that in your very first post to this thread. 
> Have you forgotten it already?  As for syntax errors, you can't expect 
> any help on that when you don't show any code nor the syntax error 
> traceback.  Remember that when a syntax error is shown, it's frequently 
> the previous line or two that actually was wrong.
> 
> >
> > 3) You correctly reminded me that my present code doesn't calculate or write to a text file any statistics or errors for the cause of the error.
> 
>  > # I wrote an error class in case I needed such a specification for a notification - this is not in the requirements
>  > class ErrorHandler:
>  >
>  >     def __init__(self):
>  >         pass
>  >
>  >     def write(self, string):
>  >
>  >         # write error to file
>  >         fname = " Error report (" + time.strftime("%Y-%m-%d %I-%M%p") + ").txt"
>  >         handler = open(fname, "w")
>  >         handler.write(string)
>  >         handler.close()
> 
> 
> 
> 
> This looks like Java code.  With no data attributes, and only one 
> method(function), what's the reason you don't just use a function?
> 
> 
> original_stderr = sys.stderr # stored in case want to revert
> sys.stderr = ErrorHandler()
> 
> This is just bogus.  There are rare times when a programmer might want 
> to replace stderr, but that's just unnecessarily confusing in a program 
> like this one.  When you want to write printable data to another file, 
> just use that file's handle as the argument to the file= argument of 
> print().  You could use an instance of ErrorHandler as a file handle. 
> Or do it one of a dozen other straightforward ways.
> 
> While we're at it, why do you keep creating new files for your error 
> report?  And overwriting all the old messages with whatever new ones 
> come in the same minute?
> 
> 
> > (Should I use the copy or copy2 method in order provide metadata? If so, should I wrap it into a try and except logic?)
> 
> No idea what you mean here.  What metadata?  And what do copy and copy2 
> have to do with anything here?
> 
> >
> > 4) Before, my present code runs once through the names, and terminates.  It doesn't "monitor" anything. I think I have addressed this with the  main function - correct?
> >
> Not even close.
> 
> > 5) Before, my present code doesn't check for zero-length files  - I have added a comment there in case that is needed)
> >
> > I really appreciate your invaluable feedback. I have grown a lot with this assignment!
> >
> > Sincerely,
> >
> > Saran
> >
> 
> 
> -- 
> DaveA

@DaveA

Thanks for your help on this homework assignment. I started from scratch last night. I have added some comments that will perhaps help clarify my intentions and my thought process. Thanks again. 

from __future__ import print_function

import os
import time
import glob
import sys

def initialize_logger(output_dir):
    logger = logging.getLogger()
    logger.setLevel(logging.DEBUG)
     
    # create console handler and set level to info
    handler = logging.StreamHandler()
    handler.setLevel(logging.INFO)
    formatter = logging.Formatter("%(levelname)s - %(message)s")
    handler.setFormatter(formatter)
    logger.addHandler(handler)
 
    # create error file handler and set level to error
    handler = logging.FileHandler(os.path.join(output_dir, "error.log"),"w", encoding=None, delay="true")
    handler.setLevel(logging.ERROR)
    formatter = logging.Formatter("%(levelname)s - %(message)s")
    handler.setFormatter(formatter)
    logger.addHandler(handler)

    # create debug file handler and set level to debug
    handler = logging.FileHandler(os.path.join(output_dir, "all.log"),"w")
    handler.setLevel(logging.DEBUG)
    formatter = logging.Formatter("%(levelname)s - %(message)s")
    handler.setFormatter(formatter)
    logger.addHandler(handler)


#Helper Functions for the Success and Failure Folder Outcomes, respectively

    def file_len(filename):
        with open(filename) as f:
            for i, l in enumerate(f):
                pass
            return i + 1


    def copy_and_move_File(src, dest):
        try:
            shutil.rename(src, dest)
        # eg. src and dest are the same file
        except shutil.Error as e:
            print('Error: %s' % e)
        # eg. source or destination doesn't exist
        except IOError as e:
            print('Error: %s' % e.strerror)


# Call main(), with a loop that calls validate_files(), with a sleep after each pass.
# Before, my present code was assuming all filenames come directly from the commandline.
# There was no actual searching of a directory.

# I am assuming that this is appropriate since I moved the earlier versions of the files. 
# I let the directory name be the argument to main, and let main do a dirlist each time through the loop, 
# and pass the corresponding list to validate_files. 


path = "/some/sample/path/"
dirlist = os.listdir(path)
before = dict([(f, None) for f in dirlist)
 
#####Syntax Error?     before = dict([(f, None) for f in dirlist)
                                             ^
SyntaxError: invalid syntax

def main(dirlist):
    while True:
        time.sleep(10) #time between update check
    after = dict([(f, None) for f in dirlist)
    added = [f for f in after if not f in before]
    if added:
        print('Sucessfully added new file - ready to validate')
      ####add return statement here to pass to validate_files
if __name__ == "__main__": 
    main() 


#check for record time and record length - logic to be written to either pass to Failure or Success folder respectively

def validate_files():
    creation = time.ctime(os.path.getctime(added))
    lastmod = time.ctime(os.path.getmtime(added))
    
 

#Potential Additions/Substitutions  - what are the implications/consequences for this 

def move_to_failure_folder_and_return_error_file():
    os.mkdir('Failure')
    copy_and_move_File(filename, 'Failure')
    initialize_logger('rootdir/Failure')
    logging.error("Either this file is empty or there are no lines")
     
             
def move_to_success_folder_and_read(f):
    os.mkdir('Success')
    copy_and_move_File(filename, 'Success')
    print("Success", f)
    return file_len()

#This simply checks the file information by name------> is this needed anymore?

def fileinfo(file):
    filename = os.path.basename(f)
    rootdir = os.path.dirname(f)  
    filesize = os.path.getsize(f)
    return filename, rootdir, filesize

if __name__ == '__main__':
   import sys
   validate_files(sys.argv[1:])

# -- end of file



#88460

From: Dave Angel <davea@davea.name>
Date: 2015-04-02 17:10 -0400
Message-ID: <mailman.23.1428009062.12925.python-list@python.org>
In reply to: #88448
On 04/02/2015 09:06 AM, Saran A wrote:

>
> Thanks for your help on this homework assignment. I started from scratch last night. I have added some comments that will perhaps help clarify my intentions and my thought process. Thanks again.
>
> from __future__ import print_function

I'll just randomly comment on some things I see here.  You've started 
several threads, on two different forums, so it's impractical to figure 
out what's really up.


    <snip some code I'm not commenting on>
>
> #Helper Functions for the Success and Failure Folder Outcomes, respectively
>
>      def file_len(filename):

This is an indentation error, as you forgot to start at the left margin

>          with open(filename) as f:
>              for i, l in enumerate(f):
>                  pass
>              return i + 1
>
>
>      def copy_and_move_File(src, dest):

ditto

>          try:
>              shutil.rename(src, dest)

Is there a reason you don't use the move function?  rename won't work if 
the two directories aren't on the same file system.
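
The point about rename versus move can be sketched as follows. This is only an illustration, not code from the thread: the helper name move_file and its return convention are assumptions. It relies on shutil.move, which falls back to copying and deleting when a plain rename is not possible.

```python
import os
import shutil


def move_file(src, dest_dir):
    """Move src into dest_dir, creating dest_dir if it is missing.

    Hypothetical helper: returns the new path, or None if the move failed.
    """
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    target = os.path.join(dest_dir, os.path.basename(src))
    try:
        # Unlike os.rename, shutil.move copies and then deletes when the
        # source and destination are on different file systems.
        shutil.move(src, target)
    except (IOError, OSError, shutil.Error) as e:
        print('Error moving %s: %s' % (src, e))
        return None
    return target
```

Returning None on failure keeps the caller in charge of deciding what a failed move means; raising the exception instead would be an equally reasonable choice.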

>          # eg. src and dest are the same file
>          except shutil.Error as e:
>              print('Error: %s' % e)
>          # eg. source or destination doesn't exist
>          except IOError as e:
>              print('Error: %s' % e.strerror)
>
>
> # Call main(), with a loop that calls # validate_files(), with a sleep after each pass. Before, my present #code was assuming all filenames come directly from the commandline.  There was no actual searching #of a directory.
>
> # I am assuming that this is appropriate since I moved the earlier versions of the files.
> # I let the directory name be the argument to main, and let main do a dirlist each time through the loop,
> # and pass the corresponding list to validate_files.
>
>
> path = "/some/sample/path/"
> dirlist = os.listdir(path)
> before = dict([(f, None) for f in dirlist)
>
> #####Syntax Error?     before = dict([(f, None) for f in dirlist)
>                                               ^
> SyntaxError: invalid syntax

Look at the line in question. There's an unmatched set of brackets.  Not 
that it matters, since you don't need these 2 lines for anything.  See 
my comments on some other forum.

>
> def main(dirlist):

bad name for a directory path variable.

>      while True:
>          time.sleep(10) #time between update check

Somewhere inside this loop, you want to obtain a list of files in the 
specified directory.  And you want to do something with that list.  You 
don't have to worry about what the files were last time, because 
presumably those are gone.  Unless in an unwritten part of the spec, 
you're supposed to abort if any filename is repeated over time.
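
A minimal sketch of such a loop, assuming the processed files are moved or removed by whatever handles them. The names list_files, watch, handle and max_passes are made up for illustration; max_passes exists only so the loop can be stopped in a test.

```python
import os
import time


def list_files(directory):
    """Return the full paths of the regular files currently in directory."""
    paths = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):  # skip subdirectories such as Success/
            paths.append(path)
    return paths


def watch(directory, handle, interval=10, max_passes=None):
    """Call handle(path) for every file found, re-listing on each pass.

    max_passes is a hypothetical knob for testing; None means run forever.
    """
    passes = 0
    while max_passes is None or passes < max_passes:
        # The directory is listed again on every pass, so files that
        # arrived since the last pass are picked up.
        for path in list_files(directory):
            handle(path)
        passes += 1
        time.sleep(interval)
```

Called as watch(sys.argv[1], some_function), this also gives the directory path a clearer name than dirlist and supplies the argument that the loop needs.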


>      after = dict([(f, None) for f in dirlist)
>      added = [f for f in after if not f in before]
>      if added:
>          print('Sucessfully added new file - ready to validate')
>        ####add return statement here to pass to validate_files
> if __name__ == "__main__":
>      main()

You'll need an argument to call main()

>
>
> #check for record time and record length - logic to be written to either pass to Failure or Success folder respectively
>
> def validate_files():

Where are all the parameters to this function?

>      creation = time.ctime(os.path.getctime(added))
>      lastmod = time.ctime(os.path.getmtime(added))
>
>
>
> #Potential Additions/Substitutions  - what are the implications/consequences for this
>
> def move_to_failure_folder_and_return_error_file():
>      os.mkdir('Failure')
>      copy_and_move_File(filename, 'Failure')
>      initialize_logger('rootdir/Failure')
>      logging.error("Either this file is empty or there are no lines")
>
>
> def move_to_success_folder_and_read(f):
>      os.mkdir('Success')
>      copy_and_move_File(filename, 'Success')
>      print("Success", f)
>      return file_len()
>
> #This simply checks the file information by name------> is this needed anymore?
>
> def fileinfo(file):
>      filename = os.path.basename(f)
>      rootdir = os.path.dirname(f)
>      filesize = os.path.getsize(f)
>      return filename, rootdir, filesize
>
> if __name__ == '__main__':
>     import sys
>     validate_files(sys.argv[1:])
>
> # -- end of file
>


-- 
DaveA



#88463

From: Saran A <ahlusar.ahluwalia@gmail.com>
Date: 2015-04-02 16:43 -0700
Message-ID: <4d3a3336-5dc1-4fe8-8ef6-1616c6629d01@googlegroups.com>
In reply to: #88460
On Thursday, April 2, 2015 at 5:11:20 PM UTC-4, Dave Angel wrote:
> On 04/02/2015 09:06 AM, Saran A wrote:
> 
> >
> > Thanks for your help on this homework assignment. I started from scratch last night. I have added some comments that will perhaps help clarify my intentions and my thought process. Thanks again.
> >
> > from __future__ import print_function
> 
> I'll just randomly comment on some things I see here.  You've started 
> several threads, on two different forums, so it's impractical to figure 
> out what's really up.
> 
> 
>     <snip some code I'm not commenting on>
> >
> > #Helper Functions for the Success and Failure Folder Outcomes, respectively
> >
> >      def file_len(filename):
> 
> This is an indentation error, as you forgot to start at the left margin
> 
> >          with open(filename) as f:
> >              for i, l in enumerate(f):
> >                  pass
> >              return i + 1
> >
> >
> >      def copy_and_move_File(src, dest):
> 
> ditto
> 
> >          try:
> >              shutil.rename(src, dest)
> 
> Is there a reason you don't use the move function?  rename won't work if 
> the two directories aren't on the same file system.
> 
> >          # eg. src and dest are the same file
> >          except shutil.Error as e:
> >              print('Error: %s' % e)
> >          # eg. source or destination doesn't exist
> >          except IOError as e:
> >              print('Error: %s' % e.strerror)
> >
> >
> > # Call main(), with a loop that calls # validate_files(), with a sleep after each pass. Before, my present #code was assuming all filenames come directly from the commandline.  There was no actual searching #of a directory.
> >
> > # I am assuming that this is appropriate since I moved the earlier versions of the files.
> > # I let the directory name be the argument to main, and let main do a dirlist each time through the loop,
> > # and pass the corresponding list to validate_files.
> >
> >
> > path = "/some/sample/path/"
> > dirlist = os.listdir(path)
> > before = dict([(f, None) for f in dirlist)
> >
> > #####Syntax Error?     before = dict([(f, None) for f in dirlist)
> >                                               ^
> > SyntaxError: invalid syntax
> 
> Look at the line in question. There's an unmatched set of brackets.  Not 
> that it matters, since you don't need these 2 lines for anything.  See 
> my comments on some other forum.
> 
> >
> > def main(dirlist):
> 
> bad name for a directory path variable.
> 
> >      while True:
> >          time.sleep(10) #time between update check
> 
> Somewhere inside this loop, you want to obtain a list of files in the 
> specified directory.  And you want to do something with that list.  You 
> don't have to worry about what the files were last time, because 
> presumably those are gone.  Unless in an unwritten part of the spec, 
> you're supposed to abort if any filename is repeated over time.
> 
> 
> >      after = dict([(f, None) for f in dirlist)
> >      added = [f for f in after if not f in before]
> >      if added:
> >          print('Sucessfully added new file - ready to validate')
> >        ####add return statement here to pass to validate_files
> > if __name__ == "__main__":
> >      main()
> 
> You'll need an argument to call main()
> 
> >
> >
> > #check for record time and record length - logic to be written to either pass to Failure or Success folder respectively
> >
> > def validate_files():
> 
> Where are all the parameters to this function?
> 
> >      creation = time.ctime(os.path.getctime(added))
> >      lastmod = time.ctime(os.path.getmtime(added))
> >
> >
> >
> > #Potential Additions/Substitutions  - what are the implications/consequences for this
> >
> > def move_to_failure_folder_and_return_error_file():
> >      os.mkdir('Failure')
> >      copy_and_move_File(filename, 'Failure')
> >      initialize_logger('rootdir/Failure')
> >      logging.error("Either this file is empty or there are no lines")
> >
> >
> > def move_to_success_folder_and_read(f):
> >      os.mkdir('Success')
> >      copy_and_move_File(filename, 'Success')
> >      print("Success", f)
> >      return file_len()
> >
> > #This simply checks the file information by name------> is this needed anymore?
> >
> > def fileinfo(file):
> >      filename = os.path.basename(f)
> >      rootdir = os.path.dirname(f)
> >      filesize = os.path.getsize(f)
> >      return filename, rootdir, filesize
> >
> > if __name__ == '__main__':
> >     import sys
> >     validate_files(sys.argv[1:])
> >
> > # -- end of file
> >
> 
> 
> -- 
> DaveA

@DaveA

I debugged and rewrote everything. Here is the full version. Feel free to tear this apart. The homework assignment is not due until tomorrow, so I am also experimenting with pyinotify. I do have questions regarding how to make this function compatible with the ProcessEvent class. I will create another post for this. 

What would you advise in regards to renaming the inaptly named dirlist?

# # # Without data to examine here, I can only guess based on this requirement's language that 
# # fixed records are in the input.  If so, here's a slight revision to the helper functions that I wrote earlier which
# # takes the function fileinfo as a starting point and demonstrates calling a function from within a function.  
# I tested this little sample on a small set of files created with MD5 checksums.  I wrote the Python in such a way as it 
# would work with Python 2.x or 3.x (note the __future__ at the top).

# # # There are so many wonderful ways of failure, so, from a development standpoint, I would probably spend a bit 
# # more time trying to determine which failure(s) I would want to report to the user, and how (perhaps creating my own Exceptions)

# # # The only other comments I would make are about safe-file handling.

# # #   #1:  Question: After a user has created a file that has failed (in
# # #        processing),can the user create a file with the same name?
# # #        If so, then you will probably want to look at some sort
# # #        of file-naming strategy to avoid overwriting evidence of
# # #        earlier failures.

# # # File naming is a tricky thing.  I referenced the tempfile module [1] and the Maildir naming scheme to see two different 
# # types of solutions to the problem of choosing a unique filename.

## I am assuming that all of my files are going to be specified in unicode  

## Utilized Spyder's Scientific Computing IDE to debug, check for indentation errors and test function suite

from __future__ import print_function

import os.path
import time
import logging


def initialize_logger(output_dir):
    logger = logging.getLogger()
    logger.setLevel(logging.DEBUG)
     
    # create console handler and set level to info
    handler = logging.StreamHandler()
    handler.setLevel(logging.INFO)
    formatter = logging.Formatter("%(levelname)s - %(message)s")
    handler.setFormatter(formatter)
    logger.addHandler(handler)
 
    # create error file handler and set level to error
    handler = logging.FileHandler(os.path.join(output_dir, "error.log"),"w", encoding=None, delay="true")
    handler.setLevel(logging.ERROR)
    formatter = logging.Formatter("%(levelname)s - %(message)s")
    handler.setFormatter(formatter)
    logger.addHandler(handler)

    # create debug file handler and set level to debug
    handler = logging.FileHandler(os.path.join(output_dir, "all.log"),"w")
    handler.setLevel(logging.DEBUG)
    formatter = logging.Formatter("%(levelname)s - %(message)s")
    handler.setFormatter(formatter)
    logger.addHandler(handler)


#Returns filename, rootdir and filesize 

def fileinfo(f):
    filename = os.path.basename(f)
    rootdir = os.path.dirname(f)  
    filesize = os.path.getsize(f)
    return filename, rootdir, filesize

#returns length of file
def file_len(f):
    with open(f) as f:
        for i, l in enumerate(f):
            pass
            return i + 1
#attempts to copy file and move file to its directory
def copy_and_move_file(src, dest):
    try:
        os.rename(src, dest)
        # eg. src and dest are the same file
    except IOError as e:
        print('Error: %s' % e.strerror)

path = "."
dirlist = os.listdir(path)
 
def main(dirlist):   
    before = dict([(f, 0) for f in dirlist])
    while True:
        time.sleep(1) #time between update check
    after = dict([(f, None) for f in dirlist])
    added = [f for f in after if not f in before]
    if added:
        f = ''.join(added)
        print('Sucessfully added %s file - ready to validate') %()
        return validate_files(f)
    else:
        return move_to_failure_folder_and_return_error_file(f)


    
def validate_files(f):
    creation = time.ctime(os.path.getctime(f))
    lastmod = time.ctime(os.path.getmtime(f))
    if creation == lastmod and file_len(f) > 0:
        return move_to_success_folder_and_read(f)
    if file_len < 0 and creation != lastmod:
        return move_to_success_folder_and_read(f)
    else:
        return move_to_failure_folder_and_return_error_file(f)


#Potential Additions/Substitutions

def move_to_failure_folder_and_return_error_file():
    filename, rootdir, lastmod, creation, filesize = fileinfo(file)  
    os.mkdir('Failure')
    copy_and_move_file( 'Failure')
    initialize_logger('rootdir/Failure')
    logging.error("Either this file is empty or there are no lines")
     
             
def move_to_success_folder_and_read():
    filename, rootdir, lastmod, creation, filesize = fileinfo(file)  
    os.mkdir('Success')
    copy_and_move_file(rootdir, 'Success') #file name
    print("Success", file)
    return file_len(file)



if __name__ == '__main__':
   main(dirlist)



#88476

From: Peter Otten <__peter__@web.de>
Date: 2015-04-03 12:45 +0200
Message-ID: <mailman.29.1428057967.12925.python-list@python.org>
In reply to: #88463
Saran A wrote:

> I debugged and rewrote everything. Here is the full version. Feel free to
> tear this apart. The homework assignment is not due until tomorrow, so I
> am currently also experimenting with pyinotify as well.

Saran, try to make a realistic assessment of your capability. Your 
"debugged" code still has this gem

>    while True:
>        time.sleep(1) #time between update check

and you are likely to miss your deadline anyway.

While the inotify approach may be the "right way" to attack this problem 
from the point of view of an experienced developer like Chris, as a newbie 
you should stick to the simplest thing that can possibly work.

In a previous post I mentioned unit tests, but even an ad-hoc test in the 
interactive interpreter will show that your file_len() function doesn't work

> #returns length of file
> def file_len(f):
>     with open(f) as f:
>         for i, l in enumerate(f):
>             pass
>             return i + 1
> 

$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> #returns length of file
... def file_len(f):
...     with open(f) as f:
...         for i, l in enumerate(f):
...             pass
...             return i + 1
... 
>>> with open("tmp.txt", "w") as f: f.write("abcde")
... 
>>> file_len("tmp.txt")
1

It reports 1 for every non-empty file, because the return statement sits 
inside the loop body and fires on the first line.
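
For comparison, here is a sketch of a version that counts lines, assuming that is what "length of file" is meant to be. The return statement is moved out of the loop, and an empty file yields 0 instead of falling off the end of the function.

```python
def file_len(path):
    """Return the number of lines in the file at path (0 if it is empty)."""
    count = 0
    with open(path) as f:
        # enumerate(f, 1) numbers the lines from 1, so after the loop
        # count holds the number of the last line seen.
        for count, _line in enumerate(f, 1):
            pass
    return count
```

Note that a file containing "abcde" with no newline is one line, so this version still returns 1 for it. If the length is meant to be a size in bytes, os.path.getsize(path) gives that directly and no loop is needed.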

A simple unit test would look like this (assuming watchscript.py is the name 
of the script to be tested):

import unittest

from watchscript import file_len

class FileLen(unittest.TestCase):
    def test_five(self):
        LEN = 5
        FILENAME = "tmp.txt"
        with open(FILENAME, "w") as f:
            f.write("*" * LEN)

        self.assertEqual(file_len(FILENAME), 5)
        
if __name__ == "__main__":
    unittest.main()

Typically for every tested function you add a new TestCase subclass and for 
every test you perform for that function you add a test_...() method to that 
class. The unittest.main() will collect these methods, run them and generate 
a report. For the above example it will complain:

$ python test_watchscript.py 
F
======================================================================
FAIL: test_five (__main__.FileLen)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_watchscript.py", line 12, in test_five
    self.assertEqual(file_len(FILENAME), 5)
AssertionError: 1 != 5

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)


The file_len() function is but an example of the many problems with your 
code. There is no shortcut, you have to decide for every function what 
exactly it should do and then verify that it actually does what it is meant 
to do. Try to negotiate an extra week or two with your supervisor and then 
start small:

On day one make a script that moves all files from one directory to another.
Make sure that the source and destination directory are only stored in one 
place in your script so that they can easily be replaced. Run the script 
after every change and devise tests that demonstrate that it works 
correctly.
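
A possible shape for the day-one script, as a sketch only. The directory names "incoming" and "processed" and the function name move_all are placeholders, not part of the assignment.

```python
import os
import shutil

# The two directories are named in exactly one place so they are easy
# to replace. Both names are hypothetical.
SOURCE_DIR = "incoming"
DEST_DIR = "processed"


def move_all(source_dir=SOURCE_DIR, dest_dir=DEST_DIR):
    """Move every regular file from source_dir to dest_dir.

    Returns the sorted list of file names that were moved.
    """
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    moved = []
    for name in sorted(os.listdir(source_dir)):
        src = os.path.join(source_dir, name)
        if os.path.isfile(src):  # leave subdirectories alone
            shutil.move(src, os.path.join(dest_dir, name))
            moved.append(name)
    return moved
```

Because the directories are parameters with defaults, a test can point the function at temporary directories without touching the real ones.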

On day two make a script that checks all files in a directory. Put the 
checks into a function that takes a file path as its only parameter.
Write a few correct and a few broken files and test that only the correct 
ones are recognized as correct, and only the broken ones are flagged as 
broken, and that all files in the directory are checked.
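
The day-two step might look like the sketch below. The actual validity rules are not stated in this thread, so the check used here (file is non-empty and every line has the same length) is purely an assumption standing in for the real spec, and the names is_valid and check_directory are made up.

```python
import os


def is_valid(path):
    """Return True if the file is non-empty and all lines share one length.

    Assumed rule, not the real spec: one record per line, compared
    without the line terminator.
    """
    if os.path.getsize(path) == 0:
        return False
    lengths = set()
    with open(path) as f:
        for line in f:
            lengths.add(len(line.rstrip("\r\n")))
    return len(lengths) == 1


def check_directory(directory):
    """Map each file name in directory to the result of is_valid()."""
    results = {}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            results[name] = is_valid(path)
    return results
```

Keeping the check in a function that takes only a path makes it easy to feed it a handful of hand-written good and broken files.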

On day three make a script that combines the above and only moves the 
correct files into another directory. Devise tests to verify that both 
directories contain the expected files before and after your script is 
executed.

On day four make a script that also moves the bad files into another 
directory. Modify your tests from the previous day to check the contents of 
all three directories.
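
Days three and four combined could be sketched like this. The function name sort_files and its parameters are illustrative; check stands for whatever validation function was written on day two.

```python
import os
import shutil


def sort_files(source_dir, good_dir, bad_dir, check):
    """Move each file in source_dir to good_dir or bad_dir.

    check is any callable taking a path and returning True or False.
    Returns (good_names, bad_names) so a test can inspect the outcome.
    """
    for d in (good_dir, bad_dir):
        if not os.path.isdir(d):
            os.makedirs(d)
    good, bad = [], []
    for name in sorted(os.listdir(source_dir)):
        src = os.path.join(source_dir, name)
        if not os.path.isfile(src):
            continue  # skip subdirectories
        ok = check(src)
        shutil.move(src, os.path.join(good_dir if ok else bad_dir, name))
        (good if ok else bad).append(name)
    return good, bad
```

A test can then create a few files, call sort_files once, and compare the contents of all three directories with what it expects.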

On day five wrap your previous efforts in a loop that runs forever.
It seems to work? You are not done. Write tests that demonstrate that it does 
work. Try to think of the corner cases: what if there are no new files on 
one iteration? What if there is a new file with the same name as a previous 
one? Don't handwave, describe the reaction of your script in plain English 
with all the gory details and then translate into Python.
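
One of those corner cases, a new file with the same name as an earlier one, can be handled in several ways. The sketch below shows just one option, picking a fresh name in the destination directory; the function name unique_target and the "name.N.ext" scheme are assumptions, and whether overwriting, renaming or aborting is correct depends on the spec.

```python
import os


def unique_target(dest_dir, name):
    """Return a path in dest_dir that does not exist yet.

    'report.txt' becomes 'report.1.txt', 'report.2.txt', ... on collisions.
    """
    target = os.path.join(dest_dir, name)
    stem, ext = os.path.splitext(name)
    counter = 1
    while os.path.exists(target):
        target = os.path.join(dest_dir, "%s.%d%s" % (stem, counter, ext))
        counter += 1
    return target
```

This is not safe if several processes write to the same directory at once, since another process could create the file between the existence check and the move. For a single polling script that is usually acceptable.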

Eureka!

Note that a "day" in the above outline can be 15 minutes or a week. You're 
done when you're done. Also note that the amount of code you have written 
bears no indication of how close you are to your goal. An experienced 
programmer will end up with less code than you to fulfill the same spec. 
That shouldn't bother you. On the other hand you should remove code that 
doesn't contribute to your script's goal immediately. If you keep failed 
attempts and side tracks your code will become harder and harder to 
maintain, and that's the last thing you need when it's already hard to get 
the necessary parts right. 

Good luck!



#88482

From: Saran A <ahlusar.ahluwalia@gmail.com>
Date: 2015-04-03 06:40 -0700
Message-ID: <149a7683-15a0-4fa6-882f-4ff8982e9e0c@googlegroups.com>
In reply to: #88476
On Friday, April 3, 2015 at 6:46:21 AM UTC-4, Peter Otten wrote:
> Saran A wrote:
> 
> > I debugged and rewrote everything. Here is the full version. Feel free to
> > tear this apart. The homework assignment is not due until tomorrow, so I
> > am currently also experimenting with pyinotify as well.
> 
> Saran, try to make a realistic assessment of your capability. Your 
> "debugged" code still has this gem
> 
> >    while True:
> >        time.sleep(1) #time between update check
> 
> and you are likely to miss your deadline anyway.
> 
> While the inotify approach may be the "right way" to attack this problem 
> from the point of view of an experienced developer like Chris, as a newbie 
> you should stick to the simplest thing that can possibly work.
> 
> In a previous post I mentioned unit tests, but even an ad-hoc test in the 
> interactive interpreter will show that your file_len() function doesn't work
> 
> > #returns length of file
> > def file_len(f):
> >     with open(f) as f:
> >         for i, l in enumerate(f):
> >             pass
> >             return i + 1
> > 
> 
> $ python
> Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
> [GCC 4.8.2] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> #returns length of file
> ... def file_len(f):
> ...     with open(f) as f:
> ...         for i, l in enumerate(f):
> ...             pass
> ...             return i + 1
> ... 
> >>> with open("tmp.txt", "w") as f: f.write("abcde")
> ... 
> >>> file_len("tmp.txt")
> 1
> 
> It reports size 1 for every file. 
> 
> A simple unit test would look like this (assuming watchscript.py is the name 
> of the script to be tested):
> 
> import unittest
> 
> from watchscript import file_len
> 
> class FileLen(unittest.TestCase):
>     def test_five(self):
>         LEN = 5
>         FILENAME = "tmp.txt"
>         with open(FILENAME, "w") as f:
>             f.write("*" * LEN)
> 
>         self.assertEqual(file_len(FILENAME), 5)
>         
> if __name__ == "__main__":
>     unittest.main()
> 
> Typically for every tested function you add a new TestCase subclass and for 
> every test you perform for that function you add a test_...() method to that 
> class. The unittest.main() will collect these methods, run them and generate 
> a report. For the above example it will complain:
> 
> $ python test_watchscript.py 
> F
> ======================================================================
> FAIL: test_five (__main__.FileLen)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "test_watchscript.py", line 12, in test_five
>     self.assertEqual(file_len(FILENAME), 5)
> AssertionError: 1 != 5
> 
> ----------------------------------------------------------------------
> Ran 1 test in 0.001s
> 
> FAILED (failures=1)
> 
> 
> The file_len() function is but an example of the many problems with your 
> code. There is no shortcut: you have to decide for every function what 
> exactly it should do and then verify that it actually does what it is meant 
> to do. Try to negotiate an extra week or two with your supervisor and then 
> start small:
> 
> On day one make a script that moves all files from one directory to another.
> Make sure that the source and destination directory are only stored in one 
> place in your script so that they can easily be replaced. Run the script 
> after every change and devise tests that demonstrate that it works 
> correctly.
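
The day-one step could look roughly like this. It is only a sketch, assuming Python 3; the directory names "incoming" and "processed" and the function name move_all_files() are made up for illustration.

```python
import os
import shutil

# Hypothetical directory names, defined in exactly one place.
SOURCE_DIR = "incoming"
DEST_DIR = "processed"


def move_all_files(source_dir, dest_dir):
    """Move every regular file from source_dir to dest_dir.

    Returns the sorted list of file names that were moved.
    """
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if os.path.isfile(path):  # leave subdirectories alone
            shutil.move(path, os.path.join(dest_dir, name))
            moved.append(name)
    return moved

# Typical call: move_all_files(SOURCE_DIR, DEST_DIR)
```

Because the function takes both directories as parameters, a test can point it at temporary directories instead of the real ones.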
> 
> On day two make a script that checks all files in a directory. Put the 
> checks into a function that takes a file path as its only parameter.
> Write a few correct and a few broken files and test that only the correct 
> ones are recognized as correct, and only the broken ones are flagged as 
> broken, and that all files in the directory are checked.
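
A sketch of the day-two step, assuming Python 3. The actual checks are not spelled out in this post, so the rule used here (file is non-empty and ends with a newline) is purely a placeholder, and is_valid_file() and check_directory() are hypothetical names.

```python
import os


def is_valid_file(path):
    """Placeholder check: the file is non-empty and ends with a newline."""
    with open(path, "rb") as f:
        data = f.read()
    return len(data) > 0 and data.endswith(b"\n")


def check_directory(directory):
    """Return a dict mapping each regular file name in directory to True/False."""
    results = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            results[name] = is_valid_file(path)
    return results
```

Returning a dict makes it easy to assert both that the verdicts are right and that no file in the directory was skipped.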
> 
> On day three make a script that combines the above and only moves the 
> correct files into another directory. Devise tests to verify that both 
> directories contain the expected files before and after your script is 
> executed.
> 
> On day four make a script that also moves the bad files into another 
> directory. Modify your tests from the previous day to check the contents of 
> all three directories.
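
Days three and four combined might be sketched as follows, assuming Python 3. The check used here (file is non-empty) is a stand-in for the real rules, and sort_files() is a hypothetical name. The good and bad directories are assumed to live outside the source directory.

```python
import os
import shutil


def is_valid_file(path):
    """Placeholder check (an assumption): the file is non-empty."""
    return os.path.getsize(path) > 0


def sort_files(source_dir, good_dir, bad_dir, check=is_valid_file):
    """Move each regular file in source_dir to good_dir or bad_dir.

    Returns two sorted lists: names moved to good_dir, names moved to bad_dir.
    """
    os.makedirs(good_dir, exist_ok=True)
    os.makedirs(bad_dir, exist_ok=True)
    good, bad = [], []
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if not os.path.isfile(path):
            continue  # ignore subdirectories
        if check(path):
            shutil.move(path, os.path.join(good_dir, name))
            good.append(name)
        else:
            shutil.move(path, os.path.join(bad_dir, name))
            bad.append(name)
    return good, bad
```

Passing the check in as a parameter lets a test substitute a trivial rule and concentrate on whether the files end up in the right directories.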
> 
> On day five wrap your previous efforts in a loop that runs forever.
> It seems to work? You are not done. Write tests that demonstrate that it does 
> work. Try to think of the corner cases: what if there are no new files on 
> one iteration? What if there is a new file with the same name as a previous 
> one? Don't handwave; describe the reaction of your script in plain English 
> with all the gory details and then translate into Python.
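
One possible answer to those two corner cases, as a sketch assuming Python 3. All names are hypothetical, and renaming a duplicate to "name_1.ext" is just one policy among several (overwriting or rejecting the file would be others). The max_iterations parameter exists only so that a test can stop the loop.

```python
import os
import shutil
import time


def unique_destination(dest_dir, name):
    """Return a path in dest_dir that does not exist yet.

    If name is taken, try name_1, name_2, ... (keeping the extension).
    """
    base, ext = os.path.splitext(name)
    candidate = os.path.join(dest_dir, name)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(dest_dir, "%s_%d%s" % (base, counter, ext))
        counter += 1
    return candidate


def process_once(source_dir, dest_dir):
    """Move every regular file out of source_dir; return how many were moved."""
    moved = 0
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if os.path.isfile(path):
            shutil.move(path, unique_destination(dest_dir, name))
            moved += 1
    return moved


def watch(source_dir, dest_dir, interval=1.0, max_iterations=None):
    """Poll source_dir forever, or max_iterations times when that is given."""
    os.makedirs(dest_dir, exist_ok=True)
    iteration = 0
    while max_iterations is None or iteration < max_iterations:
        process_once(source_dir, dest_dir)  # no new files: nothing happens
        time.sleep(interval)
        iteration += 1
```

In plain English: an iteration with no new files moves nothing and just sleeps; a new file whose name already exists in the destination gets a numeric suffix instead of replacing the earlier one.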
> 
> Eureka!
> 
> Note that a "day" in the above outline can be 15 minutes or a week. You're 
> done when you're done. Also note that the amount of code you have written 
> gives no indication of how close you are to your goal. An experienced 
> programmer will end up with less code than you to fulfill the same spec. 
> That shouldn't bother you. On the other hand, you should immediately remove 
> code that doesn't contribute to your script's goal. If you keep failed 
> attempts and side tracks, your code will become harder and harder to 
> maintain, and that's the last thing you need when it's already hard to get 
> the necessary parts right. 
> 
> Good luck!

@Peter: Thank you for emphasizing unit testing. I will be sure to be more mindful of this moving forward. FYI, I addressed those "gems" :) It's amazing what some sleep can do.
