continue vs. pass in this IO reading and writing

Started by	kbtyo <ahlusar.ahluwalia@gmail.com>
First post	2015-09-03 08:05 -0700
Last post	2015-09-03 18:37 +0200
Articles	10 — 4 participants

  continue vs. pass in this IO reading and writing kbtyo <ahlusar.ahluwalia@gmail.com> - 2015-09-03 08:05 -0700
    Re: continue vs. pass in this IO reading and writing Chris Angelico <rosuav@gmail.com> - 2015-09-04 01:27 +1000
      Re: continue vs. pass in this IO reading and writing kbtyo <ahlusar.ahluwalia@gmail.com> - 2015-09-03 08:38 -0700
        Re: continue vs. pass in this IO reading and writing Chris Angelico <rosuav@gmail.com> - 2015-09-04 01:51 +1000
          Re: continue vs. pass in this IO reading and writing kbtyo <ahlusar.ahluwalia@gmail.com> - 2015-09-03 08:57 -0700
            Re: continue vs. pass in this IO reading and writing Chris Angelico <rosuav@gmail.com> - 2015-09-04 02:11 +1000
              Re: continue vs. pass in this IO reading and writing kbtyo <ahlusar.ahluwalia@gmail.com> - 2015-09-03 09:35 -0700
      Re: continue vs. pass in this IO reading and writing kbtyo <ahlusar.ahluwalia@gmail.com> - 2015-09-03 08:49 -0700
    Re: continue vs. pass in this IO reading and writing Terry Reedy <tjreedy@udel.edu> - 2015-09-03 12:05 -0400
    Re: continue vs. pass in this IO reading and writing Luca Menegotto <otlucaDELETE@DELETEyahoo.it> - 2015-09-03 18:37 +0200

#95938 — continue vs. pass in this IO reading and writing

From	kbtyo <ahlusar.ahluwalia@gmail.com>
Date	2015-09-03 08:05 -0700
Subject	continue vs. pass in this IO reading and writing
Message-ID	<19ca6361-95fe-4a5d-84d6-c72d7941745c@googlegroups.com>

Good Morning:

I am experimenting with exception handling and with continue vs. pass. After poring over a lot of material on SO and other forums, I am still unclear on the difference between them when setting variables and calling functions within multiple "for" loops.

Specifically, I understand that the general format in the case of pass and using else is the following:

try:
    doSomething()
except Exception:
    pass
else:
    stuffDoneIf()
    TryClauseSucceeds()

However, I am uncertain as to how this executes in a context like this:

import glob
import csv
from collections import OrderedDict

interesting_files = glob.glob("*.csv") 

header_saved = False
with open('merged_output_mod.csv','w') as fout:

    for filename in interesting_files:
        print("execution here again")
        with open(filename) as fin:
            try:
                header = next(fin)
                print("Entering Try and Except")
            except:
                StopIteration
                continue
            else:
                if not header_saved:
                    fout.write(header)
                    header_saved = True
                    print("We got here")
                for line in fin:
                    fout.write(line)

My questions are (for some reason my interpreter does not print out any readout):

1. After the exception is raised, does the continue jump back to the beginning of the for loop (so that the "else" clause is never reached)?

2. How would a pass behave in this situation?

Thanks for your feedback. 

Sincerely,

Saran

#95942

From	Chris Angelico <rosuav@gmail.com>
Date	2015-09-04 01:27 +1000
Message-ID	<mailman.70.1441294042.8327.python-list@python.org>
In reply to	#95938

On Fri, Sep 4, 2015 at 1:05 AM, kbtyo <ahlusar.ahluwalia@gmail.com> wrote:
> However, I am uncertain as to how this executes in a context like this:
>
> import glob
> import csv
> from collections import OrderedDict
>
> interesting_files = glob.glob("*.csv")
>
> header_saved = False
> with open('merged_output_mod.csv','w') as fout:
>
>     for filename in interesting_files:
>         print("execution here again")
>         with open(filename) as fin:
>             try:
>                 header = next(fin)
>                 print("Entering Try and Except")
>             except:
>                 StopIteration
>                 continue

I think what you want here is:

except StopIteration:
    continue

The code you have will catch _any_ exception, and then look up the
name StopIteration (and discard it).
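
A minimal sketch of that difference. The two helper functions are hypothetical and only exist for illustration; they are not part of the original script:

```python
def first_line_bare(lines):
    """Mimics the original handler: bare except plus a stray name."""
    try:
        return next(lines)
    except:            # catches every exception, not only StopIteration
        StopIteration  # just evaluates the name and throws the result away
        return "swallowed"


def first_line_specific(lines):
    """Handles only the exception that an exhausted iterator raises."""
    try:
        return next(lines)
    except StopIteration:
        return "empty"


print(first_line_bare(iter([])))      # exhausted iterator: handled
print(first_line_bare(42))            # next(42) raises TypeError: also handled
print(first_line_specific(iter([])))  # exhausted iterator: handled
# first_line_specific(42) would let the TypeError propagate to the caller.
```

With the bare except, an unrelated bug (here, passing something that is not an iterator) is silently treated like an empty file.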

>             else:
>                 if not header_saved:
>                     fout.write(header)
>                     header_saved = True
>                     print("We got here")
>                 for line in fin:
>                     fout.write(line)
>
> My questions are (for some reason my interpreter does not print out any readout):
>
> 1. after the exception is raised does the continue return back up to the beginning of the for loop (and the "else" conditional is not even encountered)?
>
> 2. How would a pass behave in this situation?

The continue statement means "skip the rest of this loop's body and go
to the next iteration of the loop, if there is one". In this case,
there's no further body, so it's going to be the same as "pass" (which
means "do nothing").
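
The two only differ once something follows the try statement inside the loop. A rough sketch, assuming in-memory stand-ins for the files (merge_lines, the sample names and the "processed" list are made up for this illustration):

```python
import io


def merge_lines(sources, skip_with_continue):
    """Merge (name, text) pairs; the first header is kept, later ones dropped."""
    out = []
    processed = []
    header_saved = False
    for name, text in sources:
        fin = io.StringIO(text)
        try:
            header = next(fin)
        except StopIteration:
            if skip_with_continue:
                continue  # jump straight to the next (name, text) pair
            # otherwise fall through, which is what 'pass' would do
        else:
            if not header_saved:
                out.append(header)
                header_saved = True
            out.extend(fin)
        processed.append(name)  # a statement after the try statement
    return out, processed


sources = [("a.csv", "x,y\n1,2\n"), ("empty.csv", ""), ("b.csv", "x,y\n3,4\n")]
print(merge_lines(sources, skip_with_continue=True))
print(merge_lines(sources, skip_with_continue=False))
```

The merged lines are the same in both runs; only the trailing statement behaves differently, because continue skips it for the empty input and pass does not.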

For the rest, I think your code should be broadly functional. Of
course, it assumes that your files all have compatible headers, but
presumably you know that that's safe.

ChrisA

#95944

From	kbtyo <ahlusar.ahluwalia@gmail.com>
Date	2015-09-03 08:38 -0700
Message-ID	<53c5301c-2833-446f-a7d1-6c0ef9314928@googlegroups.com>
In reply to	#95942

On Thursday, September 3, 2015 at 11:27:58 AM UTC-4, Chris Angelico wrote:
> On Fri, Sep 4, 2015 at 1:05 AM, kbtyo <ahlusar.ahluwalia@gmail.com> wrote:
> > However, I am uncertain as to how this executes in a context like this:
> >
> > import glob
> > import csv
> > from collections import OrderedDict
> >
> > interesting_files = glob.glob("*.csv")
> >
> > header_saved = False
> > with open('merged_output_mod.csv','w') as fout:
> >
> >     for filename in interesting_files:
> >         print("execution here again")
> >         with open(filename) as fin:
> >             try:
> >                 header = next(fin)
> >                 print("Entering Try and Except")
> >             except:
> >                 StopIteration
> >                 continue
> 
> I think what you want here is:
> 
> except StopIteration:
>     continue
> 
> The code you have will catch _any_ exception, and then look up the
> name StopIteration (and discard it).
> 
> >             else:
> >                 if not header_saved:
> >                     fout.write(header)
> >                     header_saved = True
> >                     print("We got here")
> >                 for line in fin:
> >                     fout.write(line)
> >
> > My questions are (for some reason my interpreter does not print out any readout):
> >
> > 1. after the exception is raised does the continue return back up to the beginning of the for loop (and the "else" conditional is not even encountered)?
> >
> > 2. How would a pass behave in this situation?
> 
> The continue statement means "skip the rest of this loop's body and go
> to the next iteration of the loop, if there is one". In this case,
> there's no further body, so it's going to be the same as "pass" (which
> means "do nothing").
> 
> For the rest, I think your code should be broadly functional. Of
> course, it assumes that your files all have compatible headers, but
> presumably you know that that's safe.
> 
> ChrisA

Hi ChrisA:

Thank you for the elaboration. So, what I hear you saying is that (citing, "In this case, there's no further body, so it's going to be the same as "pass" (which 
means "do nothing")") that the else block is not entered. For exma

Do you mind elaborating on what you meant by "compatible headers"? The files that I am processing may or may not have the same headers (if they do, only the respective values should be added).

#95948

From	Chris Angelico <rosuav@gmail.com>
Date	2015-09-04 01:51 +1000
Message-ID	<mailman.75.1441295521.8327.python-list@python.org>
In reply to	#95944

On Fri, Sep 4, 2015 at 1:38 AM, kbtyo <ahlusar.ahluwalia@gmail.com> wrote:
> Thank you for the elaboration. So, what I hear you saying is that (citing, "In this case, there's no further body, so it's going to be the same as "pass" (which
> means "do nothing")") that the else block is not entered. For exma

Seems like a cut-off paragraph here, but yes. In a try/except/else
block, the 'else' block executes only if the 'try' didn't raise an
exception of the specified type(s).
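
A small sketch of the three possible paths (classify is a hypothetical helper, not something from your script):

```python
def classify(value):
    """Report which clause of the try statement ran for 10 // value."""
    try:
        result = 10 // value
    except ZeroDivisionError:
        return "except ran"
    else:
        return "else ran, result=%d" % result


print(classify(2))  # no exception: the else clause runs
print(classify(0))  # ZeroDivisionError: the except clause runs, else is skipped
# classify("x") raises TypeError, which is not listed in the except clause:
# neither clause runs and the exception propagates to the caller.
```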

> Do you mind elaborating on what you meant by "compatible headers?". The files that I am processing may or may not have the same headers (but if they do they should add the respective values only).
>

Your algorithm is basically: Take the entire first file, including its
header, and then append all other files after skipping their first
lines. If you want a smarter form of CSV merge, I would recommend
using the 'csv' module, and probably doing a quick check of all files
before you begin, so as to collect up the full set of headers. That'll
also save you the hassle of playing around with StopIteration as you
read in the headers.
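
Something along these lines, as an untested-against-your-data sketch. It uses in-memory strings instead of real files so it can be run as-is; merge_csv and the sample data are made up, and with real files you would open each filename (with newline="") in both passes instead:

```python
import csv
import io


def merge_csv(texts, fout):
    """Two passes: collect the union of headers, then stream the rows."""
    # Pass 1: union of all field names, in first-seen order.
    fieldnames = []
    for text in texts:
        reader = csv.DictReader(io.StringIO(text))
        for name in reader.fieldnames or []:  # fieldnames is None if empty
            if name not in fieldnames:
                fieldnames.append(name)

    # Pass 2: write every row; columns a file lacks are left blank.
    writer = csv.DictWriter(fout, fieldnames=fieldnames, restval="")
    writer.writeheader()
    for text in texts:
        writer.writerows(csv.DictReader(io.StringIO(text)))


out = io.StringIO()
merge_csv(["a,b\n1,2\n", "", "b,c\n3,4\n"], out)
print(out.getvalue())
```

Because the rows are written as they are read in the second pass, nothing beyond the header list has to be held in memory, at the cost of reading every input twice.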

ChrisA

#95951

From	kbtyo <ahlusar.ahluwalia@gmail.com>
Date	2015-09-03 08:57 -0700
Message-ID	<652bbe97-aef5-41dc-8f4c-cfa47bcfd120@googlegroups.com>
In reply to	#95948

On Thursday, September 3, 2015 at 11:52:16 AM UTC-4, Chris Angelico wrote:
> On Fri, Sep 4, 2015 at 1:38 AM, kbtyo wrote:
> > Thank you for the elaboration. So, what I hear you saying is that (citing, "In this case, there's no further body, so it's going to be the same as "pass" (which
> > means "do nothing")") that the else block is not entered. For exma
> 
> Seems like a cut-off paragraph here, but yes. In a try/except/else
> block, the 'else' block executes only if the 'try' didn't raise an
> exception of the specified type(s).
> 
> > Do you mind elaborating on what you meant by "compatible headers?". The files that I am processing may or may not have the same headers (but if they do they should add the respective values only).
> >
> 
> Your algorithm is basically: Take the entire first file, including its
> header, and then append all other files after skipping their first
> lines. If you want a smarter form of CSV merge, I would recommend
> using the 'csv' module, and probably doing a quick check of all files
> before you begin, so as to collect up the full set of headers. That'll
> also save you the hassle of playing around with StopIteration as you
> read in the headers.
> 
> ChrisA


I have files that may have different headers. If they are different, they should be appended (along with their values). If there are duplicate headers, then their values should just be added. 

I have used the csv and collections modules. For some reason, when I apply this algorithm, not all of my data ends up in the output (the output is ridiculously small considering how much goes in - think KB output vs MB input):

from glob import iglob
import csv
from collections import OrderedDict

files = sorted(iglob('*.csv'))
header = OrderedDict()
data = []

for filename in files:
    with open(filename, 'r') as fin:
        csvin = csv.DictReader(fin)
        header.update(OrderedDict.fromkeys(csvin.fieldnames))
        data.append(next(csvin))

with open('output_filename_version2.csv', 'w') as fout:
    csvout = csv.DictWriter(fout, fieldnames=list(header))
    csvout.writeheader()
    csvout.writerows(data)

#95953

From	Chris Angelico <rosuav@gmail.com>
Date	2015-09-04 02:11 +1000
Message-ID	<mailman.80.1441296690.8327.python-list@python.org>
In reply to	#95951

On Fri, Sep 4, 2015 at 1:57 AM, kbtyo <ahlusar.ahluwalia@gmail.com> wrote:
> I have used CSV and collections. For some reason when I apply this algorithm, all of my files are not added (the output is ridiculously small considering how much goes in - think KB output vs MB input):
>
> from glob import iglob
> import csv
> from collections import OrderedDict
>
> files = sorted(iglob('*.csv'))
> header = OrderedDict()
> data = []
>
> for filename in files:
>     with open(filename, 'r') as fin:
>         csvin = csv.DictReader(fin)
>         header.update(OrderedDict.fromkeys(csvin.fieldnames))
>         data.append(next(csvin))
>
> with open('output_filename_version2.csv', 'w') as fout:
>     csvout = csv.DictWriter(fout, fieldnames=list(header))
>     csvout.writeheader()
>     csvout.writerows(data)

You're collecting up just one row from each file. Since you say your
input is measured in MB (not GB or anything bigger), the simplest
approach is probably fine: instead of "data.append(next(csvin))", just
use "data.extend(csvin)", which should grab them all. That'll store
all your input data in memory, which should be fine if it's only a few
meg, and probably not a problem for anything under a few hundred meg.
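
To see the difference on a tiny in-memory sample (the sample text and variable names are made up for illustration):

```python
import csv
import io

text = "a,b\n1,2\n3,4\n5,6\n"

# What the posted loop does: take only the first data row.
first_only = []
first_only.append(next(csv.DictReader(io.StringIO(text))))

# What extend() does: consume the reader and keep every data row.
all_rows = []
all_rows.extend(csv.DictReader(io.StringIO(text)))

print(len(first_only), len(all_rows))  # 1 3
```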

ChrisA

#95960

From	kbtyo <ahlusar.ahluwalia@gmail.com>
Date	2015-09-03 09:35 -0700
Message-ID	<6abd12de-2b10-4cc9-86db-a2f144ecc56a@googlegroups.com>
In reply to	#95953

On Thursday, September 3, 2015 at 12:12:04 PM UTC-4, Chris Angelico wrote:
> On Fri, Sep 4, 2015 at 1:57 AM, kbtyo <ahlusar.ahluwalia@gmail.com> wrote:
> > I have used CSV and collections. For some reason when I apply this algorithm, all of my files are not added (the output is ridiculously small considering how much goes in - think KB output vs MB input):
> >
> > from glob import iglob
> > import csv
> > from collections import OrderedDict
> >
> > files = sorted(iglob('*.csv'))
> > header = OrderedDict()
> > data = []
> >
> > for filename in files:
> >     with open(filename, 'r') as fin:
> >         csvin = csv.DictReader(fin)
> >         header.update(OrderedDict.fromkeys(csvin.fieldnames))
> >         data.append(next(csvin))
> >
> > with open('output_filename_version2.csv', 'w') as fout:
> >     csvout = csv.DictWriter(fout, fieldnames=list(header))
> >     csvout.writeheader()
> >     csvout.writerows(data)
> 
> You're collecting up just one row from each file. Since you say your
> input is measured in MB (not GB or anything bigger), the simplest
> approach is probably fine: instead of "data.append(next(csvin))", just
> use "data.extend(csvin)", which should grab them all. That'll store
> all your input data in memory, which should be fine if it's only a few
> meg, and probably not a problem for anything under a few hundred meg.
> 
> ChrisA

Hmmmm - good point. However, I may have to deal with larger files, but thank you for the tip. 

I am also wondering, based on what you stated, you are only "collecting up just one row from each file"....

I am fulfilling this, correct? 

"I have files that may have different headers. If they are different, they should be appended (along with their values) into the output. If there are duplicate headers, then their values should just be added sequentially."

I am wondering whether DictReader skips empty rows by default, and whether that could also be causing the other rows to be dropped.
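
A quick check I could run on that question, using a made-up in-memory sample. As far as I can tell, DictReader skips rows that are completely empty, but that only drops the blank lines themselves and would not explain the missing data rows:

```python
import csv
import io

# Sample with blank lines after the header and between data rows.
text = "a,b\n\n1,2\n\n3,4\n"

rows = list(csv.DictReader(io.StringIO(text)))
print(len(rows))  # 2: the blank lines are skipped, the data rows are kept
for row in rows:
    print(row["a"], row["b"])
```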

#95946

From	kbtyo <ahlusar.ahluwalia@gmail.com>
Date	2015-09-03 08:49 -0700
Message-ID	<3be8f66b-38b8-4e32-ae8e-4ac613560677@googlegroups.com>
In reply to	#95942

On Thursday, September 3, 2015 at 11:27:58 AM UTC-4, Chris Angelico wrote:
> On Fri, Sep 4, 2015 at 1:05 AM, kbtyo wrote:
> > However, I am uncertain as to how this executes in a context like this:
> >
> > import glob
> > import csv
> > from collections import OrderedDict
> >
> > interesting_files = glob.glob("*.csv")
> >
> > header_saved = False
> > with open('merged_output_mod.csv','w') as fout:
> >
> >     for filename in interesting_files:
> >         print("execution here again")
> >         with open(filename) as fin:
> >             try:
> >                 header = next(fin)
> >                 print("Entering Try and Except")
> >             except:
> >                 StopIteration
> >                 continue
> 
> I think what you want here is:
> 
> except StopIteration:
>     continue
> 
> The code you have will catch _any_ exception, and then look up the
> name StopIteration (and discard it).
> 
> >             else:
> >                 if not header_saved:
> >                     fout.write(header)
> >                     header_saved = True
> >                     print("We got here")
> >                 for line in fin:
> >                     fout.write(line)
> >
> > My questions are (for some reason my interpreter does not print out any readout):
> >
> > 1. after the exception is raised does the continue return back up to the beginning of the for loop (and the "else" conditional is not even encountered)?
> >
> > 2. How would a pass behave in this situation?
> 
> The continue statement means "skip the rest of this loop's body and go
> to the next iteration of the loop, if there is one". In this case,
> there's no further body, so it's going to be the same as "pass" (which
> means "do nothing").


So what I hear you saying is that I am not entering the "else" block? Hence, when each file is read, the rest of the suite is not applied - specifically,

   if not header_saved:
       fout.write(header)
       header_saved = True
       print("We got here")

> 
> For the rest, I think your code should be broadly functional. Of
> course, it assumes that your files all have compatible headers, but
> presumably you know that that's safe.
> 
> ChrisA

Would you mind elaborating on what you meant by "compatible headers"? I have files that may have different headers. If they are different, they should be appended (along with their values). If there are duplicate headers, then their values should just be added.

#95952

From	Terry Reedy <tjreedy@udel.edu>
Date	2015-09-03 12:05 -0400
Message-ID	<mailman.79.1441296357.8327.python-list@python.org>
In reply to	#95938

On 9/3/2015 11:05 AM, kbtyo wrote:

> I am experimenting with many exception handling and utilizing continue vs pass.

'pass' is a do-nothing placeholder.  'continue' and 'break' are jump
statements.

[snip]

> However, I am uncertain as to how this executes in a context like this:
>
> import glob
> import csv
> from collections import OrderedDict
>
> interesting_files = glob.glob("*.csv")
>
> header_saved = False
> with open('merged_output_mod.csv','w') as fout:
>
>      for filename in interesting_files:
>          print("execution here again")
>          with open(filename) as fin:
>              try:
>                  header = next(fin)
>                  print("Entering Try and Except")
>              except:
>                  StopIteration
>                  continue
>              else:
>                  if not header_saved:
>                      fout.write(header)
>                      header_saved = True
>                      print("We got here")
>                  for line in fin:
>                      fout.write(line)
>
> My questions are (for some reason my interpreter does not print out any readout):
>
> 1. after the exception is raised does the continue return back up to the beginning of the for loop (and the "else" conditional is not even encountered)?
>
> 2. How would a pass behave in this situation?

Try it for yourself.  Copy the following into a Python shell or editor,
run it, and see what you get.

for i in [-1, 0, 1]:
    try:
        j = 2//i
    except ZeroDivisionError:
        print('infinity')
        continue
    else:
        print(j)

Change 'continue' to 'pass' and run again.
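
As posted, the loop body ends with the try statement, so both versions should print the same thing. A variation that makes the difference visible (the run helper and the trailing "after" marker are additions for illustration only):

```python
def run(use_continue):
    """Return a log of what the loop did, with a statement after the try."""
    log = []
    for i in [-1, 0, 1]:
        try:
            j = 2 // i
        except ZeroDivisionError:
            log.append('infinity')
            if use_continue:
                continue  # skips the log.append('after') below
            # falling through here is equivalent to 'pass'
        else:
            log.append(j)
        log.append('after')
    return log


print(run(True))   # [-2, 'after', 'infinity', 2, 'after']
print(run(False))  # [-2, 'after', 'infinity', 'after', 2, 'after']
```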

-- 
Terry Jan Reedy

#95961

From	Luca Menegotto <otlucaDELETE@DELETEyahoo.it>
Date	2015-09-03 18:37 +0200
Message-ID	<ms9t12$ubn$1@speranza.aioe.org>
In reply to	#95938

Il 03/09/2015 17:05, kbtyo ha scritto:

> I am experimenting with many exception handling and utilizing
> continue vs pass. After pouring over a lot of material on SO
> and other forums I am still unclear as to the difference when
> setting variables and applying functions within multiple "for"
> loops.

'pass' and 'continue' have two very different meanings.

'pass' means 'don't do anything'; it's useful when you _have_ to put a
statement but you _don't need_ it to do anything.
You can use it anywhere you want, with no other damage than adding a
little weight to your code.

A stupid example:

if i == 0:
    pass
else:
    do_something()


'continue', to be used in a loop (for or while), means 'ignore the rest
of the loop body and go immediately to the next iteration'. The statement
refers to the nearest enclosing loop; so, if you have two nested loops, it
refers to the inner one. Another stupid example:

for i in range(10):
    for j in range(10):
        if j < 5:
            continue
        do_something(i, j)  # called only if j >= 5
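
A smaller runnable version of the same idea, recording the pairs in a list instead of calling do_something (the list name and the loop bounds are chosen just for this sketch):

```python
calls = []
for i in range(3):
    for j in range(4):
        if j < 2:
            continue  # next j; the outer loop over i is not affected
        calls.append((i, j))

print(calls)  # [(0, 2), (0, 3), (1, 2), (1, 3), (2, 2), (2, 3)]
```

Every value of i still appears, which shows that the continue only skipped iterations of the inner loop.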

-- 
Ciao!
Luca
