Groups > comp.lang.python > #95938 > unrolled thread
| Started by | kbtyo <ahlusar.ahluwalia@gmail.com> |
|---|---|
| First post | 2015-09-03 08:05 -0700 |
| Last post | 2015-09-03 18:37 +0200 |
| Articles | 10 — 4 participants |
continue vs. pass in this IO reading and writing kbtyo <ahlusar.ahluwalia@gmail.com> - 2015-09-03 08:05 -0700
Re: continue vs. pass in this IO reading and writing Chris Angelico <rosuav@gmail.com> - 2015-09-04 01:27 +1000
Re: continue vs. pass in this IO reading and writing kbtyo <ahlusar.ahluwalia@gmail.com> - 2015-09-03 08:38 -0700
Re: continue vs. pass in this IO reading and writing Chris Angelico <rosuav@gmail.com> - 2015-09-04 01:51 +1000
Re: continue vs. pass in this IO reading and writing kbtyo <ahlusar.ahluwalia@gmail.com> - 2015-09-03 08:57 -0700
Re: continue vs. pass in this IO reading and writing Chris Angelico <rosuav@gmail.com> - 2015-09-04 02:11 +1000
Re: continue vs. pass in this IO reading and writing kbtyo <ahlusar.ahluwalia@gmail.com> - 2015-09-03 09:35 -0700
Re: continue vs. pass in this IO reading and writing kbtyo <ahlusar.ahluwalia@gmail.com> - 2015-09-03 08:49 -0700
Re: continue vs. pass in this IO reading and writing Terry Reedy <tjreedy@udel.edu> - 2015-09-03 12:05 -0400
Re: continue vs. pass in this IO reading and writing Luca Menegotto <otlucaDELETE@DELETEyahoo.it> - 2015-09-03 18:37 +0200
| From | kbtyo <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-09-03 08:05 -0700 |
| Subject | continue vs. pass in this IO reading and writing |
| Message-ID | <19ca6361-95fe-4a5d-84d6-c72d7941745c@googlegroups.com> |
Good Morning:
I am experimenting with exception handling and with using continue vs pass. After poring over a lot of material on SO and other forums, I am still unclear as to the difference when setting variables and applying functions within multiple "for" loops.
Specifically, I understand that the general format in the case of pass and using else is the following:
try:
    doSomething()
except Exception:
    pass
else:
    stuffDoneIf()
    TryClauseSucceeds()
However, I am uncertain as to how this executes in a context like this:
import glob
import csv
from collections import OrderedDict
interesting_files = glob.glob("*.csv")
header_saved = False
with open('merged_output_mod.csv','w') as fout:
    for filename in interesting_files:
        print("execution here again")
        with open(filename) as fin:
            try:
                header = next(fin)
                print("Entering Try and Except")
            except:
                StopIteration
                continue
            else:
                if not header_saved:
                    fout.write(header)
                    header_saved = True
                    print("We got here")
                for line in fin:
                    fout.write(line)
My questions are (for some reason my interpreter does not print out any readout):
1. after the exception is raised does the continue return back up to the beginning of the for loop (and the "else" conditional is not even encountered)?
2. How would a pass behave in this situation?
Thanks for your feedback.
Sincerely,
Saran
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-09-04 01:27 +1000 |
| Message-ID | <mailman.70.1441294042.8327.python-list@python.org> |
| In reply to | #95938 |
On Fri, Sep 4, 2015 at 1:05 AM, kbtyo <ahlusar.ahluwalia@gmail.com> wrote:
> However, I am uncertain as to how this executes in a context like this:
>
> import glob
> import csv
> from collections import OrderedDict
>
> interesting_files = glob.glob("*.csv")
>
> header_saved = False
> with open('merged_output_mod.csv','w') as fout:
>
>     for filename in interesting_files:
>         print("execution here again")
>         with open(filename) as fin:
>             try:
>                 header = next(fin)
>                 print("Entering Try and Except")
>             except:
>                 StopIteration
>                 continue
I think what you want here is:
    except StopIteration:
        continue
The code you have will catch _any_ exception, and then look up the
name StopIteration (and discard it).
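A minimal sketch of that difference, using two hypothetical helper functions (`first_line_bare` and `first_line_specific` are illustrative names, not from the thread). The bare handler swallows an unrelated `TypeError`, while the specific handler lets it surface:

```python
def first_line_bare(lines):
    """Mimic the original handler: bare except, StopIteration as a no-op expression."""
    try:
        return next(lines)
    except:  # bare except: catches every exception, not just StopIteration
        StopIteration  # the name is looked up and discarded; it filters nothing
        return None


def first_line_specific(lines):
    """Catch only StopIteration, as suggested in the reply."""
    try:
        return next(lines)
    except StopIteration:
        return None


not_an_iterator = 42  # hypothetical bad input: next() raises TypeError on it

bare_on_empty = first_line_bare(iter([]))        # None: StopIteration caught
bare_on_bad = first_line_bare(not_an_iterator)   # None: TypeError silently hidden

try:
    first_line_specific(not_an_iterator)
    specific_raised = False
except TypeError:
    specific_raised = True  # the unrelated error is no longer swallowed
```

With the bare form, a genuine bug (wrong argument type, I/O error, and so on) would look exactly like an empty file.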
>             else:
>                 if not header_saved:
>                     fout.write(header)
>                     header_saved = True
>                     print("We got here")
>                 for line in fin:
>                     fout.write(line)
>
> My questions are (for some reason my interpreter does not print out any readout):
>
> 1. after the exception is raised does the continue return back up to the beginning of the for loop (and the "else" conditional is not even encountered)?
>
> 2. How would a pass behave in this situation?
The continue statement means "skip the rest of this loop's body and go
to the next iteration of the loop, if there is one". In this case,
there's no further body, so it's going to be the same as "pass" (which
means "do nothing").
For the rest, I think your code should be broadly functional. Of
course, it assumes that your files all have compatible headers, but
presumably you know that that's safe.
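The loop under discussion can be tried without touching the file system. The sketch below is an assumption-laden stand-in: `io.StringIO` objects replace the real files, and `merge` is a hypothetical name. It shows that an empty input triggers `continue`, and that later headers are dropped without being compared to the first one:

```python
import io


def merge(sources):
    """Concatenate CSV-like text sources, keeping only the first header line."""
    out = io.StringIO()
    header_saved = False
    for fin in sources:
        try:
            header = next(fin)
        except StopIteration:
            continue  # empty source: go straight to the next one
        else:
            if not header_saved:
                out.write(header)
                header_saved = True
            for line in fin:  # copy the remaining lines unchanged
                out.write(line)
    return out.getvalue()


# Hypothetical inputs: one empty "file" and two with the same header.
sources = [io.StringIO(""), io.StringIO("a,b\n1,2\n"), io.StringIO("a,b\n3,4\n")]
merged = merge(sources)
```

If the second input had a different header, its rows would still be appended under the first header, which is the "compatible headers" assumption mentioned above.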
ChrisA
| From | kbtyo <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-09-03 08:38 -0700 |
| Message-ID | <53c5301c-2833-446f-a7d1-6c0ef9314928@googlegroups.com> |
| In reply to | #95942 |
On Thursday, September 3, 2015 at 11:27:58 AM UTC-4, Chris Angelico wrote:
> On Fri, Sep 4, 2015 at 1:05 AM, kbtyo <ahlusar.ahluwalia@gmail.com> wrote:
> > However, I am uncertain as to how this executes in a context like this:
> >
> > import glob
> > import csv
> > from collections import OrderedDict
> >
> > interesting_files = glob.glob("*.csv")
> >
> > header_saved = False
> > with open('merged_output_mod.csv','w') as fout:
> >
> >     for filename in interesting_files:
> >         print("execution here again")
> >         with open(filename) as fin:
> >             try:
> >                 header = next(fin)
> >                 print("Entering Try and Except")
> >             except:
> >                 StopIteration
> >                 continue
>
> I think what you want here is:
>
>     except StopIteration:
>         continue
>
> The code you have will catch _any_ exception, and then look up the
> name StopIteration (and discard it).
>
> >             else:
> >                 if not header_saved:
> >                     fout.write(header)
> >                     header_saved = True
> >                     print("We got here")
> >                 for line in fin:
> >                     fout.write(line)
> >
> > My questions are (for some reason my interpreter does not print out any readout):
> >
> > 1. after the exception is raised does the continue return back up to the beginning of the for loop (and the "else" conditional is not even encountered)?
> >
> > 2. How would a pass behave in this situation?
>
> The continue statement means "skip the rest of this loop's body and go
> to the next iteration of the loop, if there is one". In this case,
> there's no further body, so it's going to be the same as "pass" (which
> means "do nothing").
>
> For the rest, I think your code should be broadly functional. Of
> course, it assumes that your files all have compatible headers, but
> presumably you know that that's safe.
>
> ChrisA
Hi ChrisA:
Thank you for the elaboration. So, what I hear you saying is that (citing, "In this case, there's no further body, so it's going to be the same as "pass" (which
means "do nothing")") that the else block is not entered. For exma
Do you mind elaborating on what you meant by "compatible headers?". The files that I am processing may or may not have the same headers (but if they do they should add the respective values only).
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-09-04 01:51 +1000 |
| Message-ID | <mailman.75.1441295521.8327.python-list@python.org> |
| In reply to | #95944 |
On Fri, Sep 4, 2015 at 1:38 AM, kbtyo <ahlusar.ahluwalia@gmail.com> wrote:
> Thank you for the elaboration. So, what I hear you saying is that (citing, "In this case, there's no further body, so it's going to be the same as "pass" (which
> means "do nothing")") that the else block is not entered. For exma
Seems like a cut-off paragraph here, but yes. In a try/except/else
block, the 'else' block executes only if the 'try' didn't raise an
exception of the specified type(s).
> Do you mind elaborating on what you meant by "compatible headers?". The files that I am processing may or may not have the same headers (but if they do they should add the respective values only).
>
Your algorithm is basically: Take the entire first file, including its
header, and then append all other files after skipping their first
lines. If you want a smarter form of CSV merge, I would recommend
using the 'csv' module, and probably doing a quick check of all files
before you begin, so as to collect up the full set of headers. That'll
also save you the hassle of playing around with StopIteration as you
read in the headers.
ChrisA
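The try/except/else behaviour can be checked with a small sketch (`classify` is a hypothetical helper). One detail worth noting: the 'else' suite runs only when the 'try' suite finishes without raising anything at all; an exception that is not listed in the except clause simply propagates, and 'else' is skipped in that case too.

```python
def classify(value):
    """Return which clauses of a try/except/else ran for a given input."""
    ran = []
    try:
        ran.append("try")
        result = 10 // value
    except ZeroDivisionError:
        ran.append("except")
    else:
        ran.append("else")  # reached only when the try suite raised nothing
    return ran


no_error = classify(5)   # try suite succeeds, so else runs
caught = classify(0)     # ZeroDivisionError is handled, else is skipped

try:
    classify("x")        # TypeError is not listed in the except clause
    uncaught_propagated = False
except TypeError:
    uncaught_propagated = True
```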
| From | kbtyo <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-09-03 08:57 -0700 |
| Message-ID | <652bbe97-aef5-41dc-8f4c-cfa47bcfd120@googlegroups.com> |
| In reply to | #95948 |
On Thursday, September 3, 2015 at 11:52:16 AM UTC-4, Chris Angelico wrote:
> On Fri, Sep 4, 2015 at 1:38 AM, kbtyo wrote:
> > Thank you for the elaboration. So, what I hear you saying is that (citing, "In this case, there's no further body, so it's going to be the same as "pass" (which
> > means "do nothing")") that the else block is not entered. For exma
>
> Seems like a cut-off paragraph here, but yes. In a try/except/else
> block, the 'else' block executes only if the 'try' didn't raise an
> exception of the specified type(s).
>
> > Do you mind elaborating on what you meant by "compatible headers?". The files that I am processing may or may not have the same headers (but if they do they should add the respective values only).
> >
>
> Your algorithm is basically: Take the entire first file, including its
> header, and then append all other files after skipping their first
> lines. If you want a smarter form of CSV merge, I would recommend
> using the 'csv' module, and probably doing a quick check of all files
> before you begin, so as to collect up the full set of headers. That'll
> also save you the hassle of playing around with StopIteration as you
> read in the headers.
>
> ChrisA
I have files that may have different headers. If they are different, they should be appended (along with their values). If there are duplicate headers, then their values should just be added.
I have used CSV and collections. For some reason when I apply this algorithm, all of my files are not added (the output is ridiculously small considering how much goes in - think KB output vs MB input):
from glob import iglob
import csv
from collections import OrderedDict
files = sorted(iglob('*.csv'))
header = OrderedDict()
data = []
for filename in files:
    with open(filename, 'r') as fin:
        csvin = csv.DictReader(fin)
        header.update(OrderedDict.fromkeys(csvin.fieldnames))
        data.append(next(csvin))
with open('output_filename_version2.csv', 'w') as fout:
    csvout = csv.DictWriter(fout, fieldnames=list(header))
    csvout.writeheader()
    csvout.writerows(data)
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-09-04 02:11 +1000 |
| Message-ID | <mailman.80.1441296690.8327.python-list@python.org> |
| In reply to | #95951 |
On Fri, Sep 4, 2015 at 1:57 AM, kbtyo <ahlusar.ahluwalia@gmail.com> wrote:
> I have used CSV and collections. For some reason when I apply this algorithm, all of my files are not added (the output is ridiculously small considering how much goes in - think KB output vs MB input):
>
> from glob import iglob
> import csv
> from collections import OrderedDict
>
> files = sorted(iglob('*.csv'))
> header = OrderedDict()
> data = []
>
> for filename in files:
>     with open(filename, 'r') as fin:
>         csvin = csv.DictReader(fin)
>         header.update(OrderedDict.fromkeys(csvin.fieldnames))
>         data.append(next(csvin))
>
> with open('output_filename_version2.csv', 'w') as fout:
>     csvout = csv.DictWriter(fout, fieldnames=list(header))
>     csvout.writeheader()
>     csvout.writerows(data)
You're collecting up just one row from each file. Since you say your
input is measured in MB (not GB or anything bigger), the simplest
approach is probably fine: instead of "data.append(next(csvin))", just
use "data.extend(csvin)", which should grab them all. That'll store
all your input data in memory, which should be fine if it's only a few
meg, and probably not a problem for anything under a few hundred meg.
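A sketch of the suggested change, assuming Python 3 (for the `newline=""` argument to `open`). The function name `merge_all_rows` and the sample files written to a temporary directory are hypothetical, made up for this demonstration:

```python
import csv
import os
import tempfile
from collections import OrderedDict


def merge_all_rows(paths, out_path):
    """Merge CSV files, keeping every row and the union of all headers."""
    header = OrderedDict()
    data = []
    for path in paths:
        with open(path, newline="") as fin:
            csvin = csv.DictReader(fin)
            # fieldnames is None for an empty file, hence the "or []"
            header.update(OrderedDict.fromkeys(csvin.fieldnames or []))
            data.extend(csvin)  # all remaining rows, not just the first
    with open(out_path, "w", newline="") as fout:
        csvout = csv.DictWriter(fout, fieldnames=list(header))
        csvout.writeheader()
        csvout.writerows(data)  # columns missing from a row are written as ""
    return len(data)


# Hypothetical sample files, created in a temporary directory for the demo.
with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "a.csv")
    second = os.path.join(tmp, "b.csv")
    with open(first, "w", newline="") as f:
        f.write("id,name\n1,x\n2,y\n")
    with open(second, "w", newline="") as f:
        f.write("id,score\n3,9\n")
    merged_path = os.path.join(tmp, "merged.csv")
    row_count = merge_all_rows([first, second], merged_path)
    with open(merged_path, newline="") as f:
        merged_rows = list(csv.reader(f))
```

Everything is held in the `data` list before writing, so memory use grows with the total input size.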
ChrisA
| From | kbtyo <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-09-03 09:35 -0700 |
| Message-ID | <6abd12de-2b10-4cc9-86db-a2f144ecc56a@googlegroups.com> |
| In reply to | #95953 |
On Thursday, September 3, 2015 at 12:12:04 PM UTC-4, Chris Angelico wrote:
> On Fri, Sep 4, 2015 at 1:57 AM, kbtyo <ahlusar.ahluwalia@gmail.com> wrote:
> > I have used CSV and collections. For some reason when I apply this algorithm, all of my files are not added (the output is ridiculously small considering how much goes in - think KB output vs MB input):
> >
> > from glob import iglob
> > import csv
> > from collections import OrderedDict
> >
> > files = sorted(iglob('*.csv'))
> > header = OrderedDict()
> > data = []
> >
> > for filename in files:
> >     with open(filename, 'r') as fin:
> >         csvin = csv.DictReader(fin)
> >         header.update(OrderedDict.fromkeys(csvin.fieldnames))
> >         data.append(next(csvin))
> >
> > with open('output_filename_version2.csv', 'w') as fout:
> >     csvout = csv.DictWriter(fout, fieldnames=list(header))
> >     csvout.writeheader()
> >     csvout.writerows(data)
>
> You're collecting up just one row from each file. Since you say your
> input is measured in MB (not GB or anything bigger), the simplest
> approach is probably fine: instead of "data.append(next(csvin))", just
> use "data.extend(csvin)", which should grab them all. That'll store
> all your input data in memory, which should be fine if it's only a few
> meg, and probably not a problem for anything under a few hundred meg.
>
> ChrisA
Hmmmm - good point. However, I may have to deal with larger files, but thank you for the tip.
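For inputs too large to hold in memory, one possible approach (not proposed in the thread itself, so treat it as an assumption) is to read every file twice: once for the headers only, then again to copy rows one at a time. In this sketch `merge_streaming` and `open_source` are hypothetical names; for files on disk `open_source` could be something like `lambda path: open(path, newline="")`.

```python
import csv
import io


def merge_streaming(open_source, names, fout):
    """Two-pass merge: collect headers first, then copy rows one at a time."""
    fieldnames = []
    for name in names:  # pass 1: read only the header row of each source
        with open_source(name) as fin:
            for field in csv.DictReader(fin).fieldnames or []:
                if field not in fieldnames:
                    fieldnames.append(field)
    csvout = csv.DictWriter(fout, fieldnames=fieldnames)
    csvout.writeheader()
    for name in names:  # pass 2: stream rows without storing them
        with open_source(name) as fin:
            for row in csv.DictReader(fin):
                csvout.writerow(row)


# Hypothetical in-memory sources standing in for real files.
samples = {"a.csv": "id,name\n1,x\n2,y\n", "b.csv": "id,score\n3,9\n"}
out = io.StringIO()
merge_streaming(lambda name: io.StringIO(samples[name]), ["a.csv", "b.csv"], out)
merged_lines = out.getvalue().splitlines()
```

The trade-off is that each input is opened twice, in exchange for holding only one row in memory at a time.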
I am also wondering, based on what you stated, you are only "collecting up just one row from each file"....
I am fulfilling this, correct?
"I have files that may have different headers. If they are different, they should be appended (along with their values) into the output. If there are duplicate headers, then their values should just be added sequentially."
I am also wondering whether DictReader skips empty rows by default, and whether that could be affecting the other rows as well.
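Both points can be checked on a small in-memory sample (the data below is made up for illustration). `csv.DictReader` does skip completely blank lines, but that only drops the blank line itself; the reason a single row per file ends up in `data` is that `next(csvin)` returns exactly one row.

```python
import csv
import io

# Hypothetical data with one blank line between the two data rows.
sample = io.StringIO("id,name\n1,x\n\n2,y\n")
csvin = csv.DictReader(sample)

first_row = next(csvin)   # only the first data row
remaining = list(csvin)   # everything that next() did not consume
```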
| From | kbtyo <ahlusar.ahluwalia@gmail.com> |
|---|---|
| Date | 2015-09-03 08:49 -0700 |
| Message-ID | <3be8f66b-38b8-4e32-ae8e-4ac613560677@googlegroups.com> |
| In reply to | #95942 |
On Thursday, September 3, 2015 at 11:27:58 AM UTC-4, Chris Angelico wrote:
> On Fri, Sep 4, 2015 at 1:05 AM, kbtyo wrote:
> > However, I am uncertain as to how this executes in a context like this:
> >
> > import glob
> > import csv
> > from collections import OrderedDict
> >
> > interesting_files = glob.glob("*.csv")
> >
> > header_saved = False
> > with open('merged_output_mod.csv','w') as fout:
> >
> >     for filename in interesting_files:
> >         print("execution here again")
> >         with open(filename) as fin:
> >             try:
> >                 header = next(fin)
> >                 print("Entering Try and Except")
> >             except:
> >                 StopIteration
> >                 continue
>
> I think what you want here is:
>
>     except StopIteration:
>         continue
>
> The code you have will catch _any_ exception, and then look up the
> name StopIteration (and discard it).
>
> >             else:
> >                 if not header_saved:
> >                     fout.write(header)
> >                     header_saved = True
> >                     print("We got here")
> >                 for line in fin:
> >                     fout.write(line)
> >
> > My questions are (for some reason my interpreter does not print out any readout):
> >
> > 1. after the exception is raised does the continue return back up to the beginning of the for loop (and the "else" conditional is not even encountered)?
> >
> > 2. How would a pass behave in this situation?
>
> The continue statement means "skip the rest of this loop's body and go
> to the next iteration of the loop, if there is one". In this case,
> there's no further body, so it's going to be the same as "pass" (which
> means "do nothing").
So what I hear you saying is that I am not entering the "else" block? Hence, when each file is read, the rest of the suite is not applied - specifically,
if not header_saved:
    fout.write(header)
    header_saved = True
    print("We got here")
>
> For the rest, I think your code should be broadly functional. Of
> course, it assumes that your files all have compatible headers, but
> presumably you know that that's safe.
>
> ChrisA
Would you mind elaborating on what you meant by "compatible headers"? I have files that may have different headers. If they are different, they should be appended (along with their values). If there are duplicate headers, then their values should just be added.
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2015-09-03 12:05 -0400 |
| Message-ID | <mailman.79.1441296357.8327.python-list@python.org> |
| In reply to | #95938 |
On 9/3/2015 11:05 AM, kbtyo wrote:
> I am experimenting with exception handling and with using continue vs pass.
'pass' is a do-nothing place holder. 'continue' and 'break' are jump
statements.
[snip]
> However, I am uncertain as to how this executes in a context like this:
>
> import glob
> import csv
> from collections import OrderedDict
>
> interesting_files = glob.glob("*.csv")
>
> header_saved = False
> with open('merged_output_mod.csv','w') as fout:
>
>     for filename in interesting_files:
>         print("execution here again")
>         with open(filename) as fin:
>             try:
>                 header = next(fin)
>                 print("Entering Try and Except")
>             except:
>                 StopIteration
>                 continue
>             else:
>                 if not header_saved:
>                     fout.write(header)
>                     header_saved = True
>                     print("We got here")
>                 for line in fin:
>                     fout.write(line)
>
> My questions are (for some reason my interpreter does not print out any readout):
>
> 1. after the exception is raised does the continue return back up to the beginning of the for loop (and the "else" conditional is not even encountered)?
>
> 2. How would a pass behave in this situation?
Try it for yourself. Copy the following into a Python shell or editor
(and run it) to see what you get.
for i in [-1, 0, 1]:
    try:
        j = 2//i
    except ZeroDivisionError:
        print('infinity')
        continue
    else:
        print(j)
Change 'continue' to 'pass' and run again.
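As written, nothing follows the try statement inside the loop, so both variants should print the same three lines. The difference only becomes visible once there is a statement after the try statement. The sketch below is a variation on the experiment, not the original code: `run` is a hypothetical helper, a list replaces the print calls, and one trailing statement is added.

```python
def run(handler):
    """Run the loop with an extra statement after the try statement."""
    log = []
    for i in [-1, 0, 1]:
        try:
            j = 2 // i
        except ZeroDivisionError:
            log.append("infinity")
            if handler == "continue":
                continue  # jump to the next i; the trailing statement is skipped
            # otherwise fall through, which is what 'pass' would do
        else:
            log.append(j)
        log.append("after try")  # the trailing statement that exposes the difference
    return log


with_continue = run("continue")
with_pass = run("pass")
```

With 'continue', the trailing statement is skipped for i == 0; with 'pass', it runs for every value of i.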
--
Terry Jan Reedy
| From | Luca Menegotto <otlucaDELETE@DELETEyahoo.it> |
|---|---|
| Date | 2015-09-03 18:37 +0200 |
| Message-ID | <ms9t12$ubn$1@speranza.aioe.org> |
| In reply to | #95938 |
Il 03/09/2015 17:05, kbtyo ha scritto:
> I am experimenting with exception handling and with using
> continue vs pass. After poring over a lot of material on SO
> and other forums I am still unclear as to the difference when
> setting variables and applying functions within multiple "for"
> loops.
'pass' and 'continue' have two very different meanings.
'pass' means 'don't do anything'; it's useful when you _have_ to put a
statement but you _don't_need_ to put a statement.
You can use it everywhere you want, with no other damage than adding a
little weight to your code.
A stupid example:
if i == 0:
    pass
else:
    do_something()
'continue', to be used in a loop (for or while) means 'ignore the rest
of the code and go immediately to the next iteration'. The statement
refers to the nearest loop; so, if you have two nested loops, it refers
to the inner one; another stupid example:
for i in range(10):
    for j in range(10):
        if j < 5: continue
        do_something(i, j) # called only if j >= 5
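A runnable variant of the nested-loop example, with smaller ranges and a list in place of the undefined `do_something` (both changes are assumptions made to keep the output short):

```python
calls = []
for i in range(2):
    for j in range(4):
        if j < 2:
            continue          # skips to the next j; the outer i loop is unaffected
        calls.append((i, j))  # reached only when j >= 2
```

Every value of i still appears in `calls`, which shows that 'continue' acted on the inner loop only.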
--
Ciao!
Luca