Groups > comp.lang.python > #64797 > unrolled thread
| Started by | matt.s.marotta@gmail.com |
|---|---|
| First post | 2014-01-26 13:46 -0800 |
| Last post | 2014-01-26 19:07 -0800 |
| Articles | 12 — 6 participants |
Unwanted Spaces and Iterative Loop matt.s.marotta@gmail.com - 2014-01-26 13:46 -0800
Re: Unwanted Spaces and Iterative Loop Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-01-26 22:20 +0000
Re: Unwanted Spaces and Iterative Loop MRAB <python@mrabarnett.plus.com> - 2014-01-26 23:28 +0000
Re: Unwanted Spaces and Iterative Loop Jason Friedman <jsf80238@gmail.com> - 2014-01-26 16:44 -0700
Re: Unwanted Spaces and Iterative Loop matt.s.marotta@gmail.com - 2014-01-26 15:56 -0800
Re: Unwanted Spaces and Iterative Loop Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-01-27 00:40 +0000
Re: Unwanted Spaces and Iterative Loop matt.s.marotta@gmail.com - 2014-01-26 17:15 -0800
Re: Unwanted Spaces and Iterative Loop Chris Angelico <rosuav@gmail.com> - 2014-01-27 12:56 +1100
Re: Unwanted Spaces and Iterative Loop matt.s.marotta@gmail.com - 2014-01-26 17:58 -0800
Re: Unwanted Spaces and Iterative Loop Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-01-27 13:32 +0000
Re: Unwanted Spaces and Iterative Loop Jason Friedman <jsf80238@gmail.com> - 2014-01-26 19:00 -0700
Re: Unwanted Spaces and Iterative Loop matt.s.marotta@gmail.com - 2014-01-26 19:07 -0800
| From | matt.s.marotta@gmail.com |
|---|---|
| Date | 2014-01-26 13:46 -0800 |
| Subject | Unwanted Spaces and Iterative Loop |
| Message-ID | <988fec60-228a-4427-b07e-b4327c7e02ae@googlegroups.com> |
I have been working on a python script that separates mailing addresses into different components.
Here is my code:
inFile = "directory"
outFile = "directory"
inHandler = open(inFile, 'r')
outHandler = open(outFile, 'w')
outHandler.write("FarmID\tAddress\tStreetNum\tStreetName\tSufType\tDir\tCity\tProvince\tPostalCode")
for line in inHandler:
    str = line.replace("FarmID\tAddress", " ")
    outHandler.write(str[0:-1])
    str = str.replace(" ","\t", 1)
    str = str.replace(" Rd,","\tRd\t\t")
    str = str.replace(" Rd","\tRd\t")
    str = str.replace("Ave,","\tAve\t\t")
    str = str.replace("Ave ","\tAve\t\t")
    str = str.replace("St ","\tSt\t\t")
    str = str.replace("St,","\tSt\t\t")
    str = str.replace("Dr,","\tDr\t\t")
    str = str.replace("Lane,","\tLane\t\t")
    str = str.replace("Pky,","\tPky\t\t")
    str = str.replace(" Sq,","\tSq\t\t")
    str = str.replace(" Pl,","\tPl\t\t")
    str = str.replace("\tE,","E\t")
    str = str.replace("\tN,","N\t")
    str = str.replace("\tS,","S\t")
    str = str.replace("\tW,","W\t")
    str = str.replace(",","\t")
    str = str.replace(" ON","ON\t")
    outHandler.write(str)
inHandler.close()
The text file that this manipulates has 91 addresses, so I'll just paste 5 of them in here to get the idea:
FarmID Address
1 1067 Niagara Stone Rd, Niagara-On-The-Lake, ON L0S 1J0
2 4260 Mountainview Rd, Lincoln, ON L0R 1B2
3 25 Hunter Rd, Grimsby, ON L3M 4A3
4 1091 Hutchinson Rd, Haldimand, ON N0A 1K0
My issue is that in the output file, there is a space before each city and each postal code that I do not want there.
Furthermore, the FarmID is being added on to the end of the postal code under the original address column for each address. This also is not supposed to be happening, and I am having trouble designing an iterative loop to remove/prevent that from happening.
Any help is greatly appreciated!
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2014-01-26 22:20 +0000 |
| Message-ID | <mailman.6004.1390774848.18130.python-list@python.org> |
| In reply to | #64797 |
On 26/01/2014 21:46, matt.s.marotta@gmail.com wrote:
> I have been working on a python script that separates mailing addresses into different components.
>
> Here is my code:
>
> inFile = "directory"
> outFile = "directory"
> inHandler = open(inFile, 'r')
> outHandler = open(outFile, 'w')
> outHandler.write("FarmID\tAddress\tStreetNum\tStreetName\tSufType\tDir\tCity\tProvince\tPostalCode")
> for line in inHandler:
> str = line.replace("FarmID\tAddress", " ")
> outHandler.write(str[0:-1])
>
> str = str.replace(" ","\t", 1)
> str = str.replace(" Rd,","\tRd\t\t")
> str = str.replace(" Rd","\tRd\t")
> str = str.replace("Ave,","\tAve\t\t")
> str = str.replace("Ave ","\tAve\t\t")
> str = str.replace("St ","\tSt\t\t")
> str = str.replace("St,","\tSt\t\t")
> str = str.replace("Dr,","\tDr\t\t")
> str = str.replace("Lane,","\tLane\t\t")
> str = str.replace("Pky,","\tPky\t\t")
> str = str.replace(" Sq,","\tSq\t\t")
> str = str.replace(" Pl,","\tPl\t\t")
>
> str = str.replace("\tE,","E\t")
> str = str.replace("\tN,","N\t")
> str = str.replace("\tS,","S\t")
> str = str.replace("\tW,","W\t")
> str = str.replace(",","\t")
> str = str.replace(" ON","ON\t")
>
>
> outHandler.write(str)
> inHandler.close()
>
> The text file that this manipulates has 91 addresses, so I'll just paste 5 of them in here to get the idea:
>
> FarmID Address
> 1 1067 Niagara Stone Rd, Niagara-On-The-Lake, ON L0S 1J0
> 2 4260 Mountainview Rd, Lincoln, ON L0R 1B2
> 3 25 Hunter Rd, Grimsby, ON L3M 4A3
> 4 1091 Hutchinson Rd, Haldimand, ON N0A 1K0
>
> My issue is that in the output file, there is a space before each city and each postal code that I do not want there.
>
> Furthermore, the FarmID is being added on to the end of the postal code under the original address column for each address. This also is not supposed to be happening, and I am having trouble designing an iterative loop to remove/prevent that from happening.
>
> Any help is greatly appreciated!
>
Make your life easier by using the csv module to read and write your
data, writing with the excel-tab dialect; see
http://docs.python.org/3/library/csv.html#module-csv
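A minimal sketch of the csv suggestion, assuming Python 3. The helper name `write_rows` and the hand-split sample row are made up for illustration; `csv.writer` and the built-in `excel-tab` dialect are part of the csv module. An in-memory buffer stands in for the real output file.

```python
import csv
import io

# Column names taken from the header string in the posted script.
HEADER = ["FarmID", "Address", "StreetNum", "StreetName", "SufType",
          "Dir", "City", "Province", "PostalCode"]

def write_rows(handle, rows):
    """Write the header and the given rows as tab-separated text."""
    writer = csv.writer(handle, dialect="excel-tab")
    writer.writerow(HEADER)
    writer.writerows(rows)

# Hypothetical row: the first sample address, split by hand.
sample = ["1", "1067 Niagara Stone Rd, Niagara-On-The-Lake, ON L0S 1J0",
          "1067", "Niagara Stone", "Rd", "", "Niagara-On-The-Lake",
          "ON", "L0S 1J0"]

buffer = io.StringIO()
write_rows(buffer, [sample])
print(buffer.getvalue())
```

With a real file, open it with `newline=""` so that the csv module controls the line endings itself.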
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence
| From | MRAB <python@mrabarnett.plus.com> |
|---|---|
| Date | 2014-01-26 23:28 +0000 |
| Message-ID | <mailman.6006.1390778925.18130.python-list@python.org> |
| In reply to | #64797 |
On 2014-01-26 21:46, matt.s.marotta@gmail.com wrote:
> I have been working on a python script that separates mailing addresses into different components.
>
> Here is my code:
>
> inFile = "directory"
> outFile = "directory"
> inHandler = open(inFile, 'r')
> outHandler = open(outFile, 'w')
Shouldn't you be writing a '\n' at the end of the line?
> outHandler.write("FarmID\tAddress\tStreetNum\tStreetName\tSufType\tDir\tCity\tProvince\tPostalCode")
> for line in inHandler:
This is being done on every single line of the file:
> str = line.replace("FarmID\tAddress", " ")
> outHandler.write(str[0:-1])
>
> str = str.replace(" ","\t", 1)
> str = str.replace(" Rd,","\tRd\t\t")
> str = str.replace(" Rd","\tRd\t")
> str = str.replace("Ave,","\tAve\t\t")
> str = str.replace("Ave ","\tAve\t\t")
> str = str.replace("St ","\tSt\t\t")
> str = str.replace("St,","\tSt\t\t")
> str = str.replace("Dr,","\tDr\t\t")
> str = str.replace("Lane,","\tLane\t\t")
> str = str.replace("Pky,","\tPky\t\t")
> str = str.replace(" Sq,","\tSq\t\t")
> str = str.replace(" Pl,","\tPl\t\t")
>
> str = str.replace("\tE,","E\t")
> str = str.replace("\tN,","N\t")
> str = str.replace("\tS,","S\t")
> str = str.replace("\tW,","W\t")
> str = str.replace(",","\t")
> str = str.replace(" ON","ON\t")
>
>
> outHandler.write(str)
> inHandler.close()
>
> The text file that this manipulates has 91 addresses, so I'll just paste 5 of them in here to get the idea:
>
> FarmID Address
> 1 1067 Niagara Stone Rd, Niagara-On-The-Lake, ON L0S 1J0
> 2 4260 Mountainview Rd, Lincoln, ON L0R 1B2
> 3 25 Hunter Rd, Grimsby, ON L3M 4A3
> 4 1091 Hutchinson Rd, Haldimand, ON N0A 1K0
>
> My issue is that in the output file, there is a space before each city and each postal code that I do not want there.
>
You could try splitting on '\t', stripping the leading and trailing
whitespace on each part, and then joining them together again with
'\t'. (Make sure that you also write the '\n' at the end of line.)
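The split, strip and join steps could look roughly like this. The function name `clean_fields` and the sample line are illustrative only; the sample is shaped like the script's output, with a stray space before the city and the postal code.

```python
def clean_fields(line):
    """Strip stray whitespace around each tab-separated field."""
    parts = line.rstrip("\n").split("\t")
    parts = [part.strip() for part in parts]
    # Put the newline back, since it was removed before splitting.
    return "\t".join(parts) + "\n"

# Hypothetical line with the unwanted leading spaces.
messy = "1\t1067\tNiagara Stone\tRd\t\t Niagara-On-The-Lake\tON\t L0S 1J0\n"
print(repr(clean_fields(messy)))
```

Empty fields (such as the missing direction) survive the round trip, so the columns stay aligned.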
> Furthermore, the FarmID is being added on to the end of the postal code under the original address column for each address. This also is not supposed to be happening, and I am having trouble designing an iterative loop to remove/prevent that from happening.
>
> Any help is greatly appreciated!
>
As Mark said, you could also use the CSV module.
| From | Jason Friedman <jsf80238@gmail.com> |
|---|---|
| Date | 2014-01-26 16:44 -0700 |
| Message-ID | <mailman.6007.1390779864.18130.python-list@python.org> |
| In reply to | #64797 |
>
>
> outHandler.write("FarmID\tAddress\tStreetNum\tStreetName\tSufType\tDir\tCity\tProvince\tPostalCode")
>
> ...
> FarmID Address
> 1 1067 Niagara Stone Rd, Niagara-On-The-Lake, ON L0S 1J0
> 2 4260 Mountainview Rd, Lincoln, ON L0R 1B2
> 3 25 Hunter Rd, Grimsby, ON L3M 4A3
> 4 1091 Hutchinson Rd, Haldimand, ON N0A 1K0
>
>
You are wanting to produce tab-separated output, with an "Address" field
plus the Address split into fields for Street Number, Street Name, Suffix
Type, Direction?
The four lines you have pasted are examples of your input? If yes,
"Direction" is a single letter?
| From | matt.s.marotta@gmail.com |
|---|---|
| Date | 2014-01-26 15:56 -0800 |
| Message-ID | <2fa1fd26-7a0c-4718-ae39-3fbed8537df3@googlegroups.com> |
| In reply to | #64802 |
On Sunday, 26 January 2014 18:44:16 UTC-5, Jason Friedman wrote:
> outHandler.write("FarmID\tAddress\tStreetNum\tStreetName\tSufType\tDir\tCity\tProvince\tPostalCode")
>
>
>
> ...
>
> FarmID Address
>
> 1 1067 Niagara Stone Rd, Niagara-On-The-Lake, ON L0S 1J0
>
> 2 4260 Mountainview Rd, Lincoln, ON L0R 1B2
>
> 3 25 Hunter Rd, Grimsby, ON L3M 4A3
>
> 4 1091 Hutchinson Rd, Haldimand, ON N0A 1K0
>
>
> You are wanting to produce tab-separated output, with an "Address" field plus the Address split into fields for Street Number, Street Name, Suffix Type, Direction?
>
>
>
> The four lines you have pasted are examples of your input? If yes, "Direction" is a single letter?
Yes to your first question. Yes, the four lines I have pasted are examples of input. Direction is a single letter (there are some records that are 'King St. E,'). I have solved the problem with the spaces, but still cannot figure out the iterative loop to get rid of the farm ID in the address column that isn't split.
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-01-27 00:40 +0000 |
| Message-ID | <52e5aafa$0$29999$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #64797 |
On Sun, 26 Jan 2014 13:46:21 -0800, matt.s.marotta wrote:
> I have been working on a python script that separates mailing addresses
> into different components.
>
> Here is my code:
>
> inFile = "directory"
> outFile = "directory"
> inHandler = open(inFile, 'r')
> outHandler = open(outFile, 'w')
Are you *really* opening the same file for reading and writing at the
same time?
Even if your operating system allows that, surely it's not a good idea.
You might get away with it for small files, but at some point you're
going to run into weird, hard-to-diagnose bugs.
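One concrete way this can go wrong, sketched under the assumption that both names really did point at the same path: mode 'w' truncates the file as soon as it is opened, so the reader finds nothing left to read. The file name and contents below are made up, and everything happens in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "addresses.txt")

    # Create a throwaway input file with a header and one record.
    with open(path, "w") as handle:
        handle.write("FarmID\tAddress\n3\t25 Hunter Rd, Grimsby, ON L3M 4A3\n")

    reader = open(path, "r")
    writer = open(path, "w")   # opening with 'w' empties the file immediately
    lines_seen = reader.readlines()
    reader.close()
    writer.close()

print(lines_seen)
```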
> outHandler.write("FarmID\tAddress\tStreetNum\tStreetName\tSufType\tDir\tCity\tProvince\tPostalCode")
This looks like a CSV file using tabs as the separator. You really ought
to use the csv module.
http://docs.python.org/3/library/csv.html
http://docs.python.org/2/library/csv.html
http://pymotw.com/2/csv/
> for line in inHandler:
> str = line.replace("FarmID\tAddress", " ")
> outHandler.write(str[0:-1])
> str = str.replace(" ","\t", 1)
> str = str.replace(" Rd,","\tRd\t\t")
> str = str.replace(" Rd","\tRd\t")
> str = str.replace("Ave,","\tAve\t\t")
> str = str.replace("Ave","\tAve\t\t")
> str = str.replace("St ","\tSt\t\t")
> str = str.replace("St,","\tSt\t\t")
> str = str.replace("Dr,","\tDr\t\t")
[snip additional string manipulations]
> str = str.replace(",","\t")
> str = str.replace(" ON","ON\t")
> outHandler.write(str)
Aiy aiy aiy, what a mess! I get a headache just trying to understand it!
The first question that comes to mind is that you appear to be writing
each input line *twice*, first after a very minimal set of string
manipulations (you convert the literal string "FarmID\tAddress" to a
space, then write the whole line out), the second time after a whole mess
of string replacements. Why?
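The effect of the two writes can be seen on a single record. This is only a sketch: it assumes the FarmID and the address are separated by a tab in the input, it keeps just the first of the replacements, and the function name `original_loop_body` is invented for the illustration.

```python
import io

def original_loop_body(line, out):
    """Reproduce the two writes from the posted loop for a single line."""
    text = line.replace("FarmID\tAddress", " ")
    out.write(text[0:-1])               # first write: raw line, newline dropped
    text = text.replace(" ", "\t", 1)   # remaining replacements omitted here
    out.write(text)                     # second write: starts with the FarmID

out = io.StringIO()
original_loop_body("3\t25 Hunter Rd, Grimsby, ON L3M 4A3\n", out)
print(repr(out.getvalue()))
```

Because the first write drops the newline and adds no separator, the FarmID that begins the second write lands directly after the postal code of the first.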
If the sample data you show below is accurate, I *think* what you are
trying to do is simply suppress the header line. The first line in the
input file is:
FarmID Address
and rather than write that you want to write a space. I don't know why
you want the output file to begin with a space, but this would be better:
for line in inHandler:
    line = line.strip() # Remove any leading and trailing whitespace,
    # including the trailing newline. Later, we'll add a newline
    # back in.
    if line == "FarmID\tAddress":
        outHandler.write(" ") # Write a mysterious space.
        continue # And skip to the next line.
    # Now process the non-header lines.
Now, as far as the non-header lines go, you do a whole lot of complex string
manipulations, replacing chunks of text with or without tabs or commas with
the same text with or without tabs but in a different order. The logic of
these manipulations completely escapes me: what are you actually trying to
do here?
I *strongly* suggest that you don't try to implement your program logic
in the form of string manipulations. According to your sample data, your
data looks like this:
1 1067 Niagara Stone Rd, Niagara-On-The-Lake, ON L0S 1J0
i.e.
farmId TAB address COMMA district COMMA postcode
It is much better to pull the line apart into named components,
manipulate the components directly, then put it back together in the
order you want. This makes the code more understandable, and easier to
change if you ever need to change things.
for line in inHandler:
    line = line.strip()
    if line == "FarmID\tAddress":
        outHandler.write(" ") # Write a mysterious space.
        continue
    # Now process the non-header lines.
    farmid, address = line.split("\t")
    farmid = farmid.strip()
    address, district, postcode = address.split(",")
    address = address.strip()
    district = district.strip()
    postcode = postcode.strip()
    # Now process the fields however you like.
    parts_of_address = address.split(" ")
    street_number = parts_of_address[0] # first part
    street_type = parts_of_address[-1] # last part
    street_name = parts_of_address[1:-1] # everything else
    street_name = " ".join(street_name)
and so on for the post code. Then, at the very end, assemble the parts
you want to write out, join them with tabs, and write:
    fields = [farmid, street_number, street_name, street_type, ... ]
    outHandler.write("\t".join(fields))
    outHandler.write("\n")
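Put together, the approach might look like the sketch below. It rests on assumptions that are not confirmed by the full data set: every record has exactly two commas, the last comma-separated part is the province followed by the postal code, and a trailing single letter N, S, E or W is a direction. The function name `split_address` is illustrative.

```python
def split_address(line):
    """Split 'farmid TAB street, city, province postcode' into nine fields."""
    farmid, full_address = line.strip().split("\t")
    street, city, region = [part.strip() for part in full_address.split(",")]

    # "ON L0S 1J0" becomes province "ON" and postal code "L0S 1J0".
    province, postal_code = region.split(" ", 1)

    words = street.split(" ")
    street_number = words[0]
    # Assumption: a trailing single letter N/S/E/W is a direction.
    direction = ""
    if len(words) > 2 and words[-1] in ("N", "S", "E", "W"):
        direction = words.pop()
    street_type = words[-1]
    street_name = " ".join(words[1:-1])

    return [farmid, full_address, street_number, street_name,
            street_type, direction, city, province, postal_code]

fields = split_address("1\t1067 Niagara Stone Rd, Niagara-On-The-Lake, ON L0S 1J0\n")
print("\t".join(fields))
```

A record with an extra comma raises a ValueError at the unpacking step, which is arguably better than silently writing misaligned columns.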
Or use the csv module to do the actual writing. It will handle escaping
anything that needs escaping, newlines, tabs, etc.
--
Steven
| From | matt.s.marotta@gmail.com |
|---|---|
| Date | 2014-01-26 17:15 -0800 |
| Message-ID | <0503362d-7f14-441f-9291-75711eddd283@googlegroups.com> |
| In reply to | #64807 |
On Sunday, 26 January 2014 19:40:26 UTC-5, Steven D'Aprano wrote:
> [snip full quote of Steven D'Aprano's message]
I'm not reading and writing to the same file, I just changed the actual paths to directory.
This is for a school assignment, and we haven't been taught any of the stuff you're talking about. Although I appreciate your help, everything needs to stay as is and I just need to create the loop to get rid of the farmID from the end of the postal codes.
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-01-27 12:56 +1100 |
| Message-ID | <mailman.6014.1390787771.18130.python-list@python.org> |
| In reply to | #64811 |
On Mon, Jan 27, 2014 at 12:15 PM, <matt.s.marotta@gmail.com> wrote:
> I'm not reading and writing to the same file, I just changed the actual paths to directory.
For next time, say "directory1" and "directory2" to preserve the fact
that they're different. Though if they're file names, I'd use "file1"
and "file2" - calling them "directory" implies that they are, well,
directories :)
ChrisA
| From | matt.s.marotta@gmail.com |
|---|---|
| Date | 2014-01-26 17:58 -0800 |
| Message-ID | <c3020066-92f1-4d14-9269-7f855718bd98@googlegroups.com> |
| In reply to | #64815 |
On Sunday, 26 January 2014 20:56:01 UTC-5, Chris Angelico wrote:
> On Mon, Jan 27, 2014 at 12:15 PM, <matt.s.marotta@gmail.com> wrote:
>
> > I'm not reading and writing to the same file, I just changed the actual paths to directory.
>
>
>
> For next time, say "directory1" and "directory2" to preserve the fact
>
> that they're different. Though if they're file names, I'd use "file1"
>
> and "file2" - calling them "directory" implies that they are, well,
>
> directories :)
>
>
>
> ChrisA
Thanks, but any chance you could help me out with my question of removing the FarmID from the postal code?
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2014-01-27 13:32 +0000 |
| Message-ID | <mailman.6038.1390829706.18130.python-list@python.org> |
| In reply to | #64816 |
On 27/01/2014 01:58, matt.s.marotta@gmail.com wrote:
> On Sunday, 26 January 2014 20:56:01 UTC-5, Chris Angelico wrote:
>> On Mon, Jan 27, 2014 at 12:15 PM, <matt.s.marotta@gmail.com> wrote:
>>
>>> I'm not reading and writing to the same file, I just changed the actual paths to directory.
>>
>>
>>
>> For next time, say "directory1" and "directory2" to preserve the fact
>>
>> that they're different. Though if they're file names, I'd use "file1"
>>
>> and "file2" - calling them "directory" implies that they are, well,
>>
>> directories :)
>>
>>
>>
>> ChrisA
>
> Thanks, but any chance you could help me out with my question of removing the FarmID from the postal code?
>
Any chance that you could read and action this
https://wiki.python.org/moin/GoogleGroupsPython to prevent us seeing the
double line spacing above?
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence
| From | Jason Friedman <jsf80238@gmail.com> |
|---|---|
| Date | 2014-01-26 19:00 -0700 |
| Message-ID | <mailman.6015.1390788044.18130.python-list@python.org> |
| In reply to | #64811 |
>
> I'm not reading and writing to the same file, I just changed the actual
> paths to directory.
>
> This is for a school assignment, and we haven't been taught any of the
> stuff you're talking about. Although I appreciate your help, everything
> needs to stay as is and I just need to create the loop to get rid of the
> farmID from the end of the postal codes.
> --
> https://mail.python.org/mailman/listinfo/python-list
>
If you are allowed to use if/then this seems to work:
inFile = "data"
outFile = "processed"
inHandler = open(inFile, 'r')
outHandler = open(outFile, 'w')
for line in inHandler:
    if line.startswith("FarmID"):
        outHandler.write("FarmID\tAddress\tStreetNum\tStreetName\tSufType\tDir\tCity\tProvince\tPostalCode\n")
    else:
        line = line.replace(" ","\t", 1)
        line = line.replace(" Rd,","\tRd\t\t")
        line = line.replace(" Rd","\tRd\t")
        line = line.replace("Ave,","\tAve\t\t")
        line = line.replace("Ave ","\tAve\t\t")
        line = line.replace("St ","\tSt\t\t")
        line = line.replace("St,","\tSt\t\t")
        line = line.replace("Dr,","\tDr\t\t")
        line = line.replace("Lane,","\tLane\t\t")
        line = line.replace("Pky,","\tPky\t\t")
        line = line.replace(" Sq,","\tSq\t\t")
        line = line.replace(" Pl,","\tPl\t\t")
        line = line.replace("\tE,","E\t")
        line = line.replace("\tN,","N\t")
        line = line.replace("\tS,","S\t")
        line = line.replace("\tW,","W\t")
        line = line.replace(",","\t")
        line = line.replace(" ON","ON\t")
        outHandler.write(line)
inHandler.close()
| From | matt.s.marotta@gmail.com |
|---|---|
| Date | 2014-01-26 19:07 -0800 |
| Message-ID | <a408da1e-15ad-4f7b-b939-334a28486902@googlegroups.com> |
| In reply to | #64817 |
On Sunday, 26 January 2014 21:00:35 UTC-5, Jason Friedman wrote:
> I'm not reading and writing to the same file, I just changed the actual paths to directory.
>
>
>
> This is for a school assignment, and we haven't been taught any of the stuff you're talking about. Although I appreciate your help, everything needs to stay as is and I just need to create the loop to get rid of the farmID from the end of the postal codes.
>
>
> --
>
> https://mail.python.org/mailman/listinfo/python-list
>
>
>
> If you are allowed to use if/then this seems to work:
>
>
>
> inFile = "data"
>
> outFile = "processed"
> inHandler = open(inFile, 'r')
> outHandler = open(outFile, 'w')
>
> for line in inHandler:
> if line.startswith("FarmID"):
> outHandler.write("FarmID\tAddress\tStreetNum\tStreetName\tSufType\tDir\tCity\tProvince\tPostalCode\n")
>
> else:
> line = line.replace(" ","\t", 1)
> line = line.replace(" Rd,","\tRd\t\t")
>
> line = line.replace(" Rd","\tRd\t")
> line = line.replace("Ave,","\tAve\t\t")
> line = line.replace("Ave ","\tAve\t\t")
>
> line = line.replace("St ","\tSt\t\t")
> line = line.replace("St,","\tSt\t\t")
> line = line.replace("Dr,","\tDr\t\t")
>
> line = line.replace("Lane,","\tLane\t\t")
> line = line.replace("Pky,","\tPky\t\t")
>
> line = line.replace(" Sq,","\tSq\t\t")
> line = line.replace(" Pl,","\tPl\t\t")
>
>
>
> line = line.replace("\tE,","E\t")
> line = line.replace("\tN,","N\t")
> line = line.replace("\tS,","S\t")
>
> line = line.replace("\tW,","W\t")
> line = line.replace(",","\t")
> line = line.replace(" ON","ON\t")
>
>
>
> outHandler.write(line)
> inHandler.close()
Unfortunately this did not work - the columns get messed up and there is no column for the full address.