| From | flebber <flebber.crue@gmail.com> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: fixing an horrific formatted csv file. |
| Date | Fri, 4 Jul 2014 03:48:10 -0700 (PDT) |
| In-Reply-To | <c1mvavFhpj7U1@mid.individual.net> |
| Message-ID | <bee0ef3a-4ec3-4ca6-aa82-dffccf09dc67@googlegroups.com> |
On Friday, 4 July 2014 16:19:09 UTC+10, Gregory Ewing wrote:
> flebber wrote:
> > so in my file I had on line 44 this trainer name.
> >
> > "Michael, Wayne & John Hawkes"
> >
> > and in line 95 this horse name. Inz'n'out
> >
> > this throws off my capturing correct item 9. How do I protect against this?
>
> Use Python's csv module to read the file. Don't try to
> do it yourself; the rules for handling embedded commas
> and quotes in csv are quite complicated. As long as
> the file is a well-formed csv file, the csv module
> should parse fields like that correctly.
>
> --
> Greg
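
A minimal sketch of Greg's point, assuming Python 3. The row layout and the `Rosehill` value are made up for illustration; only the trainer and horse names come from the post above. It compares a plain `split(',')` with `csv.reader` on a line that has a quoted field with embedded commas:

```python
import csv
import io

# Hypothetical row: only the horse and trainer names are from the thread.
line = "Horse,3,Inz'n'out," + '"Michael, Wayne & John Hawkes",Rosehill\n'

# Naive splitting breaks the quoted trainer field at its embedded comma.
naive = line.strip().split(',')

# csv.reader respects the double quotes and leaves the apostrophes alone.
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive))   # 6
print(len(parsed))  # 5
print(parsed[3])    # Michael, Wayne & John Hawkes
```

With the naive split every field after the trainer name shifts by one position, which is the kind of shift that would throw off an index such as item 9.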
True, Greg, using the csv module was much easier. This is what I have now:
import csv
import re

# FILENAME and out_file_name() come from elsewhere in the script;
# they are not shown in this post.


def race_table(text_file):
    """utility to reorganise poorly made csv entry"""
    # input_table = [[item.strip(' "') for item in record.split(',')]
    #                for record in text_file.splitlines()]
    # At this point look at input_table to find the record indices
    # identity = string.maketrans("", "")
    # print(input_table)
    # input_table = [s.translate(identity, ",'") for s
    #                in input_table]
    output_table = []
    for record in text_file:
        if record[0] == 'Meeting':
            meeting = record[3]
        elif record[0] == 'Race':
            date = record[13]
            race = record[1]
        elif record[0] == 'Horse':
            number = record[1]
            name = record[2]
            results = record[9]
            res_split = re.split('[- ]', results)
            starts = res_split[0]
            wins = res_split[1]
            seconds = res_split[2]
            thirds = res_split[3]
            try:
                prizemoney = res_split[4]
            except IndexError:
                # results field has no prizemoney part; fall back to 0
                prizemoney = 0
            trainer = record[4]
            location = record[5]
            print(name, wins, seconds)
            output_table.append((meeting, date, race, number, name,
                                 starts, wins, seconds, thirds, prizemoney,
                                 trainer, location))
    return output_table


MY_FILE = out_file_name(FILENAME)

# with open(FILENAME, 'r') as f_in, open(MY_FILE, 'w') as f_out:
#     for line in race_table(f_in.readline()):
#         new_row = line
with open(FILENAME, 'r') as f_in, open(MY_FILE, 'w') as f_out:
    CONTENT = csv.reader(f_in)
    # print(content)
    FILE_CONTENTS = race_table(CONTENT)
    # print new_name
    f_out.write(str(FILE_CONTENTS))

if __name__ == '__main__':
    pass
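
One possible refinement for the output step: `str()` on a list of tuples writes a Python repr, not CSV, so the result cannot be read back with the csv module. A minimal sketch using `csv.writer` instead, assuming Python 3. The sample row values and the `write_rows` helper are hypothetical, and an in-memory buffer stands in for the real output file:

```python
import csv
import io

# Hypothetical row in the same 12-field shape race_table() returns.
rows = [
    ("Rosehill", "05/07/2014", "1", "3", "Inz'n'out", "10", "2", "1", "3", 0,
     "Michael, Wayne & John Hawkes", "Randwick"),
]


def write_rows(rows, f_out):
    """Write each tuple as one CSV line, quoting fields only where needed."""
    writer = csv.writer(f_out)
    writer.writerows(rows)


# Stand-in for: open(MY_FILE, 'w', newline='')
buffer = io.StringIO()
write_rows(rows, buffer)
print(buffer.getvalue())
```

The writer puts double quotes around the trainer field because it contains commas, so reading the output back with `csv.reader` gives the same twelve fields.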
Sayth
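
A note on the optional prizemoney part of the results field: a `finally:` clause runs whether or not an exception was raised, so a default value belongs in `except IndexError:` instead. A minimal sketch, assuming the field looks like `starts-wins-seconds-thirds` with an optional fifth part after a dash or space. That format, the sample strings, and the `split_results` helper are assumptions, since the real data is not shown in the thread:

```python
import re


def split_results(results):
    """Return (starts, wins, seconds, thirds, prizemoney) from one field.

    Assumed format: 'starts-wins-seconds-thirds', optionally followed by
    a dash or space and a prizemoney figure.
    """
    parts = re.split('[- ]', results)
    # Raises ValueError if fewer than four parts are present.
    starts, wins, seconds, thirds = parts[:4]
    try:
        prizemoney = parts[4]
    except IndexError:
        # Only reached when the fifth part is missing.
        prizemoney = 0
    return starts, wins, seconds, thirds, prizemoney


print(split_results("10-2-1-3"))        # ('10', '2', '1', '3', 0)
print(split_results("10-2-1-3 45000"))  # ('10', '2', '1', '3', '45000')
```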