Re: fixing an horrific formatted csv file.

References <47e2e29d-b5c3-4aa6-abf9-3b1e46eb0dec@googlegroups.com> <mailman.11385.1404247829.18130.python-list@python.org> <0d3871c6-81d4-4168-9408-ad85299b0955@googlegroups.com>
Date 2014-07-02 11:20 +1000
Subject Re: fixing an horrific formatted csv file.
From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.11392.1404264061.18130.python-list@python.org>


On Wed, Jul 2, 2014 at 7:41 AM, flebber <flebber.crue@gmail.com> wrote:
> I understand why providing full solutions is frowned upon, because it doesn't assist in learning. Which is true, but it's incredibly helpful in this case.

In this case, my main reason for not providing a full solution is that
the work tends to be iterative. When I have a huge and messy file,
what I usually do is grab the first half-dozen lines and work out how
I'd go about fixing them manually, then write a script that does that.
Then run the script on the whole file, and see where it either chokes
or produces wrong data. Pick up the first few lines of wrong data,
figure out how to tweak the program to handle those. Rinse and repeat.
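
As a rough illustration of that loop (not code from this thread, just a
minimal sketch), something like the following could serve as a starting
point. The three-column layout, the sample lines, and the names
fix_line and run are all assumptions made up for the example; the real
rules depend entirely on what your file looks like.

```python
import csv

EXPECTED_FIELDS = 3  # assumption: a clean row has exactly three columns


def fix_line(line):
    """Return a cleaned list of fields, or raise ValueError if the line
    does not match any pattern handled so far."""
    fields = next(csv.reader([line]))
    fields = [f.strip() for f in fields]
    # Rule added after looking at the first few bad lines: drop empty
    # trailing fields left by stray commas at the end of a line.
    while len(fields) > EXPECTED_FIELDS and fields[-1] == "":
        fields.pop()
    if len(fields) != EXPECTED_FIELDS:
        raise ValueError(f"expected {EXPECTED_FIELDS} fields, got {len(fields)}")
    return fields


def run(lines):
    """Apply fix_line to every line; collect successes and failures."""
    good, bad = [], []
    for lineno, line in enumerate(lines, start=1):
        try:
            good.append(fix_line(line))
        except ValueError as exc:
            bad.append((lineno, line, str(exc)))
    return good, bad


# Hypothetical sample standing in for the first few lines of the file.
sample = [
    "1, Alpha , 10",
    "2,Beta,20,,",
    "3,Gamma",
]
good, bad = run(sample)
```

The first few entries in bad are the lines to study next: work out a
rule that handles them, add it to fix_line, and run the whole file
again.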

Often, what that results in is a file that gets progressively tidier.
When the scope of the mess is infinite (like with human-entered data -
believe you me, you haven't seen messy until you've seen what a
committee can do to a simple job), this means you stop working on the
script at exactly the point where it stops being worth the effort -
which is something that only you can decide.
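
One way to get a feel for when that point is approaching is to measure
how many lines each version of the script still rejects. This is only a
sketch under made-up assumptions: failure_rate, fixer_v1, fixer_v2 and
the sample data are hypothetical, and the number is just an input to
your own judgement, not a rule.

```python
def failure_rate(lines, fixer):
    """Fraction of lines that the fixer still rejects (0.0 if no lines)."""
    lines = list(lines)
    if not lines:
        return 0.0
    failures = 0
    for line in lines:
        try:
            fixer(line)
        except ValueError:
            failures += 1
    return failures / len(lines)


def fixer_v1(line):
    # First attempt: only accepts clean three-field comma-separated lines.
    fields = line.split(",")
    if len(fields) != 3:
        raise ValueError(line)
    return fields


def fixer_v2(line):
    # Second attempt: same as v1, but also tolerates semicolons that
    # were used as separators.
    return fixer_v1(line.replace(";", ","))


# Hypothetical sample data.
sample = ["a,b,c", "d;e;f", "g,h", "i,j,k"]
rate_v1 = failure_rate(sample, fixer_v1)
rate_v2 = failure_rate(sample, fixer_v2)
```

If the rate barely moves from one version to the next, the remaining
lines may be cheaper to fix by hand than to script.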

ChrisA
