| Path | csiph.com!usenet.pasdenom.info!weretis.net!feeder4.news.weretis.net!rt.uk.eu.org!newsfeed.xs4all.nl!newsfeed2a.news.xs4all.nl!xs4all!news.tele.dk!news.tele.dk!small.news.tele.dk!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <rosuav@gmail.com> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| MIME-Version | 1.0 |
| X-Received | by 10.221.55.70 with SMTP id vx6mr29851976vcb.23.1404264053513; Tue, 01 Jul 2014 18:20:53 -0700 (PDT) |
| In-Reply-To | <0d3871c6-81d4-4168-9408-ad85299b0955@googlegroups.com> |
| References | <47e2e29d-b5c3-4aa6-abf9-3b1e46eb0dec@googlegroups.com> <mailman.11385.1404247829.18130.python-list@python.org> <0d3871c6-81d4-4168-9408-ad85299b0955@googlegroups.com> |
| Date | Wed, 2 Jul 2014 11:20:53 +1000 |
| Subject | Re: fixing an horrific formatted csv file. |
| From | Chris Angelico <rosuav@gmail.com> |
| Cc | "python-list@python.org" <python-list@python.org> |
| Content-Type | text/plain; charset=UTF-8 |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.15 |
| Precedence | list |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| List-Unsubscribe | <https://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe> |
| List-Archive | <http://mail.python.org/pipermail/python-list/> |
| List-Post | <mailto:python-list@python.org> |
| List-Help | <mailto:python-list-request@python.org?subject=help> |
| List-Subscribe | <https://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.11392.1404264061.18130.python-list@python.org> (permalink) |
| Lines | 19 |
| NNTP-Posting-Host | 2001:888:2000:d::a6 |
| X-Trace | 1404264061 news.xs4all.nl 2829 [2001:888:2000:d::a6]:48972 |
| X-Complaints-To | abuse@xs4all.nl |
| Xref | csiph.com comp.lang.python:73819 |
On Wed, Jul 2, 2014 at 7:41 AM, flebber <flebber.crue@gmail.com> wrote:
> I understand why providing full solutions is frowned upon, because it doesn't assist in learning. Which is true, it's incredibly helpful in this case.

In this case, my main reason for not providing a full solution is that the work tends to be iterative. When I have a huge and messy file, what I usually do is grab the first half-dozen lines and work out how I'd go about fixing them manually, then write a script that does that. Then run the script on the whole file, and see where it either chokes or produces wrong data. Pick up the first few lines of wrong data, figure out how to tweak the program to handle those. Rinse and repeat.

Often, what that results in is a file that gets progressively tidier. When the scope of the mess is infinite (like with human-entered data - believe you me, you haven't seen messy until you've seen what a committee can do to a simple job), this means you stop working on the script at exactly the point where it stops being worth the effort - which is something that only you can decide.

ChrisA
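The loop described above could be sketched roughly as follows. This is only an illustration, not code from the thread: the three-column layout, the two cleanup rules, the sample data, and the names `fix_row` and `run_pass` are all assumptions made up for the example. The idea is that one pass over the file applies the fixes known so far and reports the first few rows that still fail, so they can be inspected by hand.

```python
import csv
import io

EXPECTED_FIELDS = 3  # assumption: a clean row has exactly three columns


def fix_row(row):
    """Apply the fixes worked out so far; raise ValueError on anything new."""
    # Rule 1 (hypothetical): strip stray whitespace around every field.
    row = [field.strip() for field in row]
    # Rule 2 (hypothetical): drop empty trailing fields left by extra commas.
    while row and row[-1] == "":
        row.pop()
    if len(row) != EXPECTED_FIELDS:
        raise ValueError(f"expected {EXPECTED_FIELDS} fields, got {len(row)}")
    return row


def run_pass(lines, max_failures=6):
    """Run fix_row over every line; return (clean_rows, first_few_failures)."""
    clean, failures = [], []
    for lineno, row in enumerate(csv.reader(lines), start=1):
        try:
            clean.append(fix_row(row))
        except ValueError as exc:
            # Keep only the first few bad rows: those are the ones to study next.
            if len(failures) < max_failures:
                failures.append((lineno, row, str(exc)))
    return clean, failures


# Made-up sample standing in for the messy file.
sample = io.StringIO(
    "1, Alice ,10\n"
    "2,Bob,20,,\n"
    "3,Carol\n"
)
clean, failures = run_pass(sample)
for lineno, row, reason in failures:
    print(f"line {lineno}: {reason}: {row}")
```

Each time a reported failure is understood, a new rule goes into `fix_row` and the pass is run again. The loop ends when the remaining failures are cheaper to fix by hand than to script.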
fixing an horrific formatted csv file. flebber <flebber.crue@gmail.com> - 2014-07-01 07:04 -0700
Re: fixing an horrific formatted csv file. MRAB <python@mrabarnett.plus.com> - 2014-07-01 15:32 +0100
Re: fixing an horrific formatted csv file. "F.R." <anthra.norell@bluewin.ch> - 2014-07-01 22:49 +0200
Re: fixing an horrific formatted csv file. flebber <flebber.crue@gmail.com> - 2014-07-01 14:41 -0700
Re: fixing an horrific formatted csv file. Chris Angelico <rosuav@gmail.com> - 2014-07-02 11:20 +1000
Re: fixing an horrific formatted csv file. flebber <flebber.crue@gmail.com> - 2014-07-02 02:13 -0700
Re: fixing an horrific formatted csv file. "F.R." <anthra.norell@bluewin.ch> - 2014-07-02 17:51 +0200
Re: fixing an horrific formatted csv file. flebber <flebber.crue@gmail.com> - 2014-07-03 21:12 -0700
Re: fixing an horrific formatted csv file. Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-07-04 18:19 +1200
Re: fixing an horrific formatted csv file. flebber <flebber.crue@gmail.com> - 2014-07-04 03:48 -0700
Re: fixing an horrific formatted csv file. flebber <flebber.crue@gmail.com> - 2014-07-04 03:28 -0700
Re: fixing an horrific formatted csv file. "F.R." <anthra.norell@bluewin.ch> - 2014-07-04 15:24 +0200