


Re: Help parsing a text file

Subject Re: Help parsing a text file
From Philip Semanchuk <philip@semanchuk.com>
In-Reply-To <j3glai$1mu$1@dont-email.me>
Date Mon, 29 Aug 2011 14:31:13 -0400
Newsgroups comp.lang.python
Message-ID <mailman.552.1314642679.27778.python-list@python.org>



On Aug 29, 2011, at 2:21 PM, William Gill wrote:

> I haven't done much with Python for a couple years, bouncing around between other languages and scripts as needs suggest, so I have some minor difficulty keeping Python functionality in my head, but I can overcome that as the cobwebs clear.  Though I do seem to keep tripping over the same Py2 -> Py3 syntax changes (old habits die hard).
> 
> I have a text file with XML-like records that I need to parse.  By XML-like I mean records have proper opening and closing tags, but fields don't have closing tags (they rely on line ends).  Not all fields appear in all records, but they do adhere to a defined sequence.
> 
> My initial passes into Python have been very unfocused (a scatter gun of too many possible directions, yielding very messy results), so I'm asking for some suggestions, or algorithms (possibly even examples) that may help me focus.
> 
> I'm not asking anyone to write my code, just to nudge me toward a more disciplined approach to a common task, and I promise to put in the effort to understand the underlying fundamentals.

If the syntax really is close to XML, would it be all that difficult to convert it to proper XML? Then you have nice libraries like ElementTree to use for parsing.
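
As a rough sketch of what that conversion might look like: the record layout below is only a guess, since the actual file isn't shown in this thread. It assumes each field sits on its own line as an opening tag followed by its value, and that record tags sit alone on their lines. The names SAMPLE, FIELD_LINE, to_xml and parse_records are made up for illustration. The idea is to close each field tag, escape the value, wrap everything in a single root element, and hand the result to ElementTree.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical sample data; the real record and field names are not
# given in the thread.
SAMPLE = """\
<record>
<name>Alice Smith
<email>alice@example.com
</record>
<record>
<name>Bob & Co
</record>
"""

# Assumed shape of a field line: an opening tag followed by a non-empty
# value that runs to the end of the line. Record open/close tags have no
# value after them, so they do not match.
FIELD_LINE = re.compile(r"^\s*<(\w+)>\s*(\S.*)$")


def to_xml(text, root="records"):
    """Close unclosed field tags and wrap the records in one root element."""
    out = ["<%s>" % root]
    for line in text.splitlines():
        match = FIELD_LINE.match(line)
        if match:
            tag, value = match.groups()
            # Escape &, < and > so the value is legal XML character data.
            out.append("<%s>%s</%s>" % (tag, escape(value.strip()), tag))
        else:
            # Record tags and blank lines pass through unchanged.
            out.append(line)
    out.append("</%s>" % root)
    return "\n".join(out)


def parse_records(text):
    """Return one dict per record, mapping field tag to field text."""
    tree = ET.fromstring(to_xml(text))
    return [{field.tag: field.text for field in record} for record in tree]


records = parse_records(SAMPLE)
```

Because each record becomes a dict, fields that are missing from a record are simply absent keys, which fits the "not all fields appear in all records" case. If the real file differs from the assumed layout (values spanning several lines, attributes on tags, fields that already have closing tags), the regular expression would need adjusting.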


Cheers
Philip



