Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #21289
| References | <12783654.1174.1331073814011.JavaMail.geo-discussion-forums@yner4> <mailman.442.1331074333.3037.python-list@python.org> <mailman.443.1331074966.3037.python-list@python.org> <28285433.1413.1331075139309.JavaMail.geo-discussion-forums@ynbq18> |
|---|---|
| From | Ian Kelly <ian.g.kelly@gmail.com> |
| Date | 2012-03-06 16:35 -0700 |
| Subject | Re: What's the best way to write this regular expression? |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.447.1331076963.3037.python-list@python.org> (permalink) |
On Tue, Mar 6, 2012 at 4:05 PM, John Salerno <johnjsal@gmail.com> wrote: >> Anything that allows me NOT to use REs is welcome news, so I look forward to learning about something new! :) > > I should ask though...are there alternatives already bundled with Python that I could use? Now that you mention it, I remember something called HTMLParser (or something like that) and I have no idea why I never looked into that before I messed with REs. HTMLParser is pretty basic, although it may be sufficient for your needs. It just converts an html document into a stream of start tags, end tags, and text, with no guarantee that the tags will actually correspond in any meaningful way. lxml can be used to output an actual hierarchical structure that may be easier to manipulate and extract data from. Cheers, Ian
Back to comp.lang.python | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-06 14:43 -0800
Re: What's the best way to write this regular expression? Chris Rebert <clp2@rebertia.com> - 2012-03-06 14:52 -0800
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-06 15:02 -0800
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-06 15:05 -0800
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-06 15:25 -0800
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-06 15:33 -0800
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-06 15:33 -0800
Re: What's the best way to write this regular expression? Ian Kelly <ian.g.kelly@gmail.com> - 2012-03-06 16:35 -0700
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-06 17:39 -0600
Re: What's the best way to write this regular expression? Terry Reedy <tjreedy@udel.edu> - 2012-03-06 20:04 -0500
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-06 15:05 -0800
Re: What's the best way to write this regular expression? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-03-06 23:44 +0000
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-06 15:57 -0800
RE: What's the best way to write this regular expression? "Prasad, Ramit" <ramit.prasad@jpmorgan.com> - 2012-03-07 00:04 +0000
Re: What's the best way to write this regular expression? Terry Reedy <tjreedy@udel.edu> - 2012-03-06 20:06 -0500
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-06 15:02 -0800
Re: What's the best way to write this regular expression? Roy Smith <roy@panix.com> - 2012-03-06 20:26 -0500
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-06 23:02 -0800
Re: What's the best way to write this regular expression? Paul Rubin <no.email@nospam.invalid> - 2012-03-07 02:36 -0800
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-07 12:39 -0800
Re: What's the best way to write this regular expression? Ian Kelly <ian.g.kelly@gmail.com> - 2012-03-07 14:01 -0700
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-07 15:11 -0600
Re: What's the best way to write this regular expression? alex23 <wuwei23@gmail.com> - 2012-03-08 19:38 -0800
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-08 19:52 -0800
Re: What's the best way to write this regular expression? Benjamin Kaplan <benjamin.kaplan@case.edu> - 2012-03-07 16:27 -0500
RE: What's the best way to write this regular expression? "Prasad, Ramit" <ramit.prasad@jpmorgan.com> - 2012-03-07 21:31 +0000
Re: What's the best way to write this regular expression? Ian Kelly <ian.g.kelly@gmail.com> - 2012-03-07 14:34 -0700
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-07 15:44 -0600
Re: RE: What's the best way to write this regular expression? Evan Driscoll <driscoll@cs.wisc.edu> - 2012-03-07 16:02 -0600
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-07 23:26 -0800
Re: What's the best way to write this regular expression? Chris Angelico <rosuav@gmail.com> - 2012-03-08 16:03 +1100
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-07 23:25 -0800
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-08 13:33 -0800
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-08 13:40 -0800
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-08 13:52 -0800
Re: What's the best way to write this regular expression? John Gordon <gordon@panix.com> - 2012-03-08 21:54 +0000
Re: What's the best way to write this regular expression? Dave Angel <d@davea.name> - 2012-03-08 17:19 -0500
Re: What's the best way to write this regular expression? John Salerno <johnjsal@gmail.com> - 2012-03-08 16:25 -0600
RE: What's the best way to write this regular expression? "Prasad, Ramit" <ramit.prasad@jpmorgan.com> - 2012-03-08 23:02 +0000
Re: What's the best way to write this regular expression? Dave Angel <d@davea.name> - 2012-03-08 18:23 -0500
Re: What's the best way to write this regular expression? Ethan Furman <ethan@stoneleaf.us> - 2012-03-08 14:52 -0800
Re: What's the best way to write this regular expression? jkn <jkn_gg@nicorp.f9.co.uk> - 2012-03-09 02:45 -0800
csiph-web