
RE: extract HTML table in a structured format

Started by	"Prasad, Ramit" <ramit.prasad@jpmorgan.com>
First post	2013-04-12 22:00 +0000
Last post	2013-04-12 22:00 +0000
Articles	1 — 1 participant


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


#43487 — RE: extract HTML table in a structured format

From	"Prasad, Ramit" <ramit.prasad@jpmorgan.com>
Date	2013-04-12 22:00 +0000
Subject	RE: extract HTML table in a structured format
Message-ID	<mailman.538.1365804231.3114.python-list@python.org>

Jabba Laci wrote:
> Hi,
> 
> I wonder if there is a nice way to extract a whole HTML table and have the result in a nice structured
> format. What I want is to have the lifetime table at the bottom of this page:
> http://en.wikipedia.org/wiki/List_of_Ubuntu_releases (then figure out with a script until when my
> Ubuntu release is supported).
> 
> I could do it with BeautifulSoup or lxml but is there a better way? There should be :)
> 

I know you have already answered your own question, but I thought this might
be helpful in the future.

Wikipedia has an API for programmatic access.
http://www.mediawiki.org/wiki/API 
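
As a rough illustration of what going through the API could look like, here is a minimal Python 3 sketch that requests the raw wikitext of the page instead of scraping the rendered HTML. The endpoint URL, the choice of action=parse with prop=wikitext, the User-Agent string, and the helper names build_query_url and fetch_wikitext are all my assumptions for the example, not something from the original question; the table would still have to be pulled out of the returned wikitext afterwards.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint for the English Wikipedia; other wikis use their own host.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_query_url(title, endpoint=API_ENDPOINT):
    """Build a MediaWiki API URL asking for the wikitext of one page."""
    params = {
        "action": "parse",      # parse a page and return selected properties
        "page": title,          # page title, e.g. "List_of_Ubuntu_releases"
        "prop": "wikitext",     # only the raw wiki markup is requested
        "format": "json",
        "formatversion": "2",   # newer JSON layout: wikitext is a plain string
    }
    return endpoint + "?" + urlencode(params)


def fetch_wikitext(title):
    """Download the page's wikitext (needs network access)."""
    request = Request(
        build_query_url(title),
        # Hypothetical client name; a descriptive User-Agent is good manners.
        headers={"User-Agent": "table-extract-example/0.1"},
    )
    with urlopen(request, timeout=30) as response:
        data = json.load(response)
    return data["parse"]["wikitext"]
```

Calling fetch_wikitext("List_of_Ubuntu_releases") should return the page source as one string, in which tables are delimited by "{|" and "|}" lines. That markup tends to be easier to split into rows than the rendered HTML, though templates inside the cells may still need some cleanup.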


~Ramit


