| Path | csiph.com!usenet.pasdenom.info!news.albasani.net!newsfeed.freenet.ag!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <ramit.prasad@jpmorgan.com> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| From | "Prasad, Ramit" <ramit.prasad@jpmorgan.com> |
| To | Jabba Laci <jabba.laci@gmail.com>, Python mailing list <python-list@python.org> |
| Subject | RE: extract HTML table in a structured format |
| Thread-Topic | extract HTML table in a structured format |
| Thread-Index | AQHONcen0NI73OtFukacjqbNNcLb3JjTJeyA |
| Date | Fri, 12 Apr 2013 22:00:25 +0000 |
| References | <CAOuJsM=u75nv-TxVCpXdcxmfyhxyY0v-NTYPEeGh1MmMuzxCVg@mail.gmail.com> |
| In-Reply-To | <CAOuJsM=u75nv-TxVCpXdcxmfyhxyY0v-NTYPEeGh1MmMuzxCVg@mail.gmail.com> |
| Accept-Language | en-US |
| Content-Language | en-US |
| X-MS-Has-Attach | |
| X-MS-TNEF-Correlator | |
| x-originating-ip | [10.67.79.47] |
| Content-Transfer-Encoding | quoted-printable |
| MIME-Version | 1.0 |
| X-DLP-FWD | Yes |
| Content-Type | text/plain; charset="us-ascii" |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.15 |
| Precedence | list |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| List-Unsubscribe | <http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe> |
| List-Archive | <http://mail.python.org/pipermail/python-list/> |
| List-Post | <mailto:python-list@python.org> |
| List-Help | <mailto:python-list-request@python.org?subject=help> |
| List-Subscribe | <http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.538.1365804231.3114.python-list@python.org> |
| Lines | 15 |
| NNTP-Posting-Host | 2001:888:2000:d::a6 |
| X-Trace | 1365804231 news.xs4all.nl 2628 [2001:888:2000:d::a6]:44138 |
| X-Complaints-To | abuse@xs4all.nl |
| Xref | csiph.com comp.lang.python:43487 |
Jabba Laci wrote:
> Hi,
>
> I wonder if there is a nice way to extract a whole HTML table and have the result in a nice structured
> format. What I want is to have the lifetime table at the bottom of this page:
> http://en.wikipedia.org/wiki/List_of_Ubuntu_releases (then figure out with a script until when my
> Ubuntu release is supported).
>
> I could do it with BeautifulSoup or lxml but is there a better way? There should be :)
>

I know you already answered your question, but thought this might be helpful in the future. Wikipedia has an API for programmatic access.
http://www.mediawiki.org/wiki/API

~Ramit

This email is confidential and subject to important disclaimers and conditions including on offers for the purchase or sale of securities, accuracy and completeness of information, viruses, confidentiality, legal privilege, and legal entity disclaimers, available at http://www.jpmorgan.com/pages/disclosures/email.
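For anyone finding this thread later, here is a minimal sketch of what going through the MediaWiki API instead of scraping the rendered page might look like. It is not from the original message. It assumes the English Wikipedia endpoint at https://en.wikipedia.org/w/api.php and the `action=parse` module with `formatversion=2`; the function names and the User-Agent string are made up for illustration. It only retrieves the page's wikitext, so picking the release table out of that text is still left to you.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for the English Wikipedia; other wikis use their own api.php URL.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_parse_url(page, prop="wikitext"):
    """Build an action=parse request URL for one page (hypothetical helper)."""
    params = {
        "action": "parse",
        "page": page,
        "prop": prop,            # "wikitext" for raw markup, "text" for rendered HTML
        "format": "json",
        "formatversion": "2",    # with version 2 the wikitext comes back as a plain string
    }
    return API_ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_wikitext(page):
    """Download the wikitext of a page. Needs network access."""
    request = urllib.request.Request(
        build_parse_url(page),
        # Illustrative User-Agent; use one that identifies your own script.
        headers={"User-Agent": "table-extract-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        data = json.load(response)
    return data["parse"]["wikitext"]


if __name__ == "__main__":
    text = fetch_wikitext("List_of_Ubuntu_releases")
    # Wikitext tables start with "{|"; counting them is a quick sanity check.
    print(text.count("{|"), "tables found in the wikitext")
```

Requesting `prop="text"` instead returns the rendered HTML, which could then be handed to BeautifulSoup or lxml as mentioned in the question.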