
RE: Fetching data from a HTML file

From "Prasad, Ramit" <ramit.prasad@jpmorgan.com>
To "python-list@python.org" <python-list@python.org>
Subject RE: Fetching data from a HTML file
Date Fri, 23 Mar 2012 15:08:03 +0000
Newsgroups comp.lang.python



> Actually, I'm working on ROBOT Framework, and haven't been able to figure
> out how to read data from HTML tables. Reading from the source is the best
> (read: most rudimentary) way I could come up with. Any suggestions are welcome!

> I've got to fetch data from the snippet below and have been trying to match
> the digits in it to specific groups. But I can't seem to figure out how to
> go about stripping the tags! :(

In addition to Simon's response, you may want to look at Beautiful Soup,
which I hear is good at dealing with malformed HTML.
http://www.crummy.com/software/BeautifulSoup/
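
For the tag-stripping part of the question, here is a minimal sketch that
uses only the standard library's html.parser module, swapped in so the
example runs without installing anything; Beautiful Soup is a third-party
package with a friendlier API for the same job. The class name
TableTextParser, the helper table_rows, and the SAMPLE markup are all made
up for illustration, since the snippet from the original question is not
shown here.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def _close_cell(self):
        # Join the fragments of the open cell (if any) and store it in the row.
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
        # Any other tag (e.g. <b>) is ignored, which is what strips it.

    def handle_data(self, data):
        # Keep text only while inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr" and self._row is not None:
            self._close_cell()  # tolerate a missing </td> before </tr>
            self.rows.append(self._row)
            self._row = None


def table_rows(html):
    """Return the table contents as a list of rows of plain-text cells."""
    parser = TableTextParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample markup, not the poster's actual HTML.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Count</th></tr>
  <tr><td>alpha</td><td><b>12</b></td></tr>
  <tr><td>beta</td><td>7</td></tr>
</table>
"""

if __name__ == "__main__":
    for row in table_rows(SAMPLE):
        print(row)
```

Once the cells are plain strings, picking out the digits is ordinary string
or regex work on each row, rather than a regex over the raw HTML source.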



Ramit


Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
712 Main Street | Houston, TX 77002
work phone: 713 - 216 - 5423




Thread

Fetching data from a HTML file Sangeet <mrsangeet@gmail.com> - 2012-03-23 06:52 -0700
  RE: Fetching data from a HTML file "Prasad, Ramit" <ramit.prasad@jpmorgan.com> - 2012-03-23 15:08 +0000
  Re: Fetching data from a HTML file Daniel Fetchinson <fetchinson@googlemail.com> - 2012-03-23 16:28 +0100
  Re: Fetching data from a HTML file Jon Clements <joncle@googlemail.com> - 2012-03-23 22:12 -0700
    Re: Fetching data from a HTML file John Nagle <nagle@animats.com> - 2012-03-24 14:04 -0700
