


HTML Parser

Started by: subhabangalore@gmail.com
First post: 2013-07-02 10:43 -0700
Last post: 2013-07-03 01:13 +0100
Articles: 4 (4 participants)



Contents

  HTML Parser subhabangalore@gmail.com - 2013-07-02 10:43 -0700
    Re: HTML Parser Neil Cerutti <neilc@norwich.edu> - 2013-07-02 17:57 +0000
    Re: HTML Parser Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-02 23:53 +0000
    Re: HTML Parser Joshua Landau <joshua.landau.ws@gmail.com> - 2013-07-03 01:13 +0100

#49643 — HTML Parser

From: subhabangalore@gmail.com
Date: 2013-07-02 10:43 -0700
Subject: HTML Parser
Message-ID: <b4aba93c-f832-4af4-8c48-02b1f8f6b1cd@googlegroups.com>

Dear Group,

I am looking for a good tutorial on an "HTML Parser". My intention is to extract tables, or information from tables, in web pages.

I searched and found HTMLParser, BeautifulSoup, etc. HTMLParser works fine for me, but I am looking for a good tutorial to learn it properly.

I could not use BeautifulSoup as I did not find an .exe file.

I am using Python 2.7 on Windows 7 SP1 (64 bit). 

I am looking for a good tutorial for HTMLParser, or for any similar parser that has an .exe file for my environment and a good tutorial.

I would be grateful if any of the learned members could suggest one.

Thanking You in Advance,
Regards,
Subhabrata.



#49644

From: Neil Cerutti <neilc@norwich.edu>
Date: 2013-07-02 17:57 +0000
Message-ID: <b3gik2Fa73jU1@mid.individual.net>
In reply to: #49643

On 2013-07-02, subhabangalore@gmail.com <subhabangalore@gmail.com> wrote:
> Dear Group,
>
> I am looking for a good tutorial on an "HTML Parser". My
> intention is to extract tables, or information from tables,
> in web pages.
>
> I searched and found HTMLParser, BeautifulSoup, etc.
> HTMLParser works fine for me, but I am looking for a good
> tutorial to learn it properly.

Have a read of the topic "Parsing, creating, and Manipulating
HTML Documents" in chapter five of Text Processing in Python:

http://gnosis.cx/TPiP/chap5.txt
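
For readers who want to stay with the standard-library HTMLParser mentioned in the original post, the following is a minimal sketch of pulling table cells out of a page. It is not taken from the linked chapter. The names TableExtractor and extract_tables and the sample HTML are illustrative assumptions, and the sketch deliberately ignores nested tables and malformed markup.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2.7, as in the original post


class TableExtractor(HTMLParser):
    """Collect each <table> as a list of rows; each row is a list of cell strings.

    Hypothetical helper for illustration: nested tables are not handled.
    """

    def __init__(self):
        HTMLParser.__init__(self)
        self.tables = []   # one entry per <table> seen so far
        self._row = None   # cells of the <tr> currently open, if any
        self._cell = None  # text fragments of the <td>/<th> currently open

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            if self._cell is not None and self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            if self._row is not None:
                self.tables[-1].append(self._row)
            self._row = None
            self._cell = None


def extract_tables(html):
    """Parse an HTML string and return the tables found in it."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


sample = ("<table><tr><th>Name</th><th>Age</th></tr>"
          "<tr><td>Ann</td><td>30</td></tr></table>")
print(extract_tables(sample))
```

Nothing needs to be installed for this, since HTMLParser ships with Python. Under Python 2.7, character entities such as &amp; inside cells would need an extra handle_entityref method, which the sketch leaves out.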

-- 
Neil Cerutti



#49670

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-07-02 23:53 +0000
Message-ID: <51d367f5$0$29999$c3e8da3$5496439d@news.astraweb.com>
In reply to: #49643

On Tue, 02 Jul 2013 10:43:03 -0700, subhabangalore wrote:

> I could not use BeautifulSoup as I did not find an .exe file.

I believe that BeautifulSoup is a pure-Python module, and so does not 
have a .exe file. However, it does have good tutorials:

https://duckduckgo.com/html/?q=beautifulsoup+tutorial


> I am looking for a good tutorial for HTMLParser, or for any similar
> parser that has an .exe file for my environment and a good tutorial.

Why do you care about a .exe file? Most Python libraries are .py files.


-- 
Steven



#49674

From: Joshua Landau <joshua.landau.ws@gmail.com>
Date: 2013-07-03 01:13 +0100
Message-ID: <mailman.4140.1372810479.3114.python-list@python.org>
In reply to: #49643

On 2 July 2013 18:43, <subhabangalore@gmail.com> wrote:
> I could not use BeautifulSoup as I did not find an .exe file.

Were you perhaps looking for a .exe file to install BeautifulSoup?
It's quite plausible that a Windows user like you might be puzzled by
the idea of a .tar.gz.

I suggest just using "pip install beautifulsoup4" at a command prompt.
See http://stackoverflow.com/questions/12228102/how-to-install-beautiful-soup-4-with-python-2-7-on-windows
for explanations -- there are links for things you need to know.

But basically, use BeautifulSoup. It does what you need.
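
As a rough illustration of the table-extraction task from the original post, a BeautifulSoup version might look like the sketch below. It assumes the beautifulsoup4 package has been installed with pip as described above. The function name tables_with_bs4 is hypothetical, and the guarded import is only there so the snippet loads even where the package is missing.

```python
try:
    from bs4 import BeautifulSoup  # provided by: pip install beautifulsoup4
except ImportError:
    BeautifulSoup = None  # package not installed yet


def tables_with_bs4(html):
    """Return every table in the HTML string as a list of rows of cell text.

    Hypothetical helper for illustration. find_all searches recursively,
    so rows of a nested table also appear in its enclosing table.
    """
    # "html.parser" selects Python's built-in parser, so no extra
    # parser package (such as lxml) is required.
    soup = BeautifulSoup(html, "html.parser")
    tables = []
    for table in soup.find_all("table"):
        rows = []
        for tr in table.find_all("tr"):
            cells = tr.find_all(["td", "th"])
            rows.append([cell.get_text(strip=True) for cell in cells])
        tables.append(rows)
    return tables
```

Calling tables_with_bs4 on the text of a downloaded page returns one list per table, which can then be written out with the csv module or processed further.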


