Groups > comp.lang.python > #49643 > unrolled thread
| Started by | subhabangalore@gmail.com |
|---|---|
| First post | 2013-07-02 10:43 -0700 |
| Last post | 2013-07-03 01:13 +0100 |
| Articles | 4 — 4 participants |
HTML Parser subhabangalore@gmail.com - 2013-07-02 10:43 -0700
Re: HTML Parser Neil Cerutti <neilc@norwich.edu> - 2013-07-02 17:57 +0000
Re: HTML Parser Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-02 23:53 +0000
Re: HTML Parser Joshua Landau <joshua.landau.ws@gmail.com> - 2013-07-03 01:13 +0100
| From | subhabangalore@gmail.com |
|---|---|
| Date | 2013-07-02 10:43 -0700 |
| Subject | HTML Parser |
| Message-ID | <b4aba93c-f832-4af4-8c48-02b1f8f6b1cd@googlegroups.com> |
Dear Group,

I was looking for a good tutorial for a "HTML Parser". My intention was to extract tables from web pages or information from tables in web pages.

I tried to make a search, I got HTMLParser, BeautifulSoup, etc. HTMLParser works fine for me, but I am looking for a good tutorial to learn it nicely.

I could not use BeautifulSoup as I did not find an .exe file.

I am using Python 2.7 on Windows 7 SP1 (64 bit).

I am looking for a good tutorial for HTMLParser or any similar parser which have an .exe file for my environment and a good tutorial.

If anyone of the learned members can kindly suggest.

Thanking You in Advance,
Regards,
Subhabrata.
| From | Neil Cerutti <neilc@norwich.edu> |
|---|---|
| Date | 2013-07-02 17:57 +0000 |
| Message-ID | <b3gik2Fa73jU1@mid.individual.net> |
| In reply to | #49643 |
On 2013-07-02, subhabangalore@gmail.com <subhabangalore@gmail.com> wrote:
> Dear Group,
>
> I was looking for a good tutorial for a "HTML Parser". My
> intention was to extract tables from web pages or information
> from tables in web pages.
>
> I tried to make a search, I got HTMLParser, BeautifulSoup, etc.
> HTMLParser works fine for me, but I am looking for a good
> tutorial to learn it nicely.

Take a read of the topic "Parsing, creating, and Manipulating HTML Documents" from chapter five of Text Processing in Python.

http://gnosis.cx/TPiP/chap5.txt

--
Neil Cerutti
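As a rough illustration of the table-extraction task described in the question, the sketch below subclasses the standard library's HTMLParser and collects cell text row by row. The names `TableExtractor` and `SAMPLE` are made up for this example, and the code assumes simple, non-nested tables; it is a starting point under those assumptions, not a general-purpose scraper. The guarded import covers both the Python 2.7 module name used in the thread and the Python 3 one.

```python
try:
    from html.parser import HTMLParser  # Python 3 module name
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7, as used in the question


class TableExtractor(HTMLParser):
    """Collect each <table> as a list of rows; each row is a list of cell strings.

    Hypothetical helper for illustration only. Nested tables are not handled.
    """

    def __init__(self):
        HTMLParser.__init__(self)
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row currently being read, or None
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


# Made-up sample markup standing in for a downloaded web page.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Year</th></tr>
  <tr><td>Python</td><td>1991</td></tr>
</table>
"""

parser = TableExtractor()
parser.feed(SAMPLE)
parser.close()
print(parser.tables)  # [[['Name', 'Year'], ['Python', '1991']]]
```

The parser is event-driven: it never builds a document tree, so the subclass has to track "which row and cell am I in" itself, which is why the three handler methods share the `_row` and `_cell` state.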
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-07-02 23:53 +0000 |
| Message-ID | <51d367f5$0$29999$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #49643 |
On Tue, 02 Jul 2013 10:43:03 -0700, subhabangalore wrote:

> I could not use BeautifulSoup as I did not find an .exe file.

I believe that BeautifulSoup is a pure-Python module, and so does not have a .exe file. However, it does have good tutorials:

https://duckduckgo.com/html/?q=beautifulsoup+tutorial

> I am looking for a good tutorial for HTMLParser or any similar parser
> which have an .exe file for my environment and a good tutorial.

Why do you care about a .exe file? Most Python libraries are .py files.

--
Steven
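The point that most Python libraries are plain .py files can be checked from the interpreter. The sketch below asks the standard `inspect` module where a module's source lives; `json` is used only as a convenient stand-in for any pure-Python module, and on a typical CPython install the result is a path ending in `.py`.

```python
import inspect
import json  # stand-in for any pure-Python module; chosen only for illustration

# Ask where the module's source code lives on disk.
source_path = inspect.getsourcefile(json)
print(source_path)
```

The same check on an installed third-party package such as BeautifulSoup should likewise point at a .py file inside the Python installation rather than at an executable.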
| From | Joshua Landau <joshua.landau.ws@gmail.com> |
|---|---|
| Date | 2013-07-03 01:13 +0100 |
| Message-ID | <mailman.4140.1372810479.3114.python-list@python.org> |
| In reply to | #49643 |
On 2 July 2013 18:43, <subhabangalore@gmail.com> wrote:

> I could not use BeautifulSoup as I did not find an .exe file.

Were you perhaps looking for a .exe file to install BeautifulSoup? It's quite plausible that a windows user like you might be dazzled at the idea of a .tar.gz.

I suggest just using "pip install beautifulsoup4" at a command prompt. See http://stackoverflow.com/questions/12228102/how-to-install-beautiful-soup-4-with-python-2-7-on-windows for explanations -- there are links for things you need to know.

But basically, use BeautifulSoup. It does what you need.
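For a sense of what the BeautifulSoup route looks like for the table-extraction task, here is a minimal sketch. It assumes the `beautifulsoup4` package has already been installed with the pip command above; the function name `extract_tables` and the sample markup are invented for this example, and nested tables are not treated specially.

```python
from bs4 import BeautifulSoup  # third-party; assumes "pip install beautifulsoup4"


def extract_tables(html):
    """Return every <table> in html as a list of rows of cell strings.

    Hypothetical helper for illustration only.
    """
    # "html.parser" selects the standard library parser, so no extra
    # parser package is needed.
    soup = BeautifulSoup(html, "html.parser")
    tables = []
    for table in soup.find_all("table"):
        rows = []
        for tr in table.find_all("tr"):
            cells = [cell.get_text(strip=True)
                     for cell in tr.find_all(["td", "th"])]
            rows.append(cells)
        tables.append(rows)
    return tables


# Made-up sample markup standing in for a downloaded web page.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Year</th></tr>
  <tr><td>Python</td><td>1991</td></tr>
</table>
"""

result = extract_tables(SAMPLE)
print(result)
```

Because BeautifulSoup builds a tree of the whole document first, the extraction is a pair of nested `find_all` loops, with no need to track parser state by hand.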