Groups > comp.lang.python > #24438 > unrolled thread
| Started by | "Andrew D'Angelo" <excel@pharcyde.org> |
|---|---|
| First post | 2012-06-25 14:10 -0500 |
| Last post | 2012-06-26 10:20 -0500 |
| Articles | 3 — 2 participants |
vBulletin scraper -- feasible? "Andrew D'Angelo" <excel@pharcyde.org> - 2012-06-25 14:10 -0500
RE: vBulletin scraper -- feasible? Nick Cash <nick.cash@npcinternational.com> - 2012-06-25 19:44 +0000
Re: vBulletin scraper -- feasible? "Andrew D'Angelo" <excel@pharcyde.org> - 2012-06-26 10:20 -0500
| From | "Andrew D'Angelo" <excel@pharcyde.org> |
|---|---|
| Date | 2012-06-25 14:10 -0500 |
| Subject | vBulletin scraper -- feasible? |
| Message-ID | <jsad2l$2eu$1@dont-email.me> |
Taking a look through vBulletin's HTML, I was wondering whether it would be overly difficult to parse it into nice, manipulable data. My ultimate goal would be to dynamically parse a vBulletin forum and feed it into a locally hosted NNTP server.
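For the "feed it into NNTP" half of the goal, each scraped post would need to become a news article with the usual headers. The following is only a minimal sketch using the standard library's `email` package; the dict keys, the `forum.invalid` placeholder domain, and the `local.vbulletin` group name are assumptions made up for illustration, not anything vBulletin or an NNTP server prescribes.

```python
from email.message import EmailMessage
from email.utils import formataddr, formatdate


def post_to_article(post, newsgroup):
    """Turn one scraped post (a plain dict) into a Usenet-style article."""
    msg = EmailMessage()
    # Forum users have no e-mail address, so a placeholder address is used.
    msg["From"] = formataddr((post["author"], "nobody@forum.invalid"))
    msg["Newsgroups"] = newsgroup
    msg["Subject"] = post["subject"]
    msg["Date"] = formatdate(post["timestamp"], usegmt=True)
    # Deriving the Message-ID from the forum's post id keeps it stable
    # across repeated scrapes (hypothetical scheme).
    msg["Message-ID"] = f"<{post['id']}@forum.invalid>"
    msg.set_content(post["text"])
    return msg


# Hypothetical post data, shaped the way a scraper might return it.
example_post = {
    "id": "101",
    "author": "alice",
    "subject": "vBulletin scraper -- feasible?",
    "timestamp": 1340651400,
    "text": "Is scraping feasible?",
}

article = post_to_article(example_post, "local.vbulletin")
print(article.as_string())
```

How the resulting article is handed to the local server (for example via an NNTP `POST` command or by writing into the server's spool) depends on the server and is left open here.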
| From | Nick Cash <nick.cash@npcinternational.com> |
|---|---|
| Date | 2012-06-25 19:44 +0000 |
| Message-ID | <mailman.1495.1340654376.4697.python-list@python.org> |
| In reply to | #24438 |
You may want to look into http://www.crummy.com/software/BeautifulSoup/

It's made for parsing (potentially bad) HTML, and is quite easy to use. I'd say it's quite feasible.

Thanks,
Nick Cash
NPC International
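As a rough illustration of the BeautifulSoup suggestion, here is a minimal sketch that pulls post bodies out of a page. It assumes the third-party `bs4` package is installed and that each post body sits in a `div` whose id looks like `post_message_<number>`; real vBulletin markup differs between versions and skins, so the sample HTML and the id pattern are assumptions to adjust against the actual forum.

```python
import re

from bs4 import BeautifulSoup

# Made-up sample page. The second post has an unclosed <b> tag to show
# that sloppy markup is tolerated.
SAMPLE_HTML = """
<div id="post_message_101">Is scraping feasible?</div>
<div id="post_message_102"><b>Quite feasible</div>
<div id="sidebar">not a post</div>
"""

POST_ID = re.compile(r"^post_message_\d+$")


def extract_posts(html):
    """Return a list of {"id", "text"} dicts, one per post body found."""
    soup = BeautifulSoup(html, "html.parser")
    posts = []
    for div in soup.find_all("div", id=POST_ID):
        post_id = div["id"].rsplit("_", 1)[1]
        # Join text fragments with single spaces and trim whitespace.
        text = div.get_text(" ", strip=True)
        posts.append({"id": post_id, "text": text})
    return posts


posts = extract_posts(SAMPLE_HTML)
for post in posts:
    print(post["id"], post["text"])
```

Author names, timestamps and thread titles could be collected the same way once the matching elements in the real markup have been identified.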
| From | "Andrew D'Angelo" <excel@pharcyde.org> |
|---|---|
| Date | 2012-06-26 10:20 -0500 |
| Message-ID | <jscjv4$2u6$1@dont-email.me> |
| In reply to | #24441 |
Thanks, this seems to be just what I need.