RE: vBulletin scraper -- feasible?

From	Nick Cash <nick.cash@npcinternational.com>
Subject	RE: vBulletin scraper -- feasible?
Date	2012-06-25 19:44 +0000
References	<jsad2l$2eu$1@dont-email.me>
Newsgroups	comp.lang.python
Message-ID	<mailman.1495.1340654376.4697.python-list@python.org>


You may want to look into http://www.crummy.com/software/BeautifulSoup/
It's made for parsing (potentially bad) HTML, and is quite easy to use. I'd say it's quite feasible.
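
As a rough sketch of what that could look like (this assumes the current `bs4` package, installed as `beautifulsoup4`; the class names `post`, `username`, and `content` and the sample markup are made up for illustration, since real vBulletin templates differ by version and skin, so inspect the actual page source before relying on any selector):

```python
from bs4 import BeautifulSoup

# Hypothetical markup, NOT taken from a real vBulletin install.
# The second post deliberately leaves a <b> tag unclosed.
SAMPLE_HTML = """
<div id="posts">
  <div class="post" id="post_101">
    <a class="username">alice</a>
    <div class="content">First post<br>with a line break</div>
  </div>
  <div class="post" id="post_102">
    <a class="username">bob</a>
    <div class="content">A reply with an <b>unclosed tag
  </div>
</div>
"""


def extract_posts(html):
    """Return a list of dicts, one per post found in the page."""
    # "html.parser" is the stdlib-backed parser, so no extra C library is needed.
    soup = BeautifulSoup(html, "html.parser")
    posts = []
    for node in soup.find_all("div", class_="post"):
        author = node.find("a", class_="username")
        body = node.find("div", class_="content")
        posts.append({
            "id": node.get("id"),
            "author": author.get_text(strip=True) if author else None,
            # Join text fragments with a space so <br> does not glue words together.
            "body": body.get_text(" ", strip=True) if body else "",
        })
    return posts


if __name__ == "__main__":
    for post in extract_posts(SAMPLE_HTML):
        print(post)
```

The `if author else None` guards are there because a scraper tends to meet pages where an element is missing (deleted users, guest posts), and a missing field is easier to handle downstream than an exception halfway through a thread.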

Thanks,
Nick Cash
NPC International

-----Original Message-----
From: python-list-bounces+nick.cash=npcinternational.com@python.org [mailto:python-list-bounces+nick.cash=npcinternational.com@python.org] On Behalf Of Andrew D'Angelo
Sent: Monday, June 25, 2012 14:10
To: python-list@python.org
Subject: vBulletin scraper -- feasible?

Taking a look through vBulletin's HTML, I was wondering whether it would be overly difficult to parse it into nice, manipulable data.
I'd suppose my ultimate goal would be to dynamically parse a vBulletin forum and feed it into a locally hosted NNTP server.
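
The NNTP half of that goal mostly comes down to turning each scraped post into a news article with the right headers. Here is a minimal sketch using only the standard library; the dict layout (`id`, `author`, `subject`, `posted_at`, `body`, `parent_id`), the function name `post_to_article`, the newsgroup name, and the `forum.example.invalid` host are all assumptions made for illustration, not anything vBulletin or an NNTP server prescribes:

```python
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime


def post_to_article(post, newsgroup, forum_host):
    """Build a news-style article from one scraped post (hypothetical dict layout)."""
    msg = EmailMessage()
    # Placeholder address: forums normally do not expose real e-mail addresses.
    msg["From"] = f"{post['author']} <{post['author']}@{forum_host}>"
    msg["Newsgroups"] = newsgroup
    msg["Subject"] = post["subject"]
    msg["Date"] = format_datetime(post["posted_at"])
    # Derive a stable Message-ID from the forum's post id so re-scrapes do not duplicate.
    msg["Message-ID"] = f"<{post['id']}@{forum_host}>"
    if post.get("parent_id"):
        # References lets newsreaders thread the reply under its parent.
        msg["References"] = f"<{post['parent_id']}@{forum_host}>"
    msg.set_content(post["body"])
    return msg


if __name__ == "__main__":
    reply = {
        "id": "post_102",
        "parent_id": "post_101",
        "author": "bob",
        "subject": "Re: Hello",
        "posted_at": datetime(2012, 6, 25, 19, 10, tzinfo=timezone.utc),
        "body": "A reply",
    }
    print(post_to_article(reply, "local.vbulletin.test", "forum.example.invalid"))
```

Actually handing the article to the local server is left out on purpose: how that works depends on the server and client library you pick, and a vBulletin thread is usually flat, so choosing the `parent_id` for each post (for example, always the first post of the thread) is a design decision of its own.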


--
http://mail.python.org/mailman/listinfo/python-list

Thread

vBulletin scraper -- feasible? "Andrew D'Angelo" <excel@pharcyde.org> - 2012-06-25 14:10 -0500
  RE: vBulletin scraper -- feasible? Nick Cash <nick.cash@npcinternational.com> - 2012-06-25 19:44 +0000
    Re: vBulletin scraper -- feasible? "Andrew D'Angelo" <excel@pharcyde.org> - 2012-06-26 10:20 -0500
