| From | Roy Smith <roy@panix.com> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: Suitable Python code to scrape specific details from web pages. |
| Date | 2014-08-12 20:30 -0400 |
| Organization | PANIX Public Access Internet and UNIX, NYC |
| Message-ID | <roy-008918.20303912082014@news.panix.com> |
| References | <a8f10c4f-d4a0-48ed-ae92-2a43e9a094c3@googlegroups.com> <e2011de5-10fa-4de1-89fa-4e41882a6646@googlegroups.com> <53eaab7d$0$29979$c3e8da3$5496439d@news.astraweb.com> |
In article <53eaab7d$0$29979$c3e8da3$5496439d@news.astraweb.com>,
Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:

> By studying how other scraping programs work, and studying how your racing
> pages store data, you should be able to put the two together and see how to
> get the data you want.

It's also worth mentioning that some web sites *want* you to have their data, and make it easy to do so by exposing it via public APIs or other download methods. Wikipedia. Many government web sites. Twitter. Facebook. Reddit.

Whenever you start thinking about web scraping, it's always worth spending a little time investigating if such an API exists. If it does, that's where you want to go. If not, well, there's always Beautiful Soup :-)
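To illustrate the "check for an API first" advice, here is a minimal sketch of asking Wikipedia for a page's introduction through the MediaWiki query API instead of parsing its HTML. The helper names (`build_query_url`, `extract_intro`, `fetch_intro`), the User-Agent string, and the example title are all made up for this illustration; the response shape assumed here is the one returned with `formatversion=2`, so treat it as a starting point rather than a definitive client.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Public MediaWiki API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_query_url(title):
    """Build a query URL asking for the plain-text intro of one page."""
    params = {
        "action": "query",
        "prop": "extracts",
        "exintro": 1,        # only the section before the first heading
        "explaintext": 1,    # plain text instead of HTML
        "titles": title,
        "format": "json",
        "formatversion": 2,  # assumed: 'pages' comes back as a list
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_intro(payload):
    """Pull the extract out of a decoded response, or None if absent."""
    pages = payload.get("query", {}).get("pages", [])
    if not pages:
        return None
    return pages[0].get("extract")


def fetch_intro(title):
    """Fetch and decode the intro text for a page title (network call)."""
    # Hypothetical User-Agent; identify your own script and contact here.
    request = Request(
        build_query_url(title),
        headers={"User-Agent": "api-before-scraping-example/0.1"},
    )
    with urlopen(request, timeout=10) as response:
        return extract_intro(json.load(response))


if __name__ == "__main__":
    print(fetch_intro("Horse racing"))
```

The point of the sketch is the shape of the work: the site hands back structured JSON, so there is no HTML to pick apart and nothing that breaks when the page layout changes. Other sites' APIs will differ in endpoints, authentication, and rate limits, so check their documentation before relying on them.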
Suitable Python code to scrape specific details from web pages. Simon Evans <musicalhacksaw@yahoo.co.uk> - 2014-08-12 13:00 -0700
Re: Suitable Python code to scrape specific details from web pages. Rob Gaddi <rgaddi@technologyhighland.invalid> - 2014-08-12 13:11 -0700
Re: Suitable Python code to scrape specific details from web pages. Roy Smith <roy@panix.com> - 2014-08-12 17:28 -0400
Re: Suitable Python code to scrape specific details from web pages. alex23 <wuwei23@gmail.com> - 2014-08-18 15:04 +1000
Re: Suitable Python code to scrape specific details from web pages. Simon Evans <musicalhacksaw@yahoo.co.uk> - 2014-08-12 15:44 -0700
Re: Suitable Python code to scrape specific details from web pages. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-08-13 10:04 +1000
Re: Suitable Python code to scrape specific details from web pages. Roy Smith <roy@panix.com> - 2014-08-12 20:30 -0400
Re: Suitable Python code to scrape specific details from web pages. Peter Pearson <ppearson@nowhere.invalid> - 2014-08-13 00:50 +0000
Re: Suitable Python code to scrape specific details from web pages. Denis McMahon <denismfmcmahon@gmail.com> - 2014-08-13 14:53 +0000