| Started by | Chris Rebert <clp2@rebertia.com> |
|---|---|
| First post | 2011-04-12 11:30 -0700 |
| Last post | 2011-04-12 11:30 -0700 |
| Articles | 1 — 1 participant |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: download web pages that are updated by ajax Chris Rebert <clp2@rebertia.com> - 2011-04-12 11:30 -0700
| From | Chris Rebert <clp2@rebertia.com> |
|---|---|
| Date | 2011-04-12 11:30 -0700 |
| Subject | Re: download web pages that are updated by ajax |
| Message-ID | <mailman.273.1302633041.9059.python-list@python.org> |
On Tue, Apr 12, 2011 at 7:47 AM, Jabba Laci <jabba.laci@gmail.com> wrote:
> Hi,
>
> I want to download a web page that is updated by AJAX. The page
> requires no human interaction, it is updated automatically:
> http://www.ncbi.nlm.nih.gov/nuccore/CP002059.1
>
> If I download it with wget, I get a file of size 97 KB. The source is
> full of AJAX calls, i.e. the content of the page is not expanded.
> If I open it in a browser and save it manually, the result is a file
> of almost 5 MB whose content is expanded.
>
> (1) How to download such a page with Python? I need the post-AJAX
> version of the page.

I've heard you can drive a web browser using Selenium
(http://code.google.com/p/selenium/ ), have it visit the webpage and
run the JavaScript on it, and then grab the final result.

Cheers,
Chris
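
For what it's worth, the Selenium suggestion could be sketched roughly as follows. This is only a minimal sketch, not something from the thread: it assumes the Selenium Python bindings, Firefox, and a matching driver are installed, and the helper names (output_filename, fetch_rendered_html, save_rendered_page) and the 30-second pause are made up for illustration. A fixed pause is a crude stand-in for "the AJAX calls have finished" and may need tuning for a page this large.

```python
import time
from urllib.parse import urlsplit


def output_filename(url):
    """Derive a local file name from the last path segment of the URL."""
    last_segment = urlsplit(url).path.rstrip("/").split("/")[-1]
    return (last_segment or "index") + ".html"


def fetch_rendered_html(url, wait_seconds=30):
    """Load url in a real browser and return the page source after scripts ran."""
    # Imported here so output_filename() works even without Selenium installed.
    from selenium import webdriver

    # Assumption: Firefox and its driver are available on this machine.
    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Assumption: a fixed pause is long enough for the AJAX updates.
        time.sleep(wait_seconds)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page failed.
        driver.quit()


def save_rendered_page(url, wait_seconds=30):
    """Fetch the post-AJAX page and write it next to the script."""
    html = fetch_rendered_html(url, wait_seconds)
    path = output_filename(url)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(html)
    return path
```

Calling save_rendered_page("http://www.ncbi.nlm.nih.gov/nuccore/CP002059.1") would then open a browser window, wait, and write the expanded page to CP002059.1.html. Whether the saved file matches the roughly 5 MB manual save depends on how long the page takes to finish loading.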