Groups > comp.lang.python > #64226 > unrolled thread
| Started by | Jaiprakash Singh <jaiprakash@wisepromo.com> |
|---|---|
| First post | 2014-01-18 03:54 -0800 |
| Last post | 2014-01-19 07:58 +0000 |
| Articles | 6 — 4 participants |
python to enable javascript , tried selinium, ghost, pyQt4 already Jaiprakash Singh <jaiprakash@wisepromo.com> - 2014-01-18 03:54 -0800
Re: python to enable javascript , tried selinium, ghost, pyQt4 already Denis McMahon <denismfmcmahon@gmail.com> - 2014-01-18 18:05 +0000
Re: python to enable javascript , tried selinium, ghost, pyQt4 already Chris Angelico <rosuav@gmail.com> - 2014-01-19 05:13 +1100
Re: python to enable javascript , tried selinium, ghost, pyQt4 already Denis McMahon <denismfmcmahon@gmail.com> - 2014-01-18 21:40 +0000
Re: python to enable javascript , tried selinium, ghost, pyQt4 already Chris Angelico <rosuav@gmail.com> - 2014-01-19 09:32 +1100
Re: python to enable javascript , tried selinium, ghost, pyQt4 already Giorgos Tzampanakis <giorgos.tzampanakis@gmail.com> - 2014-01-19 07:58 +0000
| From | Jaiprakash Singh <jaiprakash@wisepromo.com> |
|---|---|
| Date | 2014-01-18 03:54 -0800 |
| Subject | python to enable javascript , tried selinium, ghost, pyQt4 already |
| Message-ID | <91184b5c-aa05-42e1-81de-15252023a15b@googlegroups.com> |
hi,
can you please suggest me some method for study so that i can scrap a site having JavaScript behind it
i have tried selenium, ghost, pyQt4, but it is slow and as a am working with thread it sinks my ram memory very fast.
| From | Denis McMahon <denismfmcmahon@gmail.com> |
|---|---|
| Date | 2014-01-18 18:05 +0000 |
| Message-ID | <lbefpo$gmg$1@dont-email.me> |
| In reply to | #64226 |
On Sat, 18 Jan 2014 03:54:17 -0800, Jaiprakash Singh wrote:

> can you please suggest me some method for study so that i can
> scrap a site having JavaScript behind it

Please expand upon the requirement, are you trying to:

a) replace server side javascript with server side python, or
b) replace client side javascript with server side python, or
c) replace client side javascript with client side python, or
d) something else?

(c) is not possible (you can't guarantee that all clients will have python, or that there will be a mechanism for calling it from your webpages), (b) doesn't make a lot of sense (you'll be trading cpu in the client for cpu in the server + network bandwidth and latency).

--
Denis McMahon, denismfmcmahon@gmail.com
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-01-19 05:13 +1100 |
| Subject | Re: python to enable javascript , tried selinium, ghost, pyQt4 already |
| Message-ID | <mailman.5681.1390068840.18130.python-list@python.org> |
| In reply to | #64226 |
On Sat, Jan 18, 2014 at 10:54 PM, Jaiprakash Singh <jaiprakash@wisepromo.com> wrote:

> hi,
>
> can you please suggest me some method for study so that i can scrap a site having JavaScript behind it
>
>
> i have tried selenium, ghost, pyQt4, but it is slow and as a am working with thread it sinks my ram memory very fast.

Do you mean "scrape"? You're trying to retrieve the displayed contents of a web page that uses JavaScript? If so, that's basically impossible without actually executing the JS code, which means largely replicating the web browser.

ChrisA
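
The point about needing to execute the JS can be illustrated with a minimal sketch that uses only the standard library. The page below is made up for illustration (it is an assumption, not a site from this thread): its visible text is filled in by a script at load time, so a client that only parses the HTML sees an empty placeholder. The names `PAGE`, `VisibleText`, and `visible_text` are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page: the price is written into the <div> by JavaScript at load time.
PAGE = """
<html><body>
<div id="price"></div>
<script>
document.getElementById("price").textContent = "42 USD";
</script>
</body></html>
"""


class VisibleText(HTMLParser):
    """Collect text outside <script>/<style>, i.e. what a non-JS client can read."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Script source is delivered here too, so it is skipped explicitly.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the text a parser sees without running any JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


if __name__ == "__main__":
    # The string "42 USD" exists only inside the script source,
    # so it never shows up in the extracted text.
    print(repr(visible_text(PAGE)))
```

The value only appears in the document after a JavaScript engine has run the script, which is why tools such as Selenium drive a real browser.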
| From | Denis McMahon <denismfmcmahon@gmail.com> |
|---|---|
| Date | 2014-01-18 21:40 +0000 |
| Subject | Re: python to enable javascript , tried selinium, ghost, pyQt4 already |
| Message-ID | <lbescm$to6$3@dont-email.me> |
| In reply to | #64242 |
On Sun, 19 Jan 2014 05:13:57 +1100, Chris Angelico wrote:

> On Sat, Jan 18, 2014 at 10:54 PM, Jaiprakash Singh
> <jaiprakash@wisepromo.com> wrote:
>> hi,
>>
>> can you please suggest me some method for study so that i can
>> scrap a site having JavaScript behind it
>>
>>
>> i have tried selenium, ghost, pyQt4, but it is slow and as a am
>> working with thread it sinks my ram memory very fast.
>
> Do you mean "scrape"? You're trying to retrieve the displayed contents
> of a web page that uses JavaScript? If so, that's basically impossible
> without actually executing the JS code, which means largely replicating
> the web browser.

Oh, you think he meant scrape? I thought he was trying to scrap (as in throw away / replace) an old javascript heavy website with something using python instead.

--
Denis McMahon, denismfmcmahon@gmail.com
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-01-19 09:32 +1100 |
| Subject | Re: python to enable javascript , tried selinium, ghost, pyQt4 already |
| Message-ID | <mailman.5692.1390084364.18130.python-list@python.org> |
| In reply to | #64260 |
On Sun, Jan 19, 2014 at 8:40 AM, Denis McMahon <denismfmcmahon@gmail.com> wrote:

> On Sun, 19 Jan 2014 05:13:57 +1100, Chris Angelico wrote:
>
>> On Sat, Jan 18, 2014 at 10:54 PM, Jaiprakash Singh
>> <jaiprakash@wisepromo.com> wrote:
>>> hi,
>>>
>>> can you please suggest me some method for study so that i can
>>> scrap a site having JavaScript behind it
>>>
>>>
>>> i have tried selenium, ghost, pyQt4, but it is slow and as a am
>>> working with thread it sinks my ram memory very fast.
>>
>> Do you mean "scrape"? You're trying to retrieve the displayed contents
>> of a web page that uses JavaScript? If so, that's basically impossible
>> without actually executing the JS code, which means largely replicating
>> the web browser.
>
> Oh, you think he meant scrape? I thought he was trying to scrap (as in
> throw away / replace) an old javascript heavy website with something
> using python instead.

I thought so too at first, but since we had another recent case of someone confusing the two words, and since "scrape" would make sense in this context, I figured it'd be worth asking the question.

ChrisA
| From | Giorgos Tzampanakis <giorgos.tzampanakis@gmail.com> |
|---|---|
| Date | 2014-01-19 07:58 +0000 |
| Message-ID | <slrnldn1dm.7d0.giorgos.tzampanakis@brilliance.eternal-september.org> |
| In reply to | #64226 |
On 2014-01-18, Jaiprakash Singh wrote:

> hi,
>
> can you please suggest me some method for study so that i can
> scrap a site having JavaScript behind it
>
>
> i have tried selenium, ghost, pyQt4, but it is slow and as a am
> working with thread it sinks my ram memory very fast.

I have tried selenium in the past and I remember it working reasonably well. I am afraid you can't get around the slowness since you have to have a web browser running.

--
Improve at backgammon rapidly through addictive quickfire position quizzes: http://www.bgtrain.com/
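
If a browser has to be running, one way to keep threads from exhausting RAM may be to cap the number of browsers and reuse them, rather than starting one per page. The sketch below shows only that pattern, under stated assumptions: `FakeBrowser` is a hypothetical stand-in and not Selenium's API, and `get_browser`, `scrape`, and `scrape_all` are made-up names. It bounds memory use; it does not make each page load any faster.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeBrowser:
    """Stand-in for a real browser driver (hypothetical; not Selenium's API)."""

    created = 0                 # how many instances have been started so far
    _lock = threading.Lock()

    def __init__(self):
        with FakeBrowser._lock:
            FakeBrowser.created += 1

    def render(self, url):
        # A real driver would load the page and run its JavaScript here.
        return "rendered:" + url


_local = threading.local()


def get_browser():
    """Return this thread's browser, creating it on first use only."""
    browser = getattr(_local, "browser", None)
    if browser is None:
        browser = _local.browser = FakeBrowser()
    return browser


def scrape(url):
    return get_browser().render(url)


def scrape_all(urls, max_workers=2):
    # The pool size caps how many worker threads, and so browsers, can exist.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(scrape, urls))


if __name__ == "__main__":
    urls = ["http://example.com/page%d" % i for i in range(10)]
    pages = scrape_all(urls)
    print(len(pages), "pages,", FakeBrowser.created, "browsers")
```

With a real driver, each per-thread browser would also need to be shut down when the pool finishes; that cleanup is left out of this sketch.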