

Groups > comp.lang.python > #64226 > unrolled thread

python to enable javascript , tried selinium, ghost, pyQt4 already

Started by: Jaiprakash Singh <jaiprakash@wisepromo.com>
First post: 2014-01-18 03:54 -0800
Last post: 2014-01-19 07:58 +0000
Articles: 6 (4 participants)



Contents

  python to enable javascript , tried selinium, ghost, pyQt4  already Jaiprakash Singh <jaiprakash@wisepromo.com> - 2014-01-18 03:54 -0800
    Re: python to enable javascript , tried selinium, ghost, pyQt4  already Denis McMahon <denismfmcmahon@gmail.com> - 2014-01-18 18:05 +0000
    Re: python to enable javascript , tried selinium, ghost, pyQt4 already Chris Angelico <rosuav@gmail.com> - 2014-01-19 05:13 +1100
      Re: python to enable javascript , tried selinium, ghost, pyQt4 already Denis McMahon <denismfmcmahon@gmail.com> - 2014-01-18 21:40 +0000
        Re: python to enable javascript , tried selinium, ghost, pyQt4 already Chris Angelico <rosuav@gmail.com> - 2014-01-19 09:32 +1100
    Re: python to enable javascript , tried selinium, ghost, pyQt4  already Giorgos Tzampanakis <giorgos.tzampanakis@gmail.com> - 2014-01-19 07:58 +0000

#64226 — python to enable javascript , tried selinium, ghost, pyQt4 already

From: Jaiprakash Singh <jaiprakash@wisepromo.com>
Date: 2014-01-18 03:54 -0800
Subject: python to enable javascript , tried selinium, ghost, pyQt4 already
Message-ID: <91184b5c-aa05-42e1-81de-15252023a15b@googlegroups.com>
hi,

Can you please suggest an approach I could study so that I can scrap a site that has JavaScript behind it?

I have tried Selenium, Ghost, and PyQt4, but they are slow, and because I am working with threads they use up my RAM very fast.



#64240

From: Denis McMahon <denismfmcmahon@gmail.com>
Date: 2014-01-18 18:05 +0000
Message-ID: <lbefpo$gmg$1@dont-email.me>
In reply to: #64226
On Sat, 18 Jan 2014 03:54:17 -0800, Jaiprakash Singh wrote:

> can you please suggest me some method for  study so that i can
> scrap a site having JavaScript behind it

Please expand upon the requirement, are you trying to:

a) replace server side javascript with server side python, or
b) replace client side javascript with server side python, or
c) replace client side javascript with client side python, or
d) something else?

(c) is not possible (you can't guarantee that all clients will have 
python, or that there will be a mechanism for calling it from your 
webpages), (b) doesn't make a lot of sense (you'll be trading cpu in the 
client for cpu in the server + network bandwidth and latency).

-- 
Denis McMahon, denismfmcmahon@gmail.com



#64242 — Re: python to enable javascript , tried selinium, ghost, pyQt4 already

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-01-19 05:13 +1100
Subject: Re: python to enable javascript , tried selinium, ghost, pyQt4 already
Message-ID: <mailman.5681.1390068840.18130.python-list@python.org>
In reply to: #64226
On Sat, Jan 18, 2014 at 10:54 PM, Jaiprakash Singh
<jaiprakash@wisepromo.com> wrote:
> hi,
>
>      can you please suggest me some method for  study so that i can scrap a site having JavaScript behind it
>
>
>  i have tried selenium, ghost, pyQt4,  but it is slow and as a am working with thread it sinks my ram memory very fast.

Do you mean "scrape"? You're trying to retrieve the displayed contents
of a web page that uses JavaScript? If so, that's basically impossible
without actually executing the JS code, which means largely
replicating the web browser.
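
To illustrate why the JS has to be executed, here is a minimal sketch using only the standard library. The page markup, the "price" element, and the helper names (VisibleText, static_text) are made up for the example and are not taken from any real site. It shows that a client which only parses the delivered HTML never sees a value that a script would write into the page later.

```python
from html.parser import HTMLParser

# Hypothetical page: the price is not in the markup; a script fills it in
# after the browser has loaded the document.
PAGE = """
<html><body>
  <h1>Widget</h1>
  <div id="price"></div>
  <script>
    document.getElementById("price").textContent = "9.99";
  </script>
</body></html>
"""


class VisibleText(HTMLParser):
    """Collect text outside <script>/<style>, as a client without JS sees it."""

    def __init__(self):
        super().__init__()
        self._skip = 0      # nesting depth inside script/style elements
        self.chunks = []    # visible text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip:
            self.chunks.append(text)


def static_text(html):
    """Return the visible text of html without running any JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chunks


print(static_text(PAGE))   # the heading is found, the price is not
```

The only ways to get the missing value are to run the script in something browser-like, or to work out where the script gets its data and request that source directly.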

ChrisA



#64260 — Re: python to enable javascript , tried selinium, ghost, pyQt4 already

From: Denis McMahon <denismfmcmahon@gmail.com>
Date: 2014-01-18 21:40 +0000
Subject: Re: python to enable javascript , tried selinium, ghost, pyQt4 already
Message-ID: <lbescm$to6$3@dont-email.me>
In reply to: #64242
On Sun, 19 Jan 2014 05:13:57 +1100, Chris Angelico wrote:

> On Sat, Jan 18, 2014 at 10:54 PM, Jaiprakash Singh
> <jaiprakash@wisepromo.com> wrote:
>> hi,
>>
>>      can you please suggest me some method for  study so that i can
>>      scrap a site having JavaScript behind it
>>
>>
>>  i have tried selenium, ghost, pyQt4,  but it is slow and as a am
>>  working with thread it sinks my ram memory very fast.
> 
> Do you mean "scrape"? You're trying to retrieve the displayed contents
> of a web page that uses JavaScript? If so, that's basically impossible
> without actually executing the JS code, which means largely replicating
> the web browser.

Oh, you think he meant scrape? I thought he was trying to scrap (as in 
throw away / replace) an old javascript heavy website with something 
using python instead.

-- 
Denis McMahon, denismfmcmahon@gmail.com



#64265 — Re: python to enable javascript , tried selinium, ghost, pyQt4 already

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-01-19 09:32 +1100
Subject: Re: python to enable javascript , tried selinium, ghost, pyQt4 already
Message-ID: <mailman.5692.1390084364.18130.python-list@python.org>
In reply to: #64260
On Sun, Jan 19, 2014 at 8:40 AM, Denis McMahon <denismfmcmahon@gmail.com> wrote:
> On Sun, 19 Jan 2014 05:13:57 +1100, Chris Angelico wrote:
>
>> On Sat, Jan 18, 2014 at 10:54 PM, Jaiprakash Singh
>> <jaiprakash@wisepromo.com> wrote:
>>> hi,
>>>
>>>      can you please suggest me some method for  study so that i can
>>>      scrap a site having JavaScript behind it
>>>
>>>
>>>  i have tried selenium, ghost, pyQt4,  but it is slow and as a am
>>>  working with thread it sinks my ram memory very fast.
>>
>> Do you mean "scrape"? You're trying to retrieve the displayed contents
>> of a web page that uses JavaScript? If so, that's basically impossible
>> without actually executing the JS code, which means largely replicating
>> the web browser.
>
> Oh, you think he meant scrape? I thought he was trying to scrap (as in
> throw away / replace) an old javascript heavy website with something
> using python instead.

I thought so too at first, but since we had another recent case of
someone confusing the two words, and since "scrape" would make sense
in this context, I figured it'd be worth asking the question.

ChrisA



#64286

From: Giorgos Tzampanakis <giorgos.tzampanakis@gmail.com>
Date: 2014-01-19 07:58 +0000
Message-ID: <slrnldn1dm.7d0.giorgos.tzampanakis@brilliance.eternal-september.org>
In reply to: #64226
On 2014-01-18, Jaiprakash Singh wrote:

> hi,
>
>      can you please suggest me some method for  study so that i can
>      scrap a site having JavaScript behind it 
>
>
>  i have tried selenium, ghost, pyQt4,  but it is slow and as a am
>  working with thread it sinks my ram memory very fast.

I have tried selenium in the past and I remember it working reasonably
well. I am afraid you can't get around the slowness since you have to have
a web browser running.
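
The slowness may be unavoidable, but the RAM problem with threads can perhaps be eased by not starting one browser per thread. The following is only a sketch under assumptions: DriverPool, FakeDriver, and scrape_all are names invented for this example, and FakeDriver is a stand-in so the code runs without a browser. It keeps a small fixed number of browser-like objects and lets many worker threads take turns with them.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


class DriverPool:
    """Hold a fixed number of browser-like objects and lend them to threads."""

    def __init__(self, factory, size):
        self._all = [factory() for _ in range(size)]
        self._idle = queue.Queue()
        for driver in self._all:
            self._idle.put(driver)

    @contextmanager
    def borrow(self):
        driver = self._idle.get()       # blocks while every driver is busy
        try:
            yield driver
        finally:
            self._idle.put(driver)      # hand it back even if the fetch failed

    def close(self):
        for driver in self._all:
            driver.quit()


class FakeDriver:
    """Stand-in exposing only the members used below: get, page_source, quit."""

    created = 0                         # counts how many "browsers" were started

    def __init__(self):
        FakeDriver.created += 1
        self.page_source = ""

    def get(self, url):
        self.page_source = "<html>rendered %s</html>" % url

    def quit(self):
        pass


def scrape_all(urls, factory, browsers=2, threads=8):
    """Fetch every URL with many threads sharing only `browsers` drivers."""
    pool = DriverPool(factory, browsers)

    def fetch(url):
        with pool.borrow() as driver:
            driver.get(url)
            return driver.page_source

    try:
        with ThreadPoolExecutor(max_workers=threads) as executor:
            return list(executor.map(fetch, urls))
    finally:
        pool.close()


urls = ["http://example.invalid/page%d" % i for i in range(10)]
pages = scrape_all(urls, FakeDriver)
print(len(pages), "pages fetched with", FakeDriver.created, "drivers")
```

With Selenium installed, passing something like webdriver.Firefox as the factory should work the same way, since its driver objects also provide get(), page_source, and quit(); that part is an untested assumption here. Memory then scales with the number of browsers rather than the number of threads.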

-- 
Improve at backgammon rapidly through addictive quickfire position quizzes:
http://www.bgtrain.com/


