| References | <CAAXuHoeJ-YMQDB85qLDJ_o+9CrsfwLvm9wuOaRtbSj-i9kBaFA@mail.gmail.com> <CAPTjJmqBDu0nB2u_mf7KMpFzxxJFUqB7o-7dJiGgu-xyOL2uzg@mail.gmail.com> <CAAXuHod4NH74+VK1EY78-vCOyiOA3879qp1uPqKOFit7qVE5sQ@mail.gmail.com> |
|---|---|
| Date | 2015-05-23 10:29 -0400 |
| Subject | Re: Extract email address from Java script in html source using python |
| From | Joel Goldstick <joel.goldstick@gmail.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.277.1432391369.17265.python-list@python.org> |
On Sat, May 23, 2015 at 10:15 AM, savitha devi <savithad8@gmail.com> wrote:
> What I want exactly: the JavaScript is embedded in the HTML code, and I am
> looking for a regular expression that finds the email address embedded
> within that JavaScript.
>
> On Sat, May 23, 2015 at 2:31 PM, Chris Angelico <rosuav@gmail.com> wrote:
>>
>> On Sat, May 23, 2015 at 4:46 PM, savitha devi <savithad8@gmail.com> wrote:
>> > I am developing a web scraper using HTMLParser. I need to extract a
>> > text/email address from JavaScript within the HTML code. I am a beginner
>> > in Python and totally lost here, so I need some help with this. The
>> > JavaScript code is as below:
>> >
>> > <script type='text/javascript'>
>> > //<!--
>> > document.getElementById('cloak48218').innerHTML = '';
>> > var prefix = 'ma' + 'il' + 'to';
>> > var path = 'hr' + 'ef' + '=';
>> > var addy48218 = 'info' + '@';
>> > addy48218 = addy48218 + 'tsv-neuried' + '.' +
>> > 'de';
>> > document.getElementById('cloak48218').innerHTML += '<a ' + path + '\''
>> > +
>> > prefix + ':' + addy48218 + '\'>' + addy48218+'<\/a>';
>> > //-->
>>
>> This is deliberately being done to prevent scripted usage. What
>> exactly are you needing to do this for?
>>
>> You're basically going to have to execute the entire block of
>> JavaScript code, and then decode the entities to get to what you want.
>> Doing it manually is pretty easy; doing it automatically will
>> virtually require a language interpreter.
>>
>> ChrisA
>> --
>> https://mail.python.org/mailman/listinfo/python-list
>
I haven't used it, but wouldn't Selenium help here? As I understand it,
Selenium drives a real browser, so you can read the resulting HTML of a
page after its JavaScript has run.
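
For this particular cloaking pattern, a full browser may be more than is
needed. Below is a minimal sketch, assuming the script only concatenates
single-quoted string literals and previously assigned variables, as the
posted snippet does. The name extract_cloaked_emails and the three regexes
are my own illustration, not part of any library, and this is not a
JavaScript interpreter: anything beyond plain `name = 'a' + 'b';`
assignments is ignored.

```python
import re

# The script body from the original post, with the quoting removed.
SCRIPT = r"""
document.getElementById('cloak48218').innerHTML = '';
var prefix = 'ma' + 'il' + 'to';
var path = 'hr' + 'ef' + '=';
var addy48218 = 'info' + '@';
addy48218 = addy48218 + 'tsv-neuried' + '.' +
'de';
document.getElementById('cloak48218').innerHTML += '<a ' + path + '\''
+
prefix + ':' + addy48218 + '\'>' + addy48218+'<\/a>';
"""

# `name = expr;` where name is a bare identifier (not a property like x.innerHTML).
ASSIGN_RE = re.compile(r"(?<![\w$.])([A-Za-z_$][\w$]*)\s*=(?!=)\s*([^;]*);")
# One term of the expression: a single-quoted literal or a variable name.
TERM_RE = re.compile(r"'((?:[^'\\]|\\.)*)'|([A-Za-z_$][\w$]*)")
# Deliberately loose email pattern; tighten it for real use.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_cloaked_emails(script):
    """Replay simple string-concatenation assignments and return any emails."""
    variables = {}
    for assignment in ASSIGN_RE.finditer(script):
        name, expr = assignment.group(1), assignment.group(2)
        parts = []
        for term in TERM_RE.finditer(expr):
            if term.group(1) is not None:
                # String literal: drop the backslash from escapes such as \' or \/.
                parts.append(re.sub(r"\\(.)", r"\1", term.group(1)))
            else:
                # Variable reference: use its current value, or "" if unknown.
                parts.append(variables.get(term.group(2), ""))
        variables[name] = "".join(parts)

    emails = []
    for value in variables.values():
        for email in EMAIL_RE.findall(value):
            if email not in emails:
                emails.append(email)
    return emails


print(extract_cloaked_emails(SCRIPT))  # ['info@tsv-neuried.de']
```

If the site changes how it builds the address (function calls, character
codes, and so on), this breaks, and running the page in a browser via
something like Selenium becomes the more reliable route.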
--
Joel Goldstick
http://joelgoldstick.com