| From | Peter Flynn <peter@silmaril.ie> |
|---|---|
| Newsgroups | comp.text.xml |
| Subject | Re: Parsing XML pages |
| Date | 2016-08-01 13:30 +0100 |
| Organization | Silmaril Consultants |
| Message-ID | <e08tmuF72r5U1@mid.individual.net> |
| References | <dab6c93a-17f7-4edb-bad5-213db52ee0fd@googlegroups.com> |
On 24/07/16 03:59, paolopiace@gmail.com wrote:
> This url
>
> http://finance.yahoo.com/quote/GE/history?period1=0&period2=1469170800&interval=div|split&filter=split&frequency=1d
>
> outputs a page which at its bottom has this content:
>
> https://1drv.ms/i/s!AhvJcZiY8TTdhWx_35S5R2hZ99BX
>
> I save the source page html and search some strings in it.
> I search "3/1", "Stock Split", "May 16, 1994" and so on.
>
> Well, nothing like this is in the source page!

Unsurprising, given that it's financial.

> Where the hell are those info?

They are being inserted in real time from an external source, probably
via Javascript.

> If I see them on the browser, they must be stored somewhere.

Or they might be being calculated on-the-fly, from *data* stored elsewhere.

> If not in the html source page, where are they?

They have been deliberately obfuscated so that you can't steal them.

> May I have some directions, please?

Use a browser which has an Inspection mode. Right-click one of the
values and look at the pseudo-HTML:

<td class="Ta(c) Py(10px)" colspan="5" data-reactid=".1kvth1ckyua.1.$0.0.0.3.1.$main-0-Quote-Proxy.$main-0-Quote.0.2.0.2.$history-table.1.$0.1"><strong data-reactid=".1kvth1ckyua.1.$0.0.0.3.1.$main-0-Quote-Proxy.$main-0-Quote.0.2.0.2.$history-table.1.$0.1.0">3/1</strong><span data-reactid=".1kvth1ckyua.1.$0.0.0.3.1.$main-0-Quote-Proxy.$main-0-Quote.0.2.0.2.$history-table.1.$0.1.1"> </span><span data-reactid=".1kvth1ckyua.1.$0.0.0.3.1.$main-0-Quote-Proxy.$main-0-Quote.0.2.0.2.$history-table.1.$0.1.2">Stock Split</span></td>

etc. Now go find the code which recognises this, unscramble it, and find
out what machine it's coming from. Then break into the machine to get at
the source data (just kidding, NSA :-)

Good luck...

///Peter
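To illustrate the point about the rendered pseudo-HTML: once a fragment like the one quoted above has been copied out of the browser's inspector, the visible values can be read back out of it with an ordinary HTML parser, even though they never appear in the saved page source. The following is only a minimal sketch using Python's standard `html.parser` module. The names `RENDERED_FRAGMENT`, `CellText` and `cell_texts` are made up for this example, and the long `data-reactid` attributes have been left out of the fragment for readability.

```python
from html.parser import HTMLParser

# Fragment as seen in the inspector, with the data-reactid attributes
# omitted for brevity (an assumption of this sketch, not the exact markup).
RENDERED_FRAGMENT = (
    '<td class="Ta(c) Py(10px)" colspan="5">'
    '<strong>3/1</strong><span> </span><span>Stock Split</span>'
    '</td>'
)


class CellText(HTMLParser):
    """Collect the concatenated text of each outermost <td> cell."""

    def __init__(self):
        super().__init__()
        self.cells = []      # finished cell texts, in document order
        self._depth = 0      # nesting depth of open <td> elements
        self._parts = []     # text pieces of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._depth += 1
            if self._depth == 1:
                self._parts = []

    def handle_endtag(self, tag):
        if tag == "td" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self.cells.append("".join(self._parts).strip())

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._depth > 0:
            self._parts.append(data)


def cell_texts(html):
    """Return the text of every outermost <td> in an HTML fragment."""
    parser = CellText()
    parser.feed(html)
    parser.close()
    return parser.cells


if __name__ == "__main__":
    print(cell_texts(RENDERED_FRAGMENT))  # ['3/1 Stock Split']
```

This only works on markup taken from the live DOM after the scripts have run. Feeding it the saved page source would find nothing, which is exactly the behaviour described in the original question.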
Parsing XML pages paolopiace@gmail.com - 2016-07-23 19:59 -0700
Re: Parsing XML pages Peter Flynn <peter@silmaril.ie> - 2016-08-01 13:30 +0100
Re: Parsing XML pages Luuk <luuk@invalid.lan> - 2016-08-06 11:43 +0200