Re: JavaScript and Screenscraping

From Tom Anderson <twic@urchin.earth.li>
Newsgroups comp.lang.java.programmer
Subject Re: JavaScript and Screenscraping
Date 2011-03-31 00:28 +0100
Organization Stack Usenet News Service
Message-ID <alpine.DEB.2.00.1103310028400.9606@urchin.earth.li>
References <rvc6p6toumdlevjb48ohjnlf1gur128eqe@4ax.com> <imvekb$s6s$1@news.onet.pl>

On Wed, 30 Mar 2011, Michal Kleczek wrote:

> Roedy Green wrote:
>
>> I am working on a screenscraping project that is turning out to be much
>> more time-consuming than I thought it would be. I am trying to gather
>> a database of information about all the motherboards sold by major
>> manufacturers.  The idea is to eventually create a comparison shopper
>> to help you narrow down models that fit your needs.
>>
>> Oddly, motherboard manufacturers don't use a database to generate
>> their specification pages. These are all hand-compiled, with a theme and
>> a dozen variations on every field. This I can handle.
>>
>> However, Asus decided to obfuscate their web pages with JavaScript.
>> There are no data on them.
>>
>> I wondered if there exists a tool that is like a browser in that it will
>> read a page and render the JavaScript, but unlike a browser, it would
>> not show the information on the screen, just dump the generated HTML
>> or raw text and accept a script of pages to analyse.
>
> http://htmlunit.sourceforge.net/

Finally, someone else who knows about it!
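
For anyone who has not tried it, a minimal sketch of the kind of run Roedy
describes might look like the following. It assumes an HtmlUnit 2.x jar on
the classpath (3.x moved the classes to org.htmlunit). The class name
RenderedPageDump, the helper names, and the 5-second wait are illustrative
choices, not anything taken from this thread.

```java
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;

// Hypothetical example class: loads each URL given on the command line in a
// headless HtmlUnit "browser", lets the page's JavaScript run, and prints the
// resulting DOM as markup instead of drawing anything on screen.
public class RenderedPageDump {

    // Builds a client tuned for scraping. These option choices are assumptions
    // for the sketch: real-world pages often contain script errors, and CSS
    // is not needed when only the generated markup matters.
    static WebClient newClient() {
        WebClient client = new WebClient();
        client.getOptions().setJavaScriptEnabled(true);
        client.getOptions().setThrowExceptionOnScriptError(false);
        client.getOptions().setCssEnabled(false);
        return client;
    }

    // Fetches one page, gives background scripts (timers, XHR) up to five
    // seconds to finish, and returns the DOM as it stands afterwards.
    static String renderedHtml(WebClient client, String url) throws IOException {
        HtmlPage page = client.getPage(url);
        client.waitForBackgroundJavaScript(5_000);
        return page.asXml();
    }

    public static void main(String[] args) throws IOException {
        // The "script of pages to analyse" is simply the argument list here.
        try (WebClient client = newClient()) {
            for (String url : args) {
                System.out.println("==== " + url);
                System.out.println(renderedHtml(client, url));
            }
        }
    }
}
```

Run it as "java RenderedPageDump url1 url2 ...". If raw text is wanted rather
than generated HTML, the page object can return that instead of asXml()
(asText() in older 2.x releases, asNormalizedText() in newer ones).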

tom

-- 
For the first few years I ate lunch with the mathematicians. I soon found
that they were more interested in fun and games than in serious work,
so I shifted to eating with the physics table. There I stayed for a
number of years until the Nobel Prize, promotions, and offers from
other companies, removed most of the interesting people. So I shifted
to the corresponding chemistry table where I had a friend. At first I
asked what were the important problems in chemistry, then what important
problems they were working on, or problems that might lead to important
results. One day I asked, "if what they were working on was not important,
and was not likely to lead to important things, then why were they working
on them?" After that I had to eat with the engineers! -- R. W. Hamming
