Re: JavaScript and Screenscraping

From Michal Kleczek <kleku75@gmail.com>
Newsgroups comp.lang.java.programmer
Subject Re: JavaScript and Screenscraping
Followup-To comp.lang.java.programmer
Date Wed, 30 Mar 2011 16:27:23 +0200
Message-ID <imvekb$s6s$1@news.onet.pl>
References <rvc6p6toumdlevjb48ohjnlf1gur128eqe@4ax.com>

Roedy Green wrote:

> I am working on a screenscraping project that is turning out to be much
> more time-consuming than I thought it would be. I am trying to gather
> a database of information about all the motherboards sold by major
> manufacturers.  The idea is to eventually create a comparison shopper
> to help you narrow down models that fit your needs.
> 
> Oddly, motherboard manufacturers don't use a database to generate
> their specification pages. These are all hand-compiled, with a theme and
> a dozen variations on every field. This I can handle.
> 
> However, Asus decided to obfuscate their web pages with JavaScript.
> There are no data on them.
> 
> I wondered if there exists a tool that is like a browser in that it will
> read a page and run the JavaScript, but unlike a browser, it would
> not show the information on the screen, just dump the generated HTML
> or raw text and accept a script of pages to analyse.
> 

http://htmlunit.sourceforge.net/
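
To make the "script of pages" idea concrete, here is a minimal sketch of a batch driver in plain Java. Everything in it is an assumption for illustration: the class name ScrapeBatch, the PageRenderer hook, the one-URL-per-line script format, and the example.com URLs are all hypothetical. The rendering step is kept behind an interface so the sketch runs without HtmlUnit or a network; the comment in the middle shows roughly how an HtmlUnit-backed renderer might be plugged in, assuming the 3.x package name org.htmlunit (2.x releases used com.gargoylesoftware.htmlunit).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ScrapeBatch {

    /** Hypothetical hook: turns a URL into the HTML left after scripts have run. */
    @FunctionalInterface
    interface PageRenderer {
        String render(String url) throws Exception;
    }

    // With HtmlUnit on the classpath, a renderer could look roughly like this
    // (a sketch only; timeouts and error options would need tuning per site):
    //
    //   PageRenderer htmlUnit = url -> {
    //       try (org.htmlunit.WebClient client = new org.htmlunit.WebClient()) {
    //           org.htmlunit.html.HtmlPage page = client.getPage(url);
    //           client.waitForBackgroundJavaScript(5_000); // let page scripts finish
    //           return page.asXml();                       // DOM after JavaScript ran
    //       }
    //   };

    /** Parses the page script: one URL per line; blank lines and # comments are skipped. */
    static List<String> parseScript(String script) {
        List<String> urls = new ArrayList<>();
        for (String line : script.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                urls.add(trimmed);
            }
        }
        return urls;
    }

    /** Renders each URL in order; a failure is recorded and the batch keeps going. */
    static Map<String, String> renderAll(List<String> urls, PageRenderer renderer) {
        Map<String, String> results = new LinkedHashMap<>();
        for (String url : urls) {
            try {
                results.put(url, renderer.render(url));
            } catch (Exception e) {
                results.put(url, "ERROR: " + e.getMessage());
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Stand-in renderer so the sketch runs offline; swap in a real one to scrape.
        PageRenderer fake = url -> "<html><body>rendered " + url + "</body></html>";
        String script = "# motherboard spec pages (placeholder URLs)\n"
                + "http://example.com/board-a\n"
                + "\n"
                + "http://example.com/board-b\n";
        renderAll(parseScript(script), fake)
                .forEach((url, html) -> System.out.println(url + " -> " + html));
    }
}
```

Recording failures per URL instead of aborting is a deliberate choice here: with hand-compiled vendor pages, one broken page should not stop the rest of the run.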

-- 
Michal

Thread

JavaScript and Screenscraping Roedy Green <see_website@mindprod.com.invalid> - 2011-03-30 06:51 -0700
  Re: JavaScript and Screenscraping Michal Kleczek <kleku75@gmail.com> - 2011-03-30 16:27 +0200
    Re: JavaScript and Screenscraping Tom Anderson <twic@urchin.earth.li> - 2011-03-31 00:28 +0100
  Re: JavaScript and Screenscraping Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> - 2011-03-30 07:40 -0700
    Re: JavaScript and Screenscraping Roedy Green <see_website@mindprod.com.invalid> - 2011-03-30 18:27 -0700
  Re: JavaScript and Screenscraping Dr J R Stockton <reply1113@merlyn.demon.co.uk> - 2011-04-01 23:39 +0100
    Re: JavaScript and Screenscraping Roedy Green <see_website@mindprod.com.invalid> - 2011-04-01 20:00 -0700
      Re: JavaScript and Screenscraping Dr J R Stockton <reply1113@merlyn.demon.co.uk> - 2011-04-03 17:27 +0100
  Re: JavaScript and Screenscraping RedGrittyBrick <RedGrittyBrick@spamweary.invalid> - 2011-04-05 17:15 +0100
