Re: JavaScript and Screenscraping

NNTP-Posting-Date Wed, 30 Mar 2011 09:40:32 -0500
Date Wed, 30 Mar 2011 07:40:32 -0700
From Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com>
Newsgroups comp.lang.java.programmer
Subject Re: JavaScript and Screenscraping
Message-ID <RYSdnRdK4sZ93Q7QnZ2dnUVZ_sSdnZ2d@posted.palinacquisition>

On 3/30/11 6:51 AM, Roedy Green wrote:
> I am working on a screenscraping project that is turning out to be much
> more time-consuming than I thought it would be. I am trying to gather
> a database of information about all the motherboards sold by major
> manufacturers.  The idea is to eventually create a comparison shopper
> to help you narrow down models that fit your needs. [...]

Already done.  For example:
http://www.newegg.com/Store/SubCategory.aspx?SubCategory=22

The most successful approach will never be to scrape web sites, but 
rather to do what commercial sites do: build a database from technical 
specifications provided by manufacturers (as above).

If you're doing this as an academic exercise, then I suppose that might 
be useful from a learning point of view.  You'll learn a lot about 
scraping. :)  Otherwise, I don't see the point.

Pete
