
Re: Phht! on screenscaping

From Arved Sandstrom <asandstrom3minus1@eastlink.ca>
Newsgroups comp.lang.java.programmer
Subject Re: Phht! on screenscaping
References <kjpb87ppeu4etii296ulk595m26poim048@4ax.com> <Oouhq.716$jh2.114@newsfe19.iad> <4e8676f3$0$291$14726298@news.sunsite.dk>
Message-ID <0qDhq.1519$jh2.616@newsfe19.iad>
Organization Public Usenet Newsgroup Access
Date 2011-10-01 09:09 -0300



On 11-09-30 11:11 PM, Arne Vajhøj wrote:
> On 9/30/2011 9:53 PM, Arved Sandstrom wrote:
>> You're actually better off screenscraping. I definitely don't see how
>> this would be more work than dealing with thousands of different APIs.
> 
> I can see two advantages of API over screen scraping for the consuming
> side of the service:
> * more robust in regard to handling unusual data
> * easier to see what to change when a new version comes out (they
>   may even announce changes to an API in advance)
> 
> Arne
> 
Well, according to Roedy he's got a not-overly-complicated-sounding
screenscraping algorithm that works for roughly 20 bookstore websites,
and there's no reason to believe the algorithm would change
substantially if he added another 20 sites to the list. Unless all of
the bookstores he is interested in offered the same useful API, he'd
still have to keep the screenscraping code handy.

Besides, assuming it were legal, *Roedy* could offer the API as a
service. He'd be the aggregating screenscraper, doing all the heavy
lifting, and other people could query *his* web service.
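
To make that concrete, here is a rough sketch of how such an aggregator
might be laid out. Everything in it is hypothetical (the names
PriceSource, ScrapingSource and Aggregator, the ISBN lookup key, the
regex-per-site approach); it is not Roedy's actual code, and a store
with a real API would simply be another PriceSource implementation.
Page fetching is passed in as a function so the sketch stays free of
network code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** One bookstore, however its price is obtained (API call or scraped page). */
interface PriceSource {
    String storeName();
    Optional<String> priceFor(String isbn);
}

/** Generic scraper: one algorithm, parameterized per site by a regex (an assumption). */
final class ScrapingSource implements PriceSource {
    private final String name;
    private final Function<String, String> fetchPage; // ISBN -> page text
    private final Pattern pricePattern;               // capture group 1 holds the price

    ScrapingSource(String name, Function<String, String> fetchPage, String priceRegex) {
        this.name = name;
        this.fetchPage = fetchPage;
        this.pricePattern = Pattern.compile(priceRegex);
    }

    @Override
    public String storeName() {
        return name;
    }

    @Override
    public Optional<String> priceFor(String isbn) {
        String page = fetchPage.apply(isbn);
        if (page == null) {
            return Optional.empty();
        }
        Matcher m = pricePattern.matcher(page);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}

/** Queries every source and keeps the prices that were found, in source order. */
final class Aggregator {
    private final List<PriceSource> sources;

    Aggregator(List<PriceSource> sources) {
        this.sources = sources;
    }

    Map<String, String> pricesFor(String isbn) {
        Map<String, String> result = new LinkedHashMap<>();
        for (PriceSource s : sources) {
            s.priceFor(isbn).ifPresent(p -> result.put(s.storeName(), p));
        }
        return result;
    }
}
```

A web service in front of Aggregator.pricesFor would then be the API
that other people query, while adding a site means adding one more
ScrapingSource (or an API-backed PriceSource) to the list.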

AHS
-- 
I tend to watch a little TV... Court TV, once in a while. Some of the
cases I get interested in.
-- O. J. Simpson


