
Re: Phht! on screenscaping

Path csiph.com!x330-a1.tempe.blueboxinc.net!newsfeed.hal-mli.net!feeder3.hal-mli.net!news.glorb.com!dotsrc.org!filter.dotsrc.org!news.dotsrc.org!not-for-mail
Date Sat, 01 Oct 2011 15:48:58 -0400
From Arne Vajhøj <arne@vajhoej.dk>
User-Agent Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0.2) Gecko/20110902 Thunderbird/6.0.2
MIME-Version 1.0
Newsgroups comp.lang.java.programmer
Subject Re: Phht! on screenscaping
References <kjpb87ppeu4etii296ulk595m26poim048@4ax.com> <Oouhq.716$jh2.114@newsfe19.iad> <4e8676f3$0$291$14726298@news.sunsite.dk> <0qDhq.1519$jh2.616@newsfe19.iad>
In-Reply-To <0qDhq.1519$jh2.616@newsfe19.iad>
Content-Type text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding 8bit
Lines 26
Message-ID <4e876ead$0$293$14726298@news.sunsite.dk>
Organization SunSITE.dk - Supporting Open source
NNTP-Posting-Host 72.192.23.141
X-Trace news.sunsite.dk DXC=DHf9hjG::O]D]J=fIXJe6_YSB=nbEKnk[AYIL9Z9\Va_JPe3\kP5EUQKBm9cfh9BSTM2;kT<[:>[QL`ea\CQ6JdY<g<8ZP0fnJ\
X-Complaints-To staff@sunsite.dk
Xref x330-a1.tempe.blueboxinc.net comp.lang.java.programmer:8459


On 10/1/2011 8:09 AM, Arved Sandstrom wrote:
> On 11-09-30 11:11 PM, Arne Vajhøj wrote:
>> On 9/30/2011 9:53 PM, Arved Sandstrom wrote:
>>> You're actually better off screenscraping. I definitely don't see how
>>> this would be more work than dealing with thousands of different APIs.
>>
>> I can see two advantages of API over screen scraping for the consuming
>> side of the service:
>> * more robust in regard to handling unusual data
>> * easier to see what to change when a new version comes out (they
>>    may even announce changes to an API in advance)

> Well, according to Roedy he's got a not overly-complicated-sounding
> screenscraping algorithm that works for roughly 20 bookstore websites,
> and there's no reason to believe that if he added another 20 sites to
> the list that the algorithm would change substantially. Unless all of
> the bookstores, that he is interested in, offered the same useful API,
> he'd still have to have the screenscraping code handy.

Until everybody does it right, he would still need the hack.

But the number of cases and changes should decrease as the number
of non-API sites shrinks.
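
A rough sketch of that arrangement, assuming each bookstore sits behind a
common interface so that a site can be switched from scraping to an API
without touching the callers. All names here (PriceSource, ApiSource,
ScrapingSource, BookstoreRegistry) are hypothetical and nothing below is
taken from Roedy's actual code; the "API" is a stand-in map and the
"scraper" is a deliberately naive regex.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical abstraction: one way of getting a price from one bookstore.
interface PriceSource {
    Optional<String> priceFor(String isbn);
}

// Stand-in for a store with a published API (a real one would call the service).
class ApiSource implements PriceSource {
    private final Map<String, String> prices;
    ApiSource(Map<String, String> prices) { this.prices = prices; }
    @Override public Optional<String> priceFor(String isbn) {
        return Optional.ofNullable(prices.get(isbn));
    }
}

// Stand-in for the scraping hack: takes the first dollar amount in the page text.
class ScrapingSource implements PriceSource {
    private static final Pattern PRICE = Pattern.compile("\\$\\d+\\.\\d{2}");
    private final String pageText;
    ScrapingSource(String pageText) { this.pageText = pageText; }
    @Override public Optional<String> priceFor(String isbn) {
        Matcher m = PRICE.matcher(pageText);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }
}

class BookstoreRegistry {
    private final Map<String, PriceSource> sources = new LinkedHashMap<>();

    // Registering a store again replaces its source, e.g. scraper -> API.
    void register(String store, PriceSource source) { sources.put(store, source); }

    Optional<String> priceFor(String store, String isbn) {
        PriceSource s = sources.get(store);
        return s == null ? Optional.empty() : s.priceFor(isbn);
    }

    // How many stores still depend on the scraping code.
    long scrapedStores() {
        return sources.values().stream().filter(s -> s instanceof ScrapingSource).count();
    }
}

class ScrapeFallbackDemo {
    public static void main(String[] args) {
        BookstoreRegistry registry = new BookstoreRegistry();
        registry.register("storeA", new ScrapingSource("Our price: $12.50"));
        registry.register("storeB", new ScrapingSource("Now only $8.00"));
        System.out.println("scraped stores: " + registry.scrapedStores());

        // storeA starts offering an API: swap the source, callers are unchanged.
        registry.register("storeA", new ApiSource(Map.of("1234567890", "$12.50")));
        System.out.println("scraped stores: " + registry.scrapedStores());
    }
}
```

The scraper stays in the code base for as long as any store needs it, but
each store that moves to an API is one fewer page layout to keep up with.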

Arne


Thread

Phht! on screenscaping Roedy Green <see_website@mindprod.com.invalid> - 2011-09-30 09:04 -0700
  Re: Phht! on screenscaping markspace <-@.> - 2011-09-30 10:10 -0700
  Re: Phht! on screenscaping Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2011-09-30 10:24 -0700
    Re: Phht! on screenscaping markspace <-@.> - 2011-09-30 10:30 -0700
      Re: Phht! on screenscaping Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2011-09-30 10:40 -0700
      Re: Phht! on screenscaping Arne Vajhøj <arne@vajhoej.dk> - 2011-09-30 21:19 -0400
  Re: Phht! on screenscaping Arne Vajhøj <arne@vajhoej.dk> - 2011-09-30 21:21 -0400
  Re: Phht! on screenscaping Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2011-09-30 22:53 -0300
    Re: Phht! on screenscaping Arne Vajhøj <arne@vajhoej.dk> - 2011-09-30 22:11 -0400
      Re: Phht! on screenscaping Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2011-10-01 09:09 -0300
        Re: Phht! on screenscaping Arne Vajhøj <arne@vajhoej.dk> - 2011-10-01 15:48 -0400
        Re: Phht! on screenscaping Roedy Green <see_website@mindprod.com.invalid> - 2011-10-01 19:22 -0700
          Re: Phht! on screenscaping Movable Hype <mhype101@snortwad.net> - 2011-10-02 03:40 +0000
          Re: Phht! on screenscaping Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2011-10-02 10:20 -0300
            Re: Phht! on screenscaping Lew <lewbloch@gmail.com> - 2011-10-02 08:43 -0700
              Re: Phht! on screenscaping Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2011-10-02 16:22 -0700
          Re: Phht! on screenscaping Martin Gregorie <martin@address-in-sig.invalid> - 2011-10-02 12:11 +0000
    Re: Phht! on screenscaping Roedy Green <see_website@mindprod.com.invalid> - 2011-10-01 19:03 -0700
      Re: Phht! on screenscaping Arne Vajhøj <arne@vajhoej.dk> - 2011-11-06 18:06 -0500
