


Re: Looking for Java web crawler api

Date Thu, 21 Jul 2011 17:10:38 -0400
From Arne Vajhøj <arne@vajhoej.dk>
User-Agent Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20110624 Thunderbird/5.0
MIME-Version 1.0
Newsgroups comp.lang.java.programmer
Subject Re: Looking for Java web crawler api
References <4e1bf464$0$314$14726298@news.sunsite.dk>
In-Reply-To <4e1bf464$0$314$14726298@news.sunsite.dk>
Content-Type text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding 7bit
Lines 15
Message-ID <4e2895d1$0$309$14726298@news.sunsite.dk> (permalink)
Organization SunSITE.dk - Supporting Open source
NNTP-Posting-Host 72.192.23.157
X-Complaints-To staff@sunsite.dk
Xref x330-a1.tempe.blueboxinc.net comp.lang.java.programmer:6358



On 7/12/2011 3:14 AM, pm wrote:
> Hello, I am working on a project that requires me to do custom searches on
> different websites.  I am using Java, and while I could write this from
> the ground up, I am looking for existing APIs I can use because of
> time limits.  So far I have come across Apache's HttpClient.
> 	I am wondering if there are any others that are effective or
> give more options for web searching/scraping. I plan to create a GUI-based
> application and need something quick and effective without being
> too complex.

Apache Nutch (http://nutch.apache.org/) should contain a crawler, and it
comes with a searchable index (Lucene).
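
To give a rough idea of the link-extraction step that a crawler handles
for you, here is a minimal sketch using only the JDK. It is not Nutch's
API. The class name LinkExtractor, the method extractLinks and the
example.com URLs are made up for illustration, and the regex is a naive
stand-in for a real HTML parser.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch or HttpClient.
final class LinkExtractor {
    // Naive match for href="..." or href='...'; a real crawler would parse the HTML.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*([\"'])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    /** Returns the distinct http(s) links in html, resolved against base, in document order. */
    static List<URI> extractLinks(String html, URI base) {
        Set<URI> found = new LinkedHashSet<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            try {
                // Turn relative references such as "/docs" into absolute URIs.
                URI link = base.resolve(m.group(2).trim());
                String scheme = link.getScheme();
                if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) {
                    found.add(link);
                }
            } catch (IllegalArgumentException e) {
                // Skip href values that are not valid URIs.
            }
        }
        return new ArrayList<>(found);
    }

    public static void main(String[] args) {
        String html = "<a href=\"/docs\">Docs</a> <a href='mailto:a@b.c'>Mail</a>";
        for (URI link : extractLinks(html, URI.create("http://example.com/index.html"))) {
            System.out.println(link);
        }
    }
}
```

A crawler repeats this for every fetched page and queues the new links,
and a framework adds the scheduling, politeness and indexing on top.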

Arne



Thread

Looking for Java web crawler api pm <el_durango@yah00.c0m> - 2011-07-12 07:14 +0000
  Re: Looking for Java web crawler api Bent C Dalager <bcd@pvv.ntnu.no> - 2011-07-12 09:44 +0000
    Re: Looking for Java web crawler api Durango2011 <el_durango@yah00.c0m> - 2011-07-13 06:12 +0000
  Re: Looking for Java web crawler api Roedy Green <see_website@mindprod.com.invalid> - 2011-07-14 12:52 -0700
  Re: Looking for Java web crawler api iadb <freeinternetarticles@gmail.com> - 2011-07-18 16:06 -0700
  Re: Looking for Java web crawler api Durango2011 <el_durango@yah00.c0m> - 2011-07-21 05:49 +0000
  Re: Looking for Java web crawler api Arne Vajhøj <arne@vajhoej.dk> - 2011-07-21 17:10 -0400
