| Header | Value |
|---|---|
| From | Durango2011 <el_durango@yah00.c0m> |
| Subject | Re: Looking for Java web crawler api |
| Newsgroups | comp.lang.java.programmer |
| References | <4e1bf464$0$314$14726298@news.sunsite.dk> <slrnj1o5s6.e9i.bcd@microbel.pvv.ntnu.no> |
| Date | 13 Jul 2011 06:12:20 GMT |
| Message-ID | <4e1d3744$0$312$14726298@news.sunsite.dk> |
On Tue, 12 Jul 2011 09:44:38 +0000, Bent C Dalager wrote:

> I found JSoup (jsoup.org) to be a fine library for web scraping. It lets
> you easily set cookies and headers, fetches the URL for you, and
> converts the tangled mess of HTML you tend to receive into a well-formed
> XML document model.
>
> Cheers,
> Bent D.

Thank you very much, that looks like what I am looking for.
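
For anyone finding this thread later, here is a minimal sketch of the kind of JSoup usage described above: setting a header and a cookie, fetching a URL, and walking the parsed document. It assumes the jsoup jar is on the classpath. The class name `JsoupSketch`, the user-agent string, the cookie name and value, and the example URLs are placeholders made up for illustration; none of them come from the thread.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class JsoupSketch {

    // Parse HTML that is already in memory and collect the absolute
    // targets of all links. baseUri is used to resolve relative hrefs.
    static List<String> extractLinks(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> links = new ArrayList<>();
        for (Element a : doc.select("a[href]")) {
            links.add(a.absUrl("href"));
        }
        return links;
    }

    // Fetch a page over HTTP with a custom header and cookie.
    // All values here are hypothetical placeholders.
    static Document fetch(String url) throws IOException {
        return Jsoup.connect(url)
                .userAgent("example-crawler/0.1")
                .header("Accept-Language", "en")
                .cookie("session", "placeholder")
                .timeout(10_000) // milliseconds
                .get();
    }

    public static void main(String[] args) throws IOException {
        // Deliberately sloppy HTML: the parser still builds a usable tree.
        String html = "<p>Unclosed paragraph <a href='/docs'>Docs</a>";
        System.out.println(extractLinks(html, "http://example.com/"));

        // Only touch the network when a URL is passed on the command line.
        if (args.length > 0) {
            Document doc = fetch(args[0]);
            System.out.println(doc.title());
        }
    }
}
```

A real crawler would also need a queue of pending URLs, a set of visited ones, and some politeness (robots.txt, delays between requests); the sketch only covers fetching and parsing a single page.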