| From | Anthk <anthk@disroot.org> |
|---|---|
| Newsgroups | comp.lang.postscript |
| Subject | Re: archive.org index of fermilab's archive of ps docs |
| Date | 2022-12-13 01:54 +0000 |
| Organization | A noiseless patient Spider |
| Message-ID | <slrntpfljj.1198.anthk@openbsd.home.local> |
| References | <c1804a43-da02-44d1-8c5b-cfcb522828a0n@googlegroups.com> <f4af945e-0372-4791-8279-43e7e842bcacn@googlegroups.com> |
On 2022-04-29, Ross Presser <rpresser@gmail.com> wrote:
> On Monday, April 11, 2022 at 9:57:17 PM UTC-4, luser droog wrote:
>> https://web.archive.org/web/*/http://www-cdf.fnal.gov/offline/PostScript/*
>>
>> Get 'em while they're hot. It can still be removed if fnal changes its robots
>> restrictions.
>
> Word to the wise:
> If the "MIME TYPE" column on the listing page luser droog linked to has "unk",
> the capture is going to be a 404 page. Mostly this appears to be when it crawled
> the wrong URL, e.g.
> http://www-cdf.fnal.gov/offline/PostScript/PLRM2.pdf<p>
> instead of
> http://www-cdf.fnal.gov/offline/PostScript/PLRM2.pdf
>
> Some of the incorrect URLs are incredibly long and appear to be multiple URLs.
> Try splitting into separate URLs.
>
> Sometimes the corrected URL is also captured; download that instead.

Sorry if I'm late, but here's a better URL.

http://theoldnet.com/get?url=http%3A%2F%2Fwww-cdf.fnal.gov%2Foffline%2FPostScript&year=2010&scripts=false&decode=false
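
For anyone scripting this, here is a rough Python sketch of the two tips in this thread: cleaning up the mis-crawled URLs (a stray trailing tag like `<p>`, or several URLs run together) and building the theoldnet.com link. The function names `clean_captured_urls` and `theoldnet_url` are made up for illustration. The query parameters (`url`, `year`, `scripts`, `decode`) are simply copied from the URL posted above; their exact meaning is an assumption, not something documented here. The splitter also assumes it is fed the original fnal.gov URLs, not full web.archive.org capture URLs, which legitimately contain a second `http://`.

```python
import re
from urllib.parse import urlencode

# One or more HTML tags glued onto the end of a URL, e.g. "<p>".
TRAILING_TAGS = re.compile(r"(?:<[^<>]*>)+$")
# Each URL runs from its scheme up to the next scheme or the end of the string.
ONE_URL = re.compile(r"https?://.*?(?=https?://|$)")


def clean_captured_urls(raw):
    """Split a possibly concatenated crawl URL and strip trailing tags.

    Returns a list of candidate URLs; it does not check that they exist.
    """
    parts = ONE_URL.findall(raw.strip())
    return [TRAILING_TAGS.sub("", part).strip() for part in parts]


def theoldnet_url(target, year):
    """Build a theoldnet.com link shaped like the one posted in this thread.

    The parameter names and the "false" values are copied from that link;
    what they do on the server side is an assumption.
    """
    query = urlencode({
        "url": target,
        "year": year,
        "scripts": "false",
        "decode": "false",
    })
    return "http://theoldnet.com/get?" + query


if __name__ == "__main__":
    bad = "http://www-cdf.fnal.gov/offline/PostScript/PLRM2.pdf<p>"
    for url in clean_captured_urls(bad):
        print(url)
    print(theoldnet_url("http://www-cdf.fnal.gov/offline/PostScript", 2010))
```

The cleaned URLs are only candidates: per Ross's note, it is still worth checking the listing to see whether the corrected URL was actually captured before trying to download it.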
archive.org index of fermilab's archive of ps docs luser droog <luser.droog@gmail.com> - 2022-04-11 18:57 -0700
Re: archive.org index of fermilab's archive of ps docs Ross Presser <rpresser@gmail.com> - 2022-04-29 07:00 -0700
Re: archive.org index of fermilab's archive of ps docs Anthk <anthk@disroot.org> - 2022-12-13 01:54 +0000