| Message-ID | <67d9f16a@news.ausics.net> |
|---|---|
| From | not@telling.you.invalid (Computer Nerd Kev) |
| Subject | Re: bad bot behavior |
| Newsgroups | comp.misc |
| References | <vrc2r4$2okrp$1@dont-email.me> <vrc8qm$2tkq5$1@dont-email.me> |
| Date | 2025-03-19 08:19 +1000 |
| Organization | Ausics - https://newsgroups.ausics.net |

D Finnigan <dog_cow@macgui.com> wrote:
> On 3/18/25 10:17 AM, Ben Collver wrote:
>> Please stop externalizing your costs directly into my face
>> ==========================================================
>> March 17, 2025 on Drew DeVault's blog
>>
>> Over the past few months, instead of working on our priorities at
>> SourceHut, I have spent anywhere from 20-100% of my time in any given
>> week mitigating hyper-aggressive LLM crawlers at scale.
>
> This is happening at my little web site, and if you have a web site,
> it's happening to you too. Don't be a victim.

Meh, my little Web site runs so light that even when Amazon's bot got
stuck in a recursive loop grabbing the same dynamic page tens of times
a second from different IPs, the server load was near nil as usual.
The main problem that caused was access logs of hundreds of megabytes
per day.

Amazon is still scraping the hell out of everything I put online
(even a mirror that's tens of GBs), and other bots squeeze into the
logs too, maybe even a few humans view things sometimes? I don't
care, they're welcome to it, and they helped me find the bug in the
Apache configuration which allowed that recursive loop (though I
still don't get why bots started forming such URLs in the first
place).

-- 
__ __
#_ < |\| |< _#

bad bot behavior Ben Collver <bencollver@tilde.pink> - 2025-03-18 15:17 +0000
Re: bad bot behavior D Finnigan <dog_cow@macgui.com> - 2025-03-18 12:00 -0500
Re: bad bot behavior not@telling.you.invalid (Computer Nerd Kev) - 2025-03-19 08:19 +1000
Re: bad bot behavior Toaster <toaster@dne3.net> - 2025-03-18 18:20 -0400
Re: bad bot behavior Ian <${send-direct-email-to-news1021-at-jusme-dot-com-if-you-must}@jusme.com> - 2025-03-19 12:06 +0000
Re: bad bot behavior Rich <rich@example.invalid> - 2025-03-19 16:59 +0000
Re: bad bot behavior candycanearter07 <candycanearter07@candycanearter07.nomail.afraid> - 2025-03-23 14:30 +0000
Re: bad bot behavior Lawrence D'Oliveiro <ldo@nz.invalid> - 2025-03-20 02:22 +0000
Re: bad bot behavior Ian <${send-direct-email-to-news1021-at-jusme-dot-com-if-you-must}@jusme.com> - 2025-03-20 08:33 +0000
Re: bad bot behavior Toaster <toaster@dne3.net> - 2025-03-20 19:01 -0400
Re: bad bot behavior Lawrence D'Oliveiro <ldo@nz.invalid> - 2025-03-21 08:05 +0000
Re: bad bot behavior Ian <${send-direct-email-to-news1021-at-jusme-dot-com-if-you-must}@jusme.com> - 2025-03-21 08:42 +0000
Re: bad bot behavior candycanearter07 <candycanearter07@candycanearter07.nomail.afraid> - 2025-03-23 14:30 +0000
Re: bad bot behavior D Finnigan <dog_cow@macgui.com> - 2025-03-26 08:38 -0500
Re: bad bot behavior candycanearter07 <candycanearter07@candycanearter07.nomail.afraid> - 2025-03-26 17:00 +0000
Re: bad bot behavior not@telling.you.invalid (Computer Nerd Kev) - 2025-03-27 07:55 +1000
Re: bad bot behavior anthk <anthk@openbsd.home> - 2025-05-12 06:24 +0000
Re: bad bot behavior anthk <anthk@openbsd.home> - 2025-05-12 06:24 +0000