Re: bad bot behavior

From anthk <anthk@openbsd.home>
Newsgroups comp.misc
Subject Re: bad bot behavior
Date 2025-05-12 06:24 +0000
Organization A noiseless patient Spider
Message-ID <slrn101ueaa.198p.anthk@openbsd.home.localhost>
References <vrc2r4$2okrp$1@dont-email.me> <vrc8qm$2tkq5$1@dont-email.me> <20250318182006.00006ae3@dne3.net> <slrnvtlcpl.41d.${send-direct-email-to-news1021-at-jusme-dot-com-if@vm46.home.jusme.com>



On 2025-03-19, Ian <${send-direct-email-to-news1021-at-jusme-dot-com-if-you-must}@jusme.com> wrote:
> On 2025-03-18, Toaster <toaster@dne3.net> wrote:
>>
>> But what can be done to mitigate this issue? Crawlers and bots ruin the
>> internet.
>
> #mode=evil
>
> How about a script that spews out an endless stream of junk from
> /usr/share/dict/words, parked on a random URL that's listed in
> robots.txt as forbidden. Any bot choosing to chew on that gets what
> it deserves, though you might need to bandwidth limit it.
>
>

Use Perl, cpanm and Hailo. Create a text file, nonsense.txt,
with one sentence per line, like this:

rm -rf boosts performance under Ubuntu.
fedora it's updated with apt-get dist-upgrade.
openbsd works fine with ZFS.

And so on...

$> cpanm -n Hailo
$> hailo -t nonsense.txt -b output.brn

Now write a simple Perl program that loads the trained
brain and prints replies. With Hailo and trivial
input/output it's really easy.

Run 'perldoc Hailo' once it's installed for a quick
usage guide.
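
For instance, a minimal sketch of such a script. Assumptions of mine,
not anything Hailo mandates: the brain file is the output.brn trained
above, 100 lines is the default, and babble() is just a helper name I
made up. As far as I can tell from the Hailo docs, reply() with no
argument says something at random:

```perl
#!/usr/bin/env perl
# yourhailoscript.pl: print babble from a trained Hailo brain.
use strict;
use warnings;

# Ask the bot for $count random replies and skip undefined ones.
# $bot is any object with a reply() method (a Hailo instance here).
sub babble {
    my ($bot, $count) = @_;
    my @lines;
    for (1 .. $count) {
        my $line = $bot->reply();    # no argument: something at random
        push @lines, $line if defined $line;
    }
    return @lines;
}

# Run only when executed as a script, not when loaded from another file.
unless (caller) {
    require Hailo;
    my $count = shift(@ARGV) // 100;    # assumed default, pick your own
    my $hailo = Hailo->new(brain => 'output.brn');    # brain trained above
    print "$_\n" for babble($hailo, $count);
}
```

An optional first argument sets how many lines to print.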

Redirect the generated nonsense to a file:

perl yourhailoscript.pl > crap.txt

Have fun.

