I couldn’t archive this on archive.today. They put up a captcha, and a Google one at that, and it doesn’t let me through at all.

Google and OpenAI suck:
Google’s legal theory has another significant problem: the requirement that a TPM must “effectively control” access. Just last week, a court rejected Ziff Davis’s attempt to turn robots.txt into a 1201 violation when OpenAI allegedly ignored its crawling restrictions. The court’s reasoning is directly applicable here:
OpenAI slammed my small server into the ground until I put fail2ban on top. It was really bad, like thousands of requests per second bad.
How does fail2ban prevent scraping? My understanding was that fail2ban works on failed login attempts.
There are some premade scripts out there that make it do more. I have it hooked up to nginx and other such logs. It’s commonly used against login attempts on online login portals, not just ssh. It can work with any grep-able log file.
I just took two scripts other people have made, verified they work on my mini PC, and set it loose. Within about 10 minutes it caught most scrapers and banned the IPs.
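For anyone wondering what "works with any grep-able log file" means in practice, here is a rough Python sketch of the idea, not fail2ban's actual code and not the scripts mentioned above. It assumes nginx's default "combined" access log format; the function name `find_offenders` and the parameters `max_hits` and `window_seconds` are made up for illustration (fail2ban's own jail settings for this are `maxretry` and `findtime`). It only reports the offending IPs; the real tool would go on to add firewall rules.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Start of an nginx "combined" log line: client IP, two fields, then [timestamp].
# Assumption: the log uses the default combined format.
LINE_RE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\]')


def find_offenders(lines, max_hits=100, window_seconds=10):
    """Return IPs that made more than max_hits requests within window_seconds.

    Hypothetical helper: it mimics the "count matches per host in a time
    window" idea, nothing more.
    """
    window = timedelta(seconds=window_seconds)
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    banned = set()

    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not an access log line we understand
        ip = match.group("ip")
        if ip in banned:
            continue
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")

        hits = recent[ip]
        hits.append(ts)
        # Drop hits that have fallen out of the sliding window.
        while hits and ts - hits[0] > window:
            hits.popleft()

        if len(hits) > max_hits:
            banned.add(ip)

    return banned
```

You could feed it an open log file, e.g. `find_offenders(open("/var/log/nginx/access.log"))` (path is an assumption, adjust for your setup), and tune the two thresholds until normal visitors stop tripping it.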
Fuck Google





