I have one virtual server with around 30 websites on it (OLS) and investigated why the load is so high (a constant 0.9, even though most of the websites use a Redis cache).
In the log files I found out that a lot of AI bots are crawling the sites constantly.
After I added a rule to fail2ban, these bots are now blocked (they come from thousands of different IPs, so it takes about an hour until most of them end up on the fail2ban ban list).
It would be nice to have an option in Enhance to block parts of user agents server-wide and also to block whole countries.
The steps to block the AI bots:
apt install fail2ban
Create a file /etc/fail2ban/filter.d/badbots.conf with this content (here you can extend the list of bots):
[Definition]
failregex = ^"<HOST>" "\d+" "\S+ \S+ \S+" "\d+" "\d+" "\d+" "\S*" ".*(GPTBot|bingbot|Amazonbot|BLEXBot|MJ12bot|ClaudeBot).*"
ignoreregex =
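Before enabling the jail, the filter can be tested against one of the existing log files with fail2ban-regex; it reports how many lines matched the failregex. The filename below is just a placeholder, pick any real file from the webserver_logs directory:

fail2ban-regex /var/local/enhance/webserver_logs/example.log /etc/fail2ban/filter.d/badbots.conf

If the matched count stays at 0, the log format probably differs from what the failregex expects and the regex needs adjusting.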
Create a file /etc/fail2ban/jail.local with this content (with bantime = 3600 the IPs are banned for 1 hour; I will set it to 1 day soon, see the note after the config):
[badbots]
enabled = true
port = http,https
filter = badbots
logpath = /var/local/enhance/webserver_logs/*.log
backend = polling
maxretry = 1
bantime = 3600
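bantime is given in seconds, so for the 1-day ban mentioned above the line would simply become:

bantime = 86400

The rest of the jail stays the same; just restart fail2ban after the change (next step).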
Then restart fail2ban:

systemctl restart fail2ban
After that we can watch the blocking orgy with:

tail -f /var/log/fail2ban.log
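To see how many IPs the jail has caught so far (and the full list of banned addresses), fail2ban-client can be used; the IP in the second command is just a placeholder, in case a legitimate crawler gets caught and needs to be unbanned:

fail2ban-client status badbots
fail2ban-client set badbots unbanip 203.0.113.10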
Just to clarify: only do this if you don't mind that the AI bots can no longer index the websites. It could have an impact when people ask ChatGPT something like "give me some websites that sell flowers in London" and the bots don't have your hosted websites in their index. I don't know in detail what happens behind the scenes.