I've caught some AI crawlers aggressively crawling some of my sites and disregarding robots.txt. Some of those sites are of little or no interest to any real person, so I've deployed iocaine (iocaine.madhouse-project.org/) on them in an "always spew nonsense" mode, rather than the suggested "generate nonsense if it looks like a bot" mode. But I am not unfair: I've included a robots.txt there, so any bot that respects it will be spared from ingesting the nonsense.
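For anyone wondering what "respecting it" looks like from the bot's side, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt contents, the "ExampleBot" user agent and the example.org URL are all made up for illustration; I'm assuming a blanket "disallow everything" file, which may not match what is actually deployed.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: asks every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite crawler runs this check first and skips the page on False.
    print(may_fetch("ExampleBot", "https://example.org/some/page"))
```

A crawler that runs this check and backs off never sees the generated nonsense; one that skips the check gets fed.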
Unfortunately, I'm abandoning my self-hosted Git repositories, but I didn't have much of interest on there any more. Most of what was there was old and was also in my GitLab account.