Hi, I’m building a personal website and I don’t want it to be used to train AI. In my robots.txt file I blocked:

  • ChatGPT-User
  • GPTBot
  • Google-Extended
  • FacebookBot

What other bots should I add? Are there any other ways to block AI bots?

IMPORTANT: I don’t want to block search engine crawlers, only bots that are used to train AI.
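
For reference, here is a rough sketch of how the rules above could be generated and sanity-checked with Python's standard `urllib.robotparser`. The helper names `build_robots_txt` and `can_fetch` and the `example.com` URL are made up for illustration, and the catch-all `Allow: /` group is my assumption about how to leave search engine crawlers alone. It only checks what a well-behaved bot would conclude from the file; robots.txt is voluntary, so it cannot force a crawler to comply.

```python
from urllib.robotparser import RobotFileParser

# The four user agents listed in the post.
AI_BOTS = ["ChatGPT-User", "GPTBot", "Google-Extended", "FacebookBot"]


def build_robots_txt(blocked_agents):
    """Return robots.txt text that disallows each listed agent site-wide."""
    lines = []
    for agent in blocked_agents:
        lines.append(f"User-agent: {agent}")
        lines.append("Disallow: /")
        lines.append("")  # a blank line ends each group
    # Assumption: every other crawler (search engines included) stays allowed.
    lines.append("User-agent: *")
    lines.append("Allow: /")
    return "\n".join(lines) + "\n"


def can_fetch(robots_txt, agent, url="https://example.com/"):
    """Ask the standard-library parser whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    text = build_robots_txt(AI_BOTS)
    print(text)
    for agent in AI_BOTS + ["Googlebot", "bingbot"]:
        print(agent, "allowed" if can_fetch(text, agent) else "blocked")
```

Running it prints the generated file and then, per agent, whether the parser treats it as allowed or blocked, which is a quick way to confirm that a new entry does not accidentally match a search engine crawler.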

  • Oliver Lowe
    58 months ago

    Maybe there are some IP address ranges you could try blocking?

    It’s difficult because, for example, blocking the addresses OpenAI’s crawlers use may inadvertently block addresses from Azure used by Bing or whatever.
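
A minimal sketch of what such an IP check might look like, using Python's standard `ipaddress` module. The ranges below are placeholders taken from the reserved documentation blocks, not real crawler addresses, and `is_blocked` is a hypothetical helper name. Real ranges would have to come from whatever each operator publishes, and the overlap problem described above still applies.

```python
import ipaddress

# Placeholder ranges from reserved documentation address space.
# These are NOT real crawler addresses; substitute published ranges.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_blocked(ip, ranges=BLOCKED_RANGES):
    """Return True if `ip` falls inside any of the blocked networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ranges)


if __name__ == "__main__":
    for ip in ["192.0.2.17", "198.51.100.200", "203.0.113.5"]:
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

Keeping the ranges narrow (the second entry is a /25 rather than the whole /24) is one way to limit collateral blocking when a provider's address space is shared with other services.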