Could someone more technically adept than I am (e.g. @manton or @sod or… well, pretty much anyone here I’m guessing) glance at this and explain the necessary steps? My understanding is that it should be possible to add something to my site’s robots.txt that will exclude it from future scraping by OpenAI / ChatGPT. Is it as simple as adding the following somewhere?:
User-agent: ChatGPT-User
Disallow: /
My robots.txt already reads as follows; does this cover me?:
Your second example disallows all user-agents (crawlers) from crawling your site. So, in theory, that should be enough, as long as OpenAI respects the robots.txt standard. It’s not clear to me whether they do, so to be on the safe side, you might want to add ChatGPT-User and GPTBot explicitly, along these lines:
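# Block OpenAI's training-data crawler
User-agent: GPTBot
Disallow: /

# Block the agent ChatGPT uses when browsing pages for a user
User-agent: ChatGPT-User
Disallow: /

From what OpenAI has documented, GPTBot is the crawler that gathers training data, while ChatGPT-User only fetches pages when ChatGPT browses on a user’s behalf, so listing both as separate User-agent groups covers both cases.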