Should we block OpenAI from scraping the server?
platform.openai.com/docs/gptbot
Right now, robots.txt on lemmy.ca is configured this way:
User-Agent: *
Disallow: /login
Disallow: /login_reset
Disallow: /settings
Disallow: /create_community
Disallow: /create_post
Disallow: /create_private_message
Disallow: /inbox
Disallow: /setup
Disallow: /admin
Disallow: /password_change
Disallow: /search/
Disallow: /modlog
Would it be a good idea privacy-wise to deny GPTBot from scraping content from the server?
User-agent: GPTBot
Disallow: /
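For anyone who wants to sanity-check the stanza before deploying it, Python's standard-library robots.txt parser can evaluate it locally. This is just a sketch: the example URL and the "SomeOtherBot" agent name are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Sketch: feed the proposed stanza to the stdlib parser and check that
# GPTBot is blocked while other crawlers are unaffected.
rules = """\
User-agent: GPTBot
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.modified()   # mark the parser as "loaded" so can_fetch() trusts the parsed rules
rp.parse(rules)

print(rp.can_fetch("GPTBot", "https://lemmy.ca/post/123"))        # False
print(rp.can_fetch("SomeOtherBot", "https://lemmy.ca/post/123"))  # True
```

Keep in mind robots.txt is purely advisory; per platform.openai.com/docs/gptbot it only works because OpenAI says GPTBot honors it.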
Thanks!
17 comments
Yes. Ban them.
if ($http_user_agent = "GPTBot") { return 403; }
Probably want == instead, else we will all be forbidden
I would have thought so too, but == failed the syntax check:
2023/08/07 15:36:59 [emerg] 2315181#2315181: unexpected "==" in condition in /etc/nginx/sites-enabled/lemmy.ca.conf:50
You actually want ~ though, because GPTBot is just part of the user agent; it's not the full string.
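Putting that together, a version using the ~ (regex/substring) match would look like this. A sketch only, assuming it goes in the same nginx server block as the snippet above:

```nginx
# Return 403 for any request whose User-Agent header contains "GPTBot"
if ($http_user_agent ~ GPTBot) {
    return 403;
}
```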
Strangely, = works the same as == with nginx. It's a very strange config format... https://nginx.org/en/docs/http/ngx_http_rewrite_module.html#if
Look at me! I'm the GPTBot now!
Thanks for empowering my laziness =)