this post was submitted on 21 May 2026
Fun twist: no! There's a very neat trick you can do when you serve the crawlers poison: you can hide an identifier in the URLs you serve them, and then recognize that identifier when they come back riding on remote-controlled Chrome instances. By serving them garbage, you flood their queue with poisoned URLs, which helps you block crawlers that you wouldn't otherwise be able to block.
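To make the identifier trick concrete, here is a minimal sketch of one way it could work, not necessarily how the setup described above does it. It assumes the identifier is a random nonce plus a truncated HMAC, so the server can recognize its own poisoned URLs without storing them. The names SECRET_KEY, TRAP_PREFIX, make_poisoned_path, and is_poisoned_path are all hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical values: pick your own secret and URL prefix.
SECRET_KEY = b"replace-with-a-real-secret"
TRAP_PREFIX = "/maze/"


def _sign(nonce: str) -> str:
    """Return a short keyed signature for a nonce."""
    digest = hmac.new(SECRET_KEY, nonce.encode(), hashlib.sha256).hexdigest()
    return digest[:16]


def make_poisoned_path() -> str:
    """Build a fresh URL path that carries a hidden, verifiable identifier."""
    nonce = secrets.token_hex(8)  # hex only, so it never contains "-"
    return f"{TRAP_PREFIX}{nonce}-{_sign(nonce)}"


def is_poisoned_path(path: str) -> bool:
    """Check whether a requested path (without query string) is one we handed out."""
    if not path.startswith(TRAP_PREFIX):
        return False
    token = path[len(TRAP_PREFIX):].split("/", 1)[0]
    nonce, sep, sig = token.partition("-")
    if not sep:
        return False
    # Compare as bytes so non-ASCII input cannot raise TypeError.
    return hmac.compare_digest(sig.encode(), _sign(nonce).encode())
```

Links built with make_poisoned_path() only ever appear inside garbage pages, so any client later requesting a path that passes is_poisoned_path() got it from the poison, whatever user agent it presents.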
Generating and serving garbage is incredibly cheap (cheaper than serving a file from a filesystem on SSD, in most cases), and once you have requests landing on poisoned URLs, you can firewall off the clients making them for a day or so, which reduces your costs even more.
We may not be able to poison the models, but we can poison their crawling queues. I have a year's worth of data to support that. They still haven't caught on.
I admire the optimism of seeing it this way rather than as "it's still not worth it to them to bother blacklisting the domain".
I wonder why they haven't, too, because they're happily crawling domains that never had anything but junk on them. To me, that suggests they have no idea they're trapped. Not at crawling time, at least.