this post was submitted on 10 Sep 2025
129 points (96.4% liked)

Technology

[–] Kissaki@feddit.org 18 points 13 hours ago

> evolves robots.txt instructions by adding an automated licensing layer that's designed to block bots that don't fairly compensate creators for content

robots.txt - the well-known technology for blocking ill-intentioned bots /s

What's actually automated about the licensing layer? At some point I started skimming the article, and it never made that clear. Is the idea that the AI can "automatically" parse it?

```
# NOTICE: all crawlers and bots are strictly prohibited from using this
# content for AI training without complying with the terms of the RSL
# Collective AI royalty license. Any use of this content for AI training
# without a license is a violation of our intellectual property rights.

License: https://rslcollective.org/royalty.xml
```
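
Any "enforcement" here depends entirely on a crawler voluntarily going looking for that directive. A minimal sketch of what honoring it would even involve (hypothetical helper, example.com as a stand-in host; nothing in the file forces anyone to run this):

```python
# Minimal sketch, not taken from any RSL spec: pull the License: directive
# out of a site's robots.txt. The "layer" is a comment plus one line that
# cooperative bots may or may not bother to read.
import urllib.request

def find_license_url(robots_txt: str) -> str | None:
    """Return the value of the first 'License:' line, if present."""
    for line in robots_txt.splitlines():
        line = line.strip()
        if line.lower().startswith("license:"):
            return line.split(":", 1)[1].strip()
    return None

robots = urllib.request.urlopen("https://example.com/robots.txt").read().decode()
print(find_license_url(robots))  # e.g. https://rslcollective.org/royalty.xml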

Yeah, this is as useless as I thought it would be. Nothing here actively blocks anything.
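
For contrast, actually blocking means the server refuses the request. A rough sketch of that as WSGI middleware (the user-agent strings are illustrative, not a maintained list, and real setups usually do this at the proxy or CDN), which this notice does nothing to set up:

```python
# Contrast: "actively blocking" means the server itself rejects the request.
# BLOCKED_AGENTS is an illustrative list, not an authoritative one.
BLOCKED_AGENTS = ("GPTBot", "CCBot", "anthropic-ai")

def block_ai_crawlers(app):
    def middleware(environ, start_response):
        ua = environ.get("HTTP_USER_AGENT", "")
        if any(agent in ua for agent in BLOCKED_AGENTS):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Crawling not permitted.\n"]
        return app(environ, start_response)
    return middleware
```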

I love that the XML then just points to a text/html website. So nothing there for machine parsing; maybe it's meant for AI parsing.

I don't remember which AI company it was, but they argued they're not crawlers, just agents acting on the user's behalf for a specific request/action, and so robots.txt doesn't apply to them. Who knows how they'll react to this. But their incentives and their track record both point toward ignoring robots.txt.

Why ~~am I~~ is this comment so negative? Oh well.