this post was submitted on 19 Aug 2025
626 points (98.9% liked)

Technology

(page 2) 50 comments
[–] BaroqueInMind@piefed.social 18 points 16 hours ago

Cry more, Perplexity.

[–] Ermiar@lemmy.world 19 points 17 hours ago* (last edited 17 hours ago) (1 children)
[–] Ekybio@lemmy.world 19 points 17 hours ago (3 children)

Can someone with more knowledge shine a bit more light on this whole situation? I'm out of the loop on the technical details.

[–] panda_abyss@lemmy.ca 32 points 17 hours ago* (last edited 17 hours ago) (6 children)

Cloudflare runs a CDN/cache/gateway service in front of a ton of websites. It helps protect them against DDoS attacks and malicious traffic.

A few weeks ago Cloudflare announced they were going to block AI crawling (good, in my opinion). However, they also added a paid service that these AI crawlers can use, so it actually becomes a revenue source for them.

This is a response to that from Perplexity, who run an AI search company. I don’t actually know how their service works, but they were specifically called out in the announcement, and Cloudflare accused them of “stealth scraping” and ignoring robots.txt, among other things.

[–] _cryptagion@lemmy.dbzer0.com 10 points 15 hours ago

It should be pointed out that Cloudflare didn't say they were going to block AI traffic, they give you the option to. The service is a free opt-in for people who want it.

[–] nutsack@lemmy.dbzer0.com 6 points 15 hours ago* (last edited 15 hours ago)

they don't outright block ai crawlers. they added some new tools and options for managing or blocking ai bot traffic which the cloudflare customer can choose to use or to not use.

i'm running a free educational resource and i let the crawlers hit my site all they want because it's useful knowledge unavailable anywhere else and it's served to them from cloudflare's free tier cache. i just don't know why they have to read it ten thousand times a day.

[–] BetaDoggo_@lemmy.world 21 points 17 hours ago* (last edited 17 hours ago) (1 children)

Perplexity (an "AI search engine" company with 500 million in funding) can't bypass Cloudflare's anti-bot checks. For each search, Perplexity scrapes the top results and summarizes them for the user. Cloudflare intentionally blocks Perplexity's scrapers because they ignore robots.txt and mimic real users to get around Cloudflare's blocking features. Perplexity argues that their scraping is acceptable because it's user-initiated.
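
For anyone unfamiliar with robots.txt: honoring it is a small check on the crawler's side before each fetch. Here is a minimal sketch using Python's standard `urllib.robotparser`. The bot name `ExampleAIBot`, the sample rules, and the `is_allowed` helper are made up for illustration; this is not Perplexity's or Cloudflare's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is blocked, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"
    print(is_allowed(ROBOTS_TXT, "ExampleAIBot", page))  # False
    print(is_allowed(ROBOTS_TXT, "SomeBrowser", page))   # True
```

Note that the check only works if the crawler identifies itself honestly. A scraper that sends a browser-like user agent would match the `*` rule here, which is roughly the "mimic real users" complaint.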

Personally I think Cloudflare is in the right here. The scraped sites get 0 revenue from Perplexity searches (unless the user decides to go through the sources section and click the links), and Perplexity's scraping is unnecessarily traffic-intensive since they don't cache the scraped data.
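
To illustrate the caching point, here is a rough sketch of what a scraper-side cache could look like, assuming a simple in-memory store with a time-to-live. `CachedFetcher`, its parameters, and the fake fetch function are hypothetical names for this example and say nothing about how Perplexity's system is actually built.

```python
import time
from typing import Callable, Dict, Tuple


class CachedFetcher:
    """Wrap a fetch function so repeat requests within the TTL reuse the stored body."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps url -> (time the body was fetched, body).
        self._cache: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        hit = self._cache.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: no request reaches the origin site
        body = self._fetch(url)
        self._cache[url] = (now, body)
        return body


if __name__ == "__main__":
    calls = []

    def fake_fetch(url: str) -> str:
        # Stand-in for a real HTTP request; records each origin hit.
        calls.append(url)
        return f"<html>{url}</html>"

    fetcher = CachedFetcher(fake_fetch, ttl_seconds=60.0)
    for _ in range(3):
        fetcher.get("https://example.com/article")
    print(len(calls))  # 1
```

With something like this in place, three searches that surface the same page inside the TTL would cost the origin site one request instead of three.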

[–] lividweasel@lemmy.world 6 points 14 hours ago (2 children)

…and Perplexity's scraping is unnecessarily traffic intensive since they don't cache the scraped data.

That seems almost maliciously stupid. We need to train a new model. Hey, where’d the data go? Oh well, let’s just go scrape it all again. Wait, did we already scrape this site? No idea, let’s scrape it again just to be sure.

[–] EncryptKeeper@lemmy.world 8 points 15 hours ago (2 children)

I can’t get over their CEO that looks like a nine year old. Not sure what it is about him

[–] interdimensionalmeme@lemmy.ml 9 points 16 hours ago (6 children)