The AI company Perplexity is complaining their bots can't bypass Cloudflare's firewall
(www.searchenginejournal.com)
That logic would not extend to ad blockers, as the point of concern is gaining unauthorized access to a computer system or asset. Blocking ads would not be considered gaining unauthorized access to anything. In fact it would be the opposite of that.
And my point is that defining "unauthorized" to include visitors using unauthorized tools/methods to access a publicly visible resource would be a policy disaster.
If I put a banner on my site that says "by visiting my site you agree not to modify the scripts or ads displayed on the site," does that make my visit with an ad blocker "unauthorized" under the CFAA? I think the answer should obviously be "no." The way to define "authorization" is by whether the website puts up some kind of login/authentication mechanism to block or allow specific users, not by whether it simply asks the visiting public to please respect the rules of the site.
To me, a robots.txt is more like a friendly request to unauthenticated visitors than it is a technical implementation of some kind of authentication mechanism.
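To make that concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleBot` name, and the `example.com` URLs are made up for illustration. The point is that the check runs entirely on the visitor's side and depends on whatever user-agent string the visitor chooses to report; nothing on the server enforces it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; not taken from any real site.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots.txt permits `user_agent` to fetch `url`.

    This is a purely client-side check: a crawler has to choose to run it
    and choose to honor the answer.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    private_url = "https://example.com/private/page"
    # A crawler that identifies itself honestly is asked to stay out.
    print(is_allowed(ROBOTS_TXT, "ExampleBot", private_url))
    # The same request under a different self-reported name is not covered
    # by the rule, because no credential is ever checked.
    print(is_allowed(ROBOTS_TXT, "Mozilla/5.0", private_url))
```

A login wall, by contrast, is enforced by the server no matter what the client claims about itself, which is the distinction I'm drawing.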
Scraping isn't hacking. I agree with the Third Circuit and the EFF: If the website owner makes a resource available to visitors without authentication, then accessing those resources isn't a crime, even if the website owner didn't intend for site visitors to use that specific method.
When sites put up challenges like Anubis or other measures to authenticate that the viewer isn't a robot, and scrapers then employ measures to thwart that authentication (via spoofing or other means), I think that can reasonably be considered a violation of the CFAA in spirit, especially since these mass scraping activities are getting attention for the damage they are causing to site operators (another factor in the CFAA, and one that would promote this to felony activity).
The fact is these laws are already on the books, so we may as well use them to shut down the objectively harmful activity AI scrapers are engaged in.
That same logic is how Aaron Swartz was cornered into suicide for scraping JSTOR. That reading of the CFAA is widely agreed to be a bad idea by a wide range of lawspeople, including SCOTUS, whose 2021 decision in Van Buren v. United States struck this interpretation off the books.