sudo

joined 2 years ago
[–] sudo@programming.dev 7 points 22 hours ago

There isn't much in the way of open source solutions. A simple captcha, however, would cost scrapers more to crack than Anubis.

But when it comes to "real" bot management solutions: the least invasive ones try to match the User-Agent and other headers against the TLS fingerprint and block the request if they don't match. More invasive solutions fingerprint your browser and even your GPU, then either block you or issue you a tracking cookie, which is often pinned to your IP and User-Agent. Both approaches require a large base of data to know what real and fake traffic actually looks like. Only large hosting providers like Cloudflare and Akamai have that data and can provide those sorts of solutions.
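
To make the header-vs-TLS check concrete, here is a minimal sketch of the idea, not how any real vendor does it. The fingerprint table, the placeholder hash values, and the function names are all hypothetical; a real system would build that table from a large volume of observed traffic.

```python
from typing import Optional

# Hypothetical table: browser family -> TLS fingerprints observed from genuine
# copies of that browser. The values are made-up placeholders, not real hashes.
KNOWN_FINGERPRINTS = {
    "firefox": {"placeholder-fx-1"},
    "chrome": {"placeholder-cr-1", "placeholder-cr-2"},
}


def browser_family(user_agent: str) -> Optional[str]:
    """Guess the browser family the User-Agent header claims to be."""
    ua = user_agent.lower()
    if "firefox/" in ua:
        return "firefox"
    if "chrome/" in ua:
        return "chrome"
    return None


def headers_match_tls(user_agent: str, tls_fingerprint: str) -> bool:
    """True if the TLS fingerprint is one we have seen from the claimed browser."""
    family = browser_family(user_agent)
    if family is None:
        # Assumption of this sketch: unknown clients are treated as a mismatch.
        return False
    return tls_fingerprint in KNOWN_FINGERPRINTS[family]
```

A scraper that sends a Chrome User-Agent from a plain HTTP library would present a fingerprint that is not in the Chrome set, so `headers_match_tls` returns `False` and the request can be blocked.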

[–] sudo@programming.dev 8 points 22 hours ago* (last edited 22 hours ago) (1 children)

The cost of solving PoW for Anubis is absolutely not a factor in any AI company's budget. Just the cost of answering one question is millions of times more expensive than running sha256sum for Anubis.
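
For a sense of how little work that is, here is a generic hashcash-style solver. It is a sketch of the general technique, not Anubis's actual code; the challenge string, the way the nonce is appended, and the "leading zero hex digits" difficulty rule are assumptions made for illustration.

```python
import hashlib
from itertools import count


def digest_for(challenge: str, nonce: int) -> str:
    """SHA-256 hex digest of the challenge with the nonce appended (assumed format)."""
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def solve(challenge: str, difficulty: int) -> int:
    """Return the first nonce whose digest starts with `difficulty` zero hex digits."""
    target = "0" * difficulty
    for nonce in count():
        if digest_for(challenge, nonce).startswith(target):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """The server side: a single hash to check the submitted nonce."""
    return digest_for(challenge, nonce).startswith("0" * difficulty)
```

The whole "cost" is a tight loop of SHA-256 calls, which is exactly the kind of work that is cheap to run anywhere, in a browser or out of one.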

Just in case you're being glib and mean the businesses will go under regardless of Anubis: most of these are coming from China. China absolutely will keep running these companies at a loss for the sake of strategic development.

[–] sudo@programming.dev 3 points 1 day ago

Places like Cloudflare and Akamai are already using machine learning algorithms to detect bot traffic at a network level. You need to use similar machine learning to evade them. And since most of these scrapers are for AI companies, I'd expect a lot of the scrapers to be LLM-generated.

[–] sudo@programming.dev 8 points 1 day ago

Here's one example of a proxy provider offering to pay developers to inject their proxies into their apps ("100% ethical proxies" because they signed a ToS). Another is BrightData, which proxies traffic through users of its free HolaVPN.

IoT devices and smart TVs are also obvious suspects.

[–] sudo@programming.dev 3 points 1 day ago

Or your TV or IoT devices. Residential proxies are extremely shady businesses.

[–] sudo@programming.dev 16 points 1 day ago* (last edited 1 day ago) (2 children)

The problem is primarily the resource drain on the server, and tarpitting tactics usually increase that burden by keeping connections open.

[–] sudo@programming.dev 33 points 1 day ago* (last edited 1 day ago) (6 children)

This is what I've kept saying about PoW being a shit bot management tactic. It's a flat tax across all users, real or fake. The fake users are making money by accessing your site and will just eat the added expense. You can raise the tax to cost more than what your data is worth to them, but that also affects your real users. Nothing about Anubis even attempts to differentiate between bots and real users.

If the bots take the time, they can set up a pipeline to solve Anubis tokens outside of the browser more efficiently than real users.
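
The "flat tax" point can be put as simple arithmetic. This sketch assumes a hashcash-style scheme where a solution needs a given number of leading zero hex digits; the function names and the hash-rate parameter are hypothetical, and no real-world hash rates are implied.

```python
def expected_hashes(difficulty: int) -> int:
    """Average attempts needed: each hex digit is zero with probability 1/16."""
    return 16 ** difficulty


def seconds_per_solve(difficulty: int, hashes_per_second: float) -> float:
    """Expected solve time for a client with the given (assumed) hash rate."""
    return expected_hashes(difficulty) / hashes_per_second
```

Nothing in `expected_hashes` depends on who the client is, so raising the difficulty by one multiplies the work by 16 for bots and real users alike. The only per-client variable is the hash rate, and a dedicated solver pipeline will generally have a higher one than a browser tab.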