AI as a tool for unleashing the devastating power of unregulated capitalism against its enemies: what a brilliant coup by the techbros and banksters. Not to mention the actual breakdown of community sites by AI scrapers, mostly using dirty botnet hacking techniques. This poison will eat itself one way or another. The question is: what can we do to make that happen sooner rather than later?
I don’t follow how LLMs destroy open source. For example, an LLM trained on the Linux kernel could probably be used to produce a closed-source kernel with a lot of human effort. Big tech companies already make a lot of money from Linux without ever contributing back. That doesn’t change the fact that we can all run Linux and not be trapped using proprietary garbage like Windows. Community contributions still create a rising tide that lifts all boats, and the fact that shitty big tech companies get their massive yachts lifted too doesn’t really change that.
I hate big tech companies and the AI grift as much as anyone else here, but don’t really follow the article’s point.
Not sure why I’m getting so many downvotes in this thread, aside from the fact that it may sound like I’m standing up for big tech, which I’m not. This article is more or less saying that open source is doomed as a result of big tech’s LLMs, and I’m saying it’s AI that is ultimately doomed and open source will be just fine. AI isn’t going to make it any easier to replicate the open source projects used to train it, for the same reason the whole enterprise is doomed to fail: AI is built on exaggerated claims. No, companies aren’t going to use AI to make their own Linux kernel not bound by GPL licensing terms. What’s going to happen is the commercial AI bubble is going to pop, perhaps leaving behind open source AI models that will be used for the modest value they bring for certain tasks.
Essentially all FOSS software is under an open-source license of some sort, which allows anyone to re-use the code or software as long as whatever re-uses it also remains free and open source, or at least carries a license at least as open and permissive as the original work's.
LLMs ignore that, hide the code behind a subscription, and use it to train models sold to soulless corporate entities who will never allow their code into the FOSS world, thus breaking the contract.
It's not even an implicit contract, it's explicit, and LLM companies are ignoring this and using their investment to squash any FOSS projects that want to challenge them in court on it.
Given that LLMs increase productivity in the aggregate by 15-20%, and sticking with Linux as an example, an LLM trained on the Linux kernel could be used to make a similar kernel with a ton of human effort. That company could then make a proprietary OS and sell it. Other companies then have the choice of using open source Linux, devoting a ton of their own resources to making a proprietary OS with a little help from AI, or licensing the other company’s proprietary OS. Everyone else can still use Linux and not care.
It’s possible I’m using the wrong example or overlooking something that would help me better understand this perspective.
There is absolutely no way you're using an LLM to rewrite the Linux kernel in any way. That's not what they do, and whatever it produces wouldn't be even a fraction as effective as the current kernel.
They're text prediction machines. That's it. Markov generators on steroids.
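To make the "Markov generators on steroids" analogy concrete, here is a toy next-word predictor. This is purely illustrative code of my own, not how production LLMs actually work (they use neural networks over tokens, not lookup tables), but it shows the shared idea: predict the next word from the words that came before.

```python
import random
from collections import defaultdict

def train(text, order=2):
    """Build a table mapping each word context to the words that follow it."""
    words = text.split()
    table = defaultdict(list)
    for i in range(len(words) - order):
        context = tuple(words[i:i + order])
        table[context].append(words[i + order])
    return table

def generate(table, length=20, seed=None):
    """Sample a word sequence by repeatedly predicting the next word."""
    rng = random.Random(seed)
    context = rng.choice(list(table.keys()))
    out = list(context)
    for _ in range(length):
        followers = table.get(tuple(out[-len(context):]))
        if not followers:  # dead end: this context never appeared in training
            break
        out.append(rng.choice(followers))
    return " ".join(out)

# Tiny made-up corpus just to exercise the functions.
corpus = ("the kernel schedules tasks and the kernel manages memory "
          "and the kernel handles interrupts")
table = train(corpus)
print(generate(table, length=8, seed=1))
```

A model like this can only ever recombine continuations it has seen; LLMs generalize far better, but the "text prediction machine" framing is the same.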
I'd also be curious about where that 15-20% productivity increase comes from in aggregate. That's an extremely misleading statistic. The truth is there are no consensus data on any productivity improvements with LLMs today in aggregate. Anything anyone has is made up. It's also not taking into account the additional bugs and issues caused by LLMs, which are significant, and also not a thing you want to have happening on every PR with kernel code, I promise.
Regardless of all of that, the companies with these LLMs are using free software to train their models to make money without making their models free and open source or providing a way for people to use it for free/open source projects, so this is a clear violation of every single FOSS license model I'm familiar with (most commonly used is the Apache one).
TL;DR: they are stealing code meant to be free and public with any derivative works, profiting off it, and then refusing to honor the license model of the code/project they stole.
This is illegal. The only reason we're not seeing a lot about it is that these FOSS projects generally have no money and are not going to sue and risk losing a substantial share of their negligible funds in court. That's it. Otherwise, what these companies are doing is very illegal: the sort of thing the legal team at any professional software development company warns you about the second you start using an OSS project in a for-profit codebase.
LLMs get away with it because $$$$$$$$$$$$$$$$$. That's it.
Edit: added link to security article with LLMs
I'd also be curious about where that 15-20% productivity increase comes from in aggregate.
This is from a Stanford study that is summarized here:
There are other studies with different conclusions, but this one aligns with my own experience. To your point about how AI won’t reproduce the Linux kernel, this study also points out that AI is significantly less effective, even going into the negative, with complex codebases, which is in agreement with what you said, since the Linux kernel certainly qualifies as a complex codebase.
they are stealing code meant to be free and public with any derivative works, profiting off it, and then refusing to honor the license model of the code/project they stole.
I agree big tech is using open source unethically, but how much different is this situation from the other ways big tech profits from open source without contributing back? Training proprietary LLMs on open source code is shitty, rent-seeking behavior, but not really a unique development, and certainly not something that undermines the core value of open source.
Training proprietary LLMs on open source code is shitty, rent-seeking behavior, but not really a unique development, and certainly not something that undermines the core value of open source.
Destroying "share alike" doesn't undermine the core value of open source? What IS the core value?
The LLMs are not distributing the GPL code; their weights are trained on it. You can’t just have Copilot pump out something that works like the Linux kernel or Blender, except with different code that isn’t subject to the GPL license. At best, the AI can learn from it and assist humans with developing a proprietary alternative. In that case, it’s not really that much better than having humans study a GPL codebase and make a proprietary alternative without AI. It’s still going to cost a lot of money to replicate the thing no matter what, so why not just save money and use the GPL code and contribute back? Also, it’s going to be hard to sell your proprietary alternative, because why wouldn’t people just use the FOSS version?
You can't "train" on code you haven't copied. That is kind of obvious, right? So did they have the right to copy and then reproduce the work without attribution?
Yeah, I guess this is a bit of a gray area. Under the GPL, you only have rights to code that was distributed to you. If GPL code has only been distributed to select people, none of whom distributed it to the general public, but GitHub still trained its models on the private repo, then that would technically violate the license. That's a more niche scenario, though, since the intent is normally public distribution.
As I understand it, Linux is under a license (GPL) that explicitly prevents closing off the reuse of its code. So it's all about the legal interpretation of "reuse", and so far the GPL has stood up against abuses. I suppose a company specifically targeting the source code with the intent of creating another OS might need to hire more lawyers than developers, with far from certain results. But who knows, in a world where billions of dollars flow freely as soon as "AI" is mentioned in a business plan.
There are upsides.
Software freedom is usually associated with FOSS (legal and public exchange), but there's also the scene (underground exchange based on personal connections).
The latter, of course, is not quite the heaven many people have learned to believe in, where everything is a public, verified project with all the source code visible and legal to use for every purpose.
But it also has advantages: it's a non-neutered culture with all the old anarchist and hacker substrate.
Any heaven offered is usually a trap anyway.
I wonder if the whole purpose of promotion of FOSS by big companies was, long-term, this. Finding some way to abuse openness and collect for free the resource that becomes digital oil in the next stage, but only for those who own the foundries - computing resources for ML, that is.
I don't see the point of romanticizing the scene as preserving some "pure" hacker ethos and conflating it with FOSS.
I'd rather use some free and open source software I can audit and trust rather than some pirated shit some company built.
FOSS creates sustainable value. Companies can build businesses around FOSS through services, support, hosting, and custom development. The scene creates nothing, they don't promote standards, don't think of interoperability and so on.
The internet and the very service you're using run on open source software. The people who build them have values, and I don't think at any point they thought of creating something for LLMs to train on. That's the dumbest conspiracy theory I've read in a long time, and it doesn't even make sense timeline-wise.
The original FOSS licenses were designed to restrict corporate exploitation, not enable it (even if some more permissive licenses make more sense in an enterprise context), but FOSS was promoted because it worked better and created value.
Would you say the same thing to an artist who freely shared their art and then saw their work copied in the output of some generative AI tool? That would be victim-blaming.
I don’t see the point of romanticizing the scene as preserving some “pure” hacker ethos and conflating it with FOSS.
No, but a bit more culturally mature in the sense of diversity of philosophy.
FOSS creates sustainable value. Companies can build businesses around FOSS through services, support, hosting, and custom development. The scene creates nothing, they don’t promote standards, don’t think of interoperability and so on.
So, if you just change the mood in these few sentences, you'll get what I'm trying to say.
The internet and the very service you’re using run on open source software. The people who build them have values, and I don’t think at any point they thought of creating something for LLMs to train on. That’s the dumbest conspiracy theory I’ve read in a long time, and it doesn’t even make sense timeline-wise.
You don't think? I might have encountered some people you'd expect to be good. They are really not that. Let's not conflate having values with having made contributions.
The original FOSS licenses were designed to restrict corporate exploitation, not enable it (even if you have some more permissive licenses that make more sense to be used in a enterprise context), but it was promoted because it worked better and created value.
Designed to do that at the expense of being constrained by law and public morality.
Would you say the same thing to an artist who freely shared their art and then saw their work copied in the output of some generative AI tool? That would be victim-blaming.
Life is complex.
a bit more culturally mature in the sense of diversity of philosophy.
More culturally mature in which ways? Very curious to read anything about it.
Let's not conflate having values with having made contributions.
Yes sure, but a contribution is already a statement in itself. I don't mind if the person is "not good". I'd be tempted to answer you by quoting you (without attempting to make it cryptic or cynical): life is indeed complex. There's like an infinity of viewpoints on why people contribute to foss, but I think if people do, it's because they're getting value out of it, and as a result, the whole community does. Most foss contributors mind that.
Now if you keep alluding to deeper points without actually making them, I don't see what I'd gain by continuing this conversation.
More culturally mature in which ways? Very curious to read anything about it.
I think I've already said that.
Say, if someone is a very good programmer, that doesn't mean they are better than a random drunk on any other subject.
But in FOSS they usually assume otherwise.
OK, it's not that the scene is more mature than FOSS; it's that the scene is normal and FOSS is less mature than average.
There’s like an infinity of viewpoints on why people contribute to foss, but I think if people do, it’s because they’re getting value out of it, and as a result, the whole community does. Most foss contributors mind that.
Yes, well, that objective-value orientation is a limitation too. I've been reading a good book recently, and I'm still under its impression (and probably will be for much longer). There are no good architects without bad architects, no good poetry without bad poetry, and no good contributions without bad contributions. And as for usefulness to the whole community: a good system serves each and every use, not just the majority use.
Similar to inclusiveness, except it's ideological and not racial/medical.
In FOSS even something like PulseAudio or SystemD is spread by pressure. No, it really doesn't matter which advantages they have in someone's system of values or in all systems of values possible to describe. Only the pressure matters while it shouldn't be there.
I wonder if the whole purpose of promotion of FOSS by big companies was, long-term, this. Finding some way to abuse openness and collect for free the resource that becomes digital oil in the next stage, but only for those who own the foundries - computing resources for ML, that is.
Even if it wasn't, it seems that they are perfectly fine with it now.
I mean, Apple and Microsoft essentially built their empires on the backs of open source developers who believed in a free internet. They took openly available code, altered it, and put a price tag on it. Software development, and by extension the internet, was stolen from the public by the likes of Steve Jobs and Bill Gates.
I think it was, almost since the mid-nineties. It's very notable how the whole initial visibility of FOSS came from universities and companies. Before that, FOSS projects were not particularly visible compared to the scene in its various forms. (I was born in 1996, so I'm talking about what I didn't see.)
GNU, for comparison, was considered that strange group of hackers somewhere out there.
I think it was when hackers became some sort of anarchist heroes in popular culture (from movies to the Star Wars EU and so on) that the culture became something that had to be dealt with. It doesn't even matter whether it really had such potential.
The threat was that personal computing and the scene combined are similar to the printing press, but multi-dimensional (software, music, other art, and the exchange of all of it), and the solution was to find the least potent branch: the one that only aimed for the exchange of gifts, public and legal, with no ideology attached (except for some quasi-leftist activism around the edges, but not too thick), and the one with the least decentralization, obscurity, and invisibility.
As a vaccine.
Can you express your point more succinctly? It got a bit muddy at the end. Are you saying they stole the least potent bit? And if you have the spoons, could you elaborate?
Not "stole", rather supported. Like authoritarian governments might support the least potent youth political group of those existing, as a spoiler.
There's pluralism of respect and values; one might notice that FOSS doesn't really have much of that. It's pretty authoritarian. People just think it's a meritocracy and shouldn't be otherwise.
The longer I live, the more I think today's tech is a dead end.