The bubble popping seems inevitable at this point. Before, the giants were funding this with their core business plus loans backed by that core business. Now they've stretched their credit so far that no one will lend to them anymore, and instead of cutting back on the building spree they're making cuts to their core business.
They're betting that their customers are so locked in that they won't leave despite the degradation in service. How deep Oracle's, AWS's, and Google's hooks go remains to be seen; people seem to tolerate a lot of enshittification, but there's got to be a tipping point. Once they reach it and the core business crashes, the rest of the dominoes will fall.
That is why they are peddling this so heavily to governments in the EU and USA: they know governments will take AI at face value instead of testing its efficacy.
Great timing, then, just as the States becomes a global pariah and everyone else on earth has to re-evaluate any business done with American-based firms. Nations are worried about massive instability and war; no one has the appetite to gamble big on unproven tech dreams.
Once these companies have to start charging what it really costs to maintain and run these huge models, the number of use cases will shrivel.
Models are becoming more optimized. I recently tried the small version of LFM2.5, for example, and it's ridiculously close in usefulness to Qwen3.5. Or RNJ-1.
As for maintenance, meaning keeping datasets current: that's somewhat expensive, but they were already assembling those datasets as a side effect of their main businesses.
So this is not what will kill them; their size will. These are very big companies with a lot of internal corruption and inefficiency pulling them down. As for the newer AI companies, I think some are going to survive because they are centered around specific products. Some will die, but I'd expect LiquidAI or Anthropic or the like to still be around some time after the crash.
The crash might coincide with a bubble burst, but notice that this family of technologies really is delivering results. Instead of using a bunch of specialized applications, people are asking LLMs and often getting good-enough answers. LLM agents can retrieve data from web services, perform operations, and assist in using tools.
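The "agent" part, for what it's worth, is mostly a small loop around the model. A minimal sketch, with `call_llm` and `fetch_url` as placeholder stubs rather than any particular product or API:

```python
# Minimal sketch of an LLM "agent" loop: the model either asks for a tool
# or gives a final answer. call_llm() and fetch_url() are stand-ins, not
# any specific model, product, or API.
import json

def call_llm(messages):
    # Placeholder: a real version would send `messages` to a local or hosted
    # model and return either a tool request or a final answer.
    return {"type": "answer", "content": "stubbed reply"}

def fetch_url(url):
    # Placeholder tool: retrieve data from a web service.
    return f"<contents of {url}>"

TOOLS = {"fetch_url": fetch_url}

def run_agent(task, max_steps=5):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_llm(messages)
        if reply["type"] == "tool_call":
            # Run the requested tool and feed the result back to the model.
            result = TOOLS[reply["name"]](**reply["arguments"])
            messages.append({"role": "tool", "content": json.dumps(result)})
        else:
            return reply["content"]
    return "gave up after max_steps"

print(run_agent("Summarize https://example.com"))
```

A real version swaps the stubs for an actual model endpoint and real tools; the loop itself stays about this small.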
You shouldn't look at the big ones in the cloud, but at what value local LLMs give you for the energy spent. Right now it's not that good, but honestly it's approaching good, and I don't feel like they've stopped getting better. Human time is still more expensive. The tools are there and are being improved, and humans are slowly gaining experience with them, which makes them more efficient at various tasks.
They are to all kinds of reference and knowledge tools what Google was to search.
And there's one simply amazing thing about these models: they are self-contained, even if some can use tools to access external sources. Our corporate overlords spent 20 years building a dependent, networked world, only to break it by popularizing a technology that nearly neuters that dependence. They probably thought they were reaping the crops of the web for themselves; instead they taught everyone that you don't have to eat at the diner, you can take the food home.
Only people who know very little about a field feel like AI "is good enough" for that field. Experts in a field will universally say that AI is shit in their field.
LLMs are the extreme example of "the dumb man's idea of a smart man." It sounds like it knows what it's talking about, so people ignorant of the subject don't know it's full of shit.
I agree with you, and I consider it similar to the "Hollywood effect": ask any expert to review typical depictions of their field in film and TV and they will mostly groan at inaccuracies that most people won't catch.
The problem is that if you compare the works that get it "right" to the ones that get it "wrong", there's no correlation between accuracy and popularity; the horribly wrong depictions get plenty of ratings regardless.
Now one might reasonably argue, "Sure, but that's purely fiction anyway; if it had real consequences, that would actually matter," except the same thing constantly happens in real-world situations.
My work colleague picked up his car from some mechanic chain after having it "fixed" and took us to lunch. There was this awful squeal as he started the car, and I asked why it was making that noise right after getting fixed. He said, "Oh, the staff told me that cars just sound like that after a repair until the parts break in," and that bullshit worked to get him to pay and walk out the door. I asked if I could take a quick look under his hood, and there was a flashlight wedged against a belt. He just laughed it off and said, "Hey, free flashlight, thanks for figuring that out," and a few months later he mentioned going back to the exact same place for something else.
A few days ago I went to a hardware store for something their site said they had, but under location it said "see associate." The first associate checked his device, didn't understand what the deal was, and said, "Oh, go over there and ask John, he knows all this stuff." OK, so I walked over to John, who took one glance and confidently said, "Oh yeah, that stuff is locked up in a cage in the back row; just go up to the cage and press the button to get someone to get it." I thought, OK, good, a guy who really knows his stuff and whom the other staff recognize for it. I rolled up to the cage, looked in, and realized: uh oh, this is not the type of stuff I'm looking for; he made a pretty amateur mistake. I pushed the button anyway. I showed my phone to the guy who came up and said that "John" told me it would be here, but I couldn't see it, and at the mention of "John" the guy clearly rolled his eyes; it was abundantly clear that John's "expertise" was a recurring annoyance for him. The actual answer is that they keep that stuff in the back, and the employees are all supposed to see the notation in their devices telling them this, but none of them seem to figure it out, and John just keeps sending people to his department instead.
This has also come up in the use of AI. I offered that my group could crank out a quick tool to handle something that could become a problem, and one of the people said, "In this new era, we don't need you for this quick tool; I just asked Claude and it made me this application." So I tested it and reported that (a) it didn't actually work: it produced output that looked right, but the actual tool wouldn't accept it because it didn't use the right syntax; and (b) if it had worked, it faked authentication and had a huge vulnerability. He just laughed it off and said, "Guess LLMs sometimes aren't perfect yet." No consequences for what could have been a disastrous tool, no real change in stance on using LLMs, and I'm pretty sure the audience found the report that it didn't work to be an annoying buzzkill and was rooting for the LLM to do all the work instead. People who need your expertise are desperate not to need your expertise anymore, are willing to believe anything that enables that, and will accept a lot of badness just to not be dependent on you.
AI produces what reads as a plausible narrative, and a plausible narrative can win even when the facts are against it. To be very charitable, a quick "usually correct" answer is indeed frequently "good enough" for a lot of purposes, and an LLM's speed at generating output can't be beat.
The problem is that there are a lot of these people who think LLMs are good enough, and many of them are in decision-making positions, so we're getting raked no matter what.
"A bad craftsman blames his tools" is what I'd answer to this.
I agree anyone using an LLM is a bad craftsman, because they're using a hammer to drive in a screw.
So all LLM use is using a tool for the wrong task, in your opinion? Then in the composite object of "LLM", what is the tool and what is the task?
The tool is "Language Learning Model" and the task is "Learn language and mimic human speech."
The task is not "Provide accurate information" or "write code" or "provide legal advice" or "Diagnose these symptoms" or "provide customer service" or "manage a database".
And a human's task, like that of any other lifeform, is to survive and reproduce. In pursuit of that goal we have learned many different complex strategies and methods to achieve it; the same goes for an LLM.
People's tasks are also not to provide accurate information, write code, provide legal advice, etc. If a person can earn a living, attract a mate, and raise children by lying, writing bad code, or giving shitty legal advice, they will. It takes external discipline to make sure agents don't follow those behaviors. For humans that discipline is provided by education, socialization, legal systems, etc. For LLMs that discipline is provided by fine-tuning, i.e. the lying models get downrated while the more truthful models get boosted.
They all "lie" because they don't actually know a damn thing. Everything an LLM outputs is just a guess of what a human might do.
An LLM has a great deal of declarative knowledge. E.g., it knows that the first president of the US was George Washington. Like humans, it has built up this knowledge through reinforcement: the more a fact is reinforced by external sources, the better you/it knows it. And like humans, when it reaches the edge of its knowledge base it will guess. If I ask someone who the 4th president of the US was, they may guess Monroe; that person isn't lying, it's just an area that hasn't been reinforced (studied) as much, so they are making their best guess. LLMs do the same. That doesn't mean that person cannot and will not ever know the 4th president; it just means they need more reinforcement / training / studying.
Humans as well as LLMs have declarative knowledge with a lot of grey area between knowing and not knowing. It's like a spectrum: on one end, things that have been reinforced many times by people with high authority ("what is your name" would probably be the furthest out on that side), and on the other end, things you've never heard or have only heard from untrustworthy sources. LLMs may not have the extra dimension of source trustworthiness that people do, but the humans training them will usually compensate with more repetition from trustworthy sources, e.g. they'll put 10 copies of the New York Times and only one of younewsnow.com or whatever in the training data.
An LLM has no knowledge.
My calculator does not "know" that 2+2=4; it runs the code it was programmed with, which tells it to output 4. It has no knowledge or understanding of what it's being asked to do; it just does what it was programmed to do.
An LLM is programmed to guess what a human would say if asked who the 4th president of the United States was. It runs the code that was developed with the training data to output the most likely response. Is it true? Doesn't matter. All that matters is that it sounds like something a human would say.
I trust the knowledge of my calculator more, because it was designed to give factual correct responses.
How do you know that George Washington was the first president? You weren't around in 1789; you have no experiential knowledge of it, only declarative knowledge. You read it in a book or heard it from a person enough times to repeat the fact when asked. You are guessing what your history teacher would have said in elementary school. Declarative knowledge is just memory and repetition, and an LLM can do memory and repetition.
Whether an LLM can determine truth depends on your definition of truth. If truth can only be obtained from experience and reasoning from first principles, then an LLM can't determine truth, but then a statement like "George Washington was the first president" can't be true either, because you can't derive it from experience or first principles; you weren't there, and no one alive was there. "George Washington was the first president" derives its validity and truth from the consensus of trustworthy people who say it's true. An LLM can derive this sort of truth by determining the consensus of its training data, assuming its training data comes from trustworthy sources or the more trustworthy sources are reinforced more.
Of course someone who doesn't believe "truth" exists thinks LLMs are just fine. You have to not believe things can be true in order to find their output acceptable.
Every week I see a new post of an LLM being blatantly wrong. LLMs said to add glue to pizza to make the cheese stick together.
"They have improved the models since then..." Last week the American military used "AI" and it targeted a school as a military structure. The models are full of shit; they just manually remove the blatantly incorrect shit whenever it makes the rounds, and there's always more blatantly incorrect shit to be found.
I never said I don't believe in truth; I said there are different definitions of truth and different kinds of truth. The study of this is called epistemology, and I'd encourage you to look into it to better understand truth. I believe in truth derived from experience and from reasoning from first principles: 2+2=4 is true, and "I had coffee this morning" is true. For things outside my direct experience, or that can't be reasoned out, I accept that truth can be derived from trustworthy external sources. Therefore "Washington was the first president" is true, because I've heard it many times from multiple trustworthy sources.
The question is whether you believe truth can be derived from external sources, or whether you're a Cartesian skeptic. It doesn't seem like the latter, because that sort of worldview is very limiting. So the question remains: how do you know that Washington was the first president? Or even better, how do you know that an LLM said to put glue on pizza? You never experienced it giving that answer; you got the idea from another source, maybe a picture that could easily have been edited. The truth of that idea can only be derived from the trustworthiness of that source.
LLMs can't know everything; again, they have good declarative knowledge, but they completely lack experiential knowledge and struggle with reasoning. Knowing not to put glue on pizza is knowledge gained from experience (glue tastes bad and is usually inedible) and reasoning (therefore adding glue to pizza will make it taste bad and be inedible).
Every day you probably also see a new post of humans being blatantly wrong; does that mean humans can't know things? No, it just means humans have a limited area of knowledge. Same with an LLM: it can know that Washington was the first president while not knowing not to put glue on pizza, so you have to be careful what you ask it, just like when you ask a human something outside their area of expertise.
No. They don't. They are good at making declarative statements.
That's not the same thing.
I fully agree that asking a random human for help with something is just as effective as asking an LLM to help with something.
If I need to know something (like who was the first president of the United States), I will not go outside and ask a random human; I will ask a trustworthy source.
If I need some code written, I won't have a random human do it; I will interview people to find someone capable.
If I need someone to interact with customers I won't let some random human come in and do it.
What's the difference between making correct declarative statements and having declarative knowledge? If I am able to accurately state every president of the US, wouldn't you say I have knowledge of the list of US presidents? The only way you can judge my declarative knowledge of something is by my ability to make accurate declarative statements; that's what a test is. If making accurate declarative statements is not the measure of declarative knowledge, then what is?
An LLM will give more accurate declarative statements on more questions than any human can; wouldn't that mean that an LLM has more declarative knowledge than any human? So isn't it more trustworthy for declarative statements than a random human? Wouldn't you trust an LLM's answer on who the 4th president was over a random human's?
Not if you include "I don't know" as an accurate statement or penalize the score for incorrect declarative statements.
I would absolutely trust the random human more, because they're not going to make shit up if they don't know. It will either be "I don't know" or "I would guess" to make it clear they aren't confident. The LLM will give me a declarative answer, but I have no fucking clue whether it's accurate or a "hallucination" (lie). I'll need to do what I should have done in the first place and ask a search engine to make sure.
I think you are underestimating how accurate LLMs are, because you probably don't use them much and only see their mistakes posted for memes. No one is going to post the 99 times an LLM gives the correct answer, but the one time it says to put glue on pizza it's going to go viral. So if your only view of LLM output is from posts, you're going to think it's way worse than it is.
Even if you mark it down for incorrect answers, it's still going to beat most people. An LLM can score in the 90th percentile on the SAT and around the 80th percentile on the LSAT. If you take into account that people taking those tests are more prepared for them than the general population, it's probably in the 99th percentile. It doesn't matter if you score wrong answers negatively when it's getting 95% of the answers correct and your average person is getting 50% correct.
People guess things too, and will confidently state things they don't completely know. If a person has a little bit of knowledge of a subject, they're likely to give confidently wrong answers thanks to the Dunning-Kruger effect. If you pick a random person, you're probably just as likely to get one of these people as you are to get a wrong answer from the LLM. So is it more useful to ask something that has a 95% chance of being correct and a 5% chance of being confidently wrong, or to ask a person who has a 50% chance of being correct (including those who guessed correctly), a 5% chance of being confidently wrong, and a 45% chance of saying "I don't know"?
If you're doubting my percentages on the accuracy of LLMs, I'd encourage you to test them yourself. See if you can stump one on declarative knowledge; it's harder than the posts make it seem.
And look at what is on my feed just this morning: https://lemmy.world/post/44099386
It's not just that LLMs are shit. It's that people trust them way too much and are shocked when the predictable happens.
And of course the AI bro goes for the "vibes" argument. You can't just state that as true without providing a source. Or did AI tell you it was true?
For example: fewer than 10% of tested AIs consistently properly answered that you need to drive to a car wash in order to wash your car: https://opper.ai/blog/car-wash-test
That's a question far below anything on the SAT or LSAT, and 90% of LLMs can't even get it right.
I've tried using LLMs. I don't use them for research, because why the fuck would I? Better, more efficient tools already exist for that. When I had something that a search engine couldn't help me with and that LLMs are apparently "good at", the LLM immediately proved itself worthless.
Here's the source; it's from OpenAI, but it is peer reviewed. Here's another source that uses it as a baseline to compare relative scores, and according to the tables, in 2023 it got a 610, putting it around the 75th percentile. That's just for math, which the OpenAI study showed it did about 5% worse on than its average, so ~80th percentile for a total score. Again, that's relative to students who are usually more prepared for the SAT than the general population, so it's still probably in the 90th percentile for the general population.
Again, the car wash example is not declarative knowledge; like the pizza glue, it's knowledge derived from experience and reasoning, which I've said LLMs aren't the best at. The fact that they had to construct a riddle to trip up the AI, if anything, shows how good it is. If it were as bad as you say, anyone could easily trip it up and get a wrong answer, and a study like that wouldn't be relevant. Seriously, if you think LLMs are so inaccurate, come up with your own test to stump one; it should be easy, given the way you talk about them.
"I want to take my car to the car wash, should I walk or drive" is not a riddle. It requests basic understanding of what is being asked.
It's "Large Language Model", and the point is in "Large" and that on really large datasets and well-selected attention dimensions set it's good at extrapolating language describing real world, thus extrapolating how real world events will be described. So the task is more of an oracle.
I agree that providing anything accurate is not the task. It's the opposite of the task, actually, all the usefulness of LLMs is in areas where you don't have a good enough model of the world, but need to make some assumptions.
Except for "diagnose these symptoms", with proper framework around it (only using it for flagging things, not for actually making decisions, things that have been discussed thousands of times) that's a valid task for them.
This sounds like someone who knows nothing about construction saying "building a house" is a valid task because they don't understand why using a hammer to drive in a screw would be incorrect or why it's even a problem. "The results are good enough right?"
You are writing pretentious nonsense, go someplace else.
A lot of fields don't require doctorate-level expertise to render effective business services. I've seen firsthand companies replace thousands of employees and shutter divisions because their AI counterpart had been doing the job quantitatively as well, and faster. Perfect is the enemy of good enough in most cases, as they say.
Lemmy is filled to the brim with LLM haters, but you're not only a minority, you're probably also closing doors on the future trajectory of tech in business.
"Think of the shareholder value of firing all these people!"
Also, I call bullshit. I've seen many cases of companies replacing their staff with AI, then a month later desperately trying to hire staff again, because the AI is good at *looking like* it can do the job but once in use turns out to be complete shit.
This is of course problematic, but not directly the fault of the technology itself. The entire system is problematic, but that's a digression from the effectiveness of the tech doing the job.
And the instances I'm talking about ran the AI stack and the employee teams in parallel for nearly a year. The replacement wasn't a "yeah, let's try this... whoops, that didn't work." It was a tried and tested approach, and the employees were made redundant (in the capability sense, not the firing sense, though the firing followed afterwards).
And I give it less than a year before the "oh shit, we really should have humans overseeing this" moment hits.
Perhaps, but one example: Commonwealth Bank (the largest Australian bank and in the top 10 worldwide AFAIK) said they were dismissing thousands of staff because of AI; it turned out they were just offshoring. The latter is seen positively, apparently, the former not so much.
I like local LLMs as much as the next person, but the issue is that they don't scale the way companies need them to.
As a personal assistant? Sure, I agree. They're useful at times. But as soon as you need multiple to run simultaneously you're gonna hit resource issues.
What Oracle and others were banking on is engineers and others running a lot of agents in parallel, composing different things together, or one input that multiple server-side agents take and execute numerous tasks on. That's something you can't run on an individual machine right now, and with the way they currently work I don't envision you will be able to anytime soon.
There are lightweight models as good as some heavier ones. It's a bit like Intel's advertised tick-tock process: heavy, memory-hungry models are the "tick", but there's also a "tock". Say, the light version of the "lfm2.5-thinking" model in the ollama repository seems almost as good as qwen3.5 to me, except it's very lightweight and lightning-fast by comparison.
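If you want to sanity-check that kind of claim yourself, it's easy to script: send the same prompt to each model on a local ollama server and compare the time and the output. The model names below are just the ones I mentioned above; substitute whatever you actually have pulled:

```python
# Rough comparison of local models: same prompt, wall-clock time, eyeball the answers.
# Assumes an ollama server on its default local port; the model names may not
# match what's in your library.
import json
import time
import urllib.request

def ask(model, prompt):
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    start = time.time()
    with urllib.request.urlopen(req) as resp:
        answer = json.loads(resp.read())["response"]
    return time.time() - start, answer

for model in ("lfm2.5-thinking", "qwen3.5"):
    seconds, answer = ask(model, "Explain DNS caching in two sentences.")
    print(f"{model}: {seconds:.1f}s\n{answer}\n")
```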
These things are being optimized. It's just that in the market capture phase nobody bothered.
That they are not being used correctly: yeah, absolutely. My idea of their proper use is some graph-based system, with each node processed by a specific LLM (or just a piece of logic) and a specific set of tools, actions, and choices available to each node. A bit like ComfyUI, but something saner than a zoom-based web UI; more like the macOS Automator application.
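To make that concrete, a toy version of the graph idea might look like the sketch below; the node names, model names, and the `call_llm` stub are all made up for illustration:

```python
# Sketch of the graph idea: each node is plain logic or an LLM call with its
# own restricted tool set, and edges decide what runs next.
from dataclasses import dataclass
from typing import Callable, Optional

def call_llm(model, prompt, tools):
    # Placeholder: route to whichever model backs this node, exposing only `tools`.
    return f"[{model} handled {prompt!r} using tools {sorted(tools)}]"

@dataclass
class Node:
    name: str
    run: Callable[[str], str]
    next: Optional[str] = None  # fixed edge; a real graph could branch on the output

def classify(text):
    return call_llm("small-local-model", f"Classify: {text}", tools=set())

def summarize(text):
    return call_llm("bigger-model", f"Summarize: {text}", tools={"fetch_url"})

GRAPH = {
    "classify": Node("classify", classify, next="summarize"),
    "summarize": Node("summarize", summarize, next=None),
}

def run(graph, start, payload):
    # Walk the graph, feeding each node's output to the next one.
    node = graph[start]
    while node is not None:
        payload = node.run(payload)
        node = graph[node.next] if node.next else None
    return payload

print(run(GRAPH, "classify", "long support ticket text"))
```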
Even if local models are good, the big companies are making local computing more expensive than cloud tokens by colluding with RAM and storage makers to restrict supply.
More expensive, but still autonomous, which is very precious.
Perfect time for some foreign company to eat Oracle's lunch.