this post was submitted on 07 Dec 2025

983 points (98.0% liked)

Technology

77084 readers

2763 users here now

This is a most excellent place for technology news and articles.

Our Rules

Follow the lemmy.world rules.
Only tech related news or articles.
Be excellent to each other!
Mod approved content bots can post up to 10 articles per day.
Threads asking for personal tech support may be deleted.
Politics threads may be removed.
No memes allowed as posts, OK to post as comments.
Only approved bots from the list below, this includes using AI responses and summaries. To ask if your bot can be added please contact a mod.
Check for duplicates before posting, duplicates may be removed
Accounts 7 days and younger will have their posts automatically removed.

Approved Bots

founded 2 years ago

MODERATORS

L3s@lemmy.world

enu@lemmy.world

technopagan@lemmy.world

L4s@lemmy.world

L3s@hackingne.ws

L4s@hackingne.ws

983

I Went All-In on AI. The MIT Study Is Right. (open.substack.com)

submitted 1 day ago* (last edited 1 day ago) by AutistoMephisto@lemmy.world to c/technology@lemmy.world

239 comments fedilink hide all child comments

Just want to clarify, this is not my Substack, I'm just sharing this because I found it insightful.

The author describes himself as a "fractional CTO"(no clue what that means, don't ask me) and advisor. His clients asked him how they could leverage AI. He decided to experience it for himself. From the author(emphasis mine):

I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me. I wanted to experience what my clients were considering—100% AI adoption. I needed to know firsthand why that 95% failure rate exists.

I got the product launched. It worked. I was proud of what I’d created. Then came the moment that validated every concern in that MIT study: I needed to make a small change and realized I wasn’t confident I could do it. My own product, built under my direction, and I’d lost confidence in my ability to modify it.

Now when clients ask me about AI adoption, I can tell them exactly what 100% looks like: it looks like failure. Not immediate failure—that’s the trap. Initial metrics look great. You ship faster. You feel productive. Then three months later, you realize nobody actually understands what you’ve built.

you are viewing a single comment's thread
view the rest of the comments

[–] Suffa@lemmy.wtf 32 points 1 day ago (1 children)

AI is really great for small apps. I've saved so many hours over weekends that would otherwise be spent coding a small thing I need a few times whereas now I can get an AI to spit it out for me.

But anything big and it's fucking stupid, it cannot track large projects at all.

[–] victorz@lemmy.world 10 points 1 day ago (4 children)

What kind of small things have you vibed out that you needed?

[–] utopiah@lemmy.world 10 points 18 hours ago (3 children)

FWIW that's a good question but IMHO the better question is :

What kind of small things have you vibed out that you needed that didn't actually exist or at least you couldn't find after a 5min search on open source forges like CodeBerg, Gitblab, Github, etc?

Because making something quick that kind of works is nice... but why even do so in the first place if it's already out there, maybe maintained but at least tested?

[–] victorz@lemmy.world 6 points 17 hours ago (1 children)

Since you put such emphasis on "better": I'd still like to have an answer to the one I posed.

Yours would be a reasonable follow-up question if we noticed that their vibed projects are utilities already available in the ecosystem. 👍

[–] utopiah@lemmy.world 1 points 17 hours ago (1 children)

Sure, you're right, I just worry (maybe needlessly) about people re-inventing the wheel because it's "easier" than searching without properly understand the cost of the entire process.

[–] victorz@lemmy.world 1 points 13 hours ago

Very valid!

[–] lepinkainen@lemmy.world 2 points 16 hours ago (1 children)

What if I can find it but it’s either shit or bloated for my needs?

[–] utopiah@lemmy.world -1 points 16 hours ago* (last edited 16 hours ago) (1 children)

Open an issue to explain why it's not enough for you? If you can make a PR for it that actually implements the things you need, do it?

My point to say everything is already out there and perfectly fits your need, only that a LOT is already out there. If all re-invent the wheel in our own corner it's basically impossible to learn from each other.

[–] lepinkainen@lemmy.world 4 points 16 hours ago (1 children)

These are the principles I follow:

https://indieweb.org/make_what_you_need

https://indieweb.org/use_what_you_make

I don’t have time to argue with FOSS creators to get my stuff in their projects, nor do I have the energy to maintain a personal fork of someone else’s work.

It’s much faster for me to start up Claude and code a very bespoke system just for my needs.

I don’t like web UIs nor do I want to run stuff in a Docker container. I just want a scriptable CLI application.

Like I just did a subtitle translation tool in 2-3 nights that produces much better quality than any of the ready made solutions I found on GitHub. One of which was an *arr stack web monstrosity and the other was a GUI application.

Neither did what I needed in the level of quality I want, so I made my own. One I can automate like I want and have running on my own server.

[–] mjr@infosec.pub 0 points 10 hours ago

So the claim is it's easier to Claudge a whole new app than to make a personal fork of one that works? Sounds unlikely.

[–] jj4211@lemmy.world 1 points 17 hours ago (1 children)

So if it can be vibe coded, it's pretty much certainly already a "thing", but with some awkwardness.

Maybe what you need is a combination of two utilities, maybe the interface is very awkward for your use case, maybe you have to make a tiny compromise because it doesn't quite match.

Maybe you want a little utility to do stuff with media. Now you could navigate your way through ffmpeg and mkvextract, which together handles what you want, with some scripting to keep you from having to remember the specific way to do things in the myriad of stuff those utilities do. An LLM could probably knock that script out for you quickly without having to delve too deeply into the documentation for the projects.

[–] utopiah@lemmy.world 1 points 17 hours ago (1 children)

If I understand correctly then this means mostly adapting the interface?

[–] jj4211@lemmy.world 1 points 16 hours ago (1 children)

It's certainly a use case that LLM has a decent shot at.

Of course, having said that I gave it a spin with Gemini 3 and it just hallucinated a bunch of crap that doesn't exist instead of properly identifying capable libraries or frontending media tools....

But in principle and upon occasion it can take care of little convenience utilities/functions like that. I continue to have no idea though why some people seem to claim to be able to 'vibe code' up anything of significance, even as I thought I was giving it an easy hit it completely screwed it up...

[–] PoliteDudeInTheMood@lemmy.ca 1 points 13 hours ago (1 children)

Having used both Gemini and Claude.... I use Gemini when I need to quickly find something I don't want to waste time searching for, or I need a recipe found and then modified to fit what I have on hand.

Everytime I used Gemini for coding has ended in failure. It constantly forgets things, forgets what version of a package you're using so it tells you to do something that is deprecated, it was hell. I had to hold its hand the entire time and talk to it like it's a stupid child.

Claude just works. I use Claude for so many things both chat and API. I didn't care for AI until I tried Claude. There's a whole whack of novels by a Russian author I like but they stopped translating the series. Claude vibe coded an app to read the Russian ebooks, translate them by chapter in a way that prevented context bleed. I can read any book in any language for about $2.50 in API tokens.

[–] jj4211@lemmy.world 1 points 12 hours ago (1 children)

I've been using Claude to mediocre results, so this time I used Gemini 3 because everyone in my company is screaming "this time it works, trust us bro". Claude has not been working so great for me for my day job either.

[–] PoliteDudeInTheMood@lemmy.ca 1 points 7 hours ago (1 children)

I think it really depends on the user and how you communicate with the AI. People are different, and we communicate differently. But if you're precise and you tell it what you want, and what your expected result should be it's pretty good at filling in the blanks.

I can pull really useful code out of Claude, but ask me to think up a prompt to feed into Gemini for video creation and they look like shit.

[–] jj4211@lemmy.world 1 points 5 hours ago* (last edited 5 hours ago)

The type of problem in my experience is the biggest source of different results

Ask for something that is consistent with very well trodden territory, and it has a good shot. However if you go off the beaten path, and it really can't credibly generate code, it generates anyway, making up function names, file paths, rest urls and attributes, and whatever else that would sound good and consistent with the prompt, but no connection to real stuff.

It's usually not that that it does the wrong thing because it "misunderstood", it is usually that it producea very appropriate looking code consistent with the request that does not have a link to reality, and there's no recognition of when it invented non existent thing.

If it's a fairly milquetoast web UI manipulating a SQL backend, it tends to chew through that more reasonably (though in various results that I've tried it screwed up a fundamental security principle, like once I saw it suggest a weird custom certificate validation and disable default validation while transmitting sensitive data before trying to meaningfully execute the custom valiidation.

[–] 6nk06@sh.itjust.works 11 points 21 hours ago (1 children)

I'm curious about that too since you can "create" most small applications with a few lines of Bash, pipes, and all the available tools on Linux.

[–] victorz@lemmy.world 4 points 20 hours ago (1 children)

Maybe they don't run Linux. 🤭

[–] mjr@infosec.pub 3 points 10 hours ago

Perverts!

[–] MrScottyTay@sh.itjust.works 23 points 1 day ago

Encryption, login systems and pricing algorithms. Just the small annoying things /s

[–] CrabAndBroom@lemmy.ml 2 points 17 hours ago (1 children)

Not OP but I made a little menu thing for launching VMs and a script for grabbing trailers for downloaded movies that reads the name of the folder, finds the trailer and uses yt-dlp to grab it, puts it in the folder and renames it.

[–] victorz@lemmy.world 2 points 13 hours ago (2 children)

Definitely sounds like a tiny shell script but yeah, I guess it's seconds with an agent rather than a few minutes with manual coding 👍

[–] CrabAndBroom@lemmy.ml 2 points 12 hours ago (1 children)

Yeah pretty much! TBH for the first one there are already things online that can do that, I just wanted to test how the AI would do so I gave it a simple thing, it worked well and so I kept using it. The second one I wasn't sure about because it's a bit copyright-y, but yeah like you say it was just quicker. I wouldn't use the AI for anything super important, but I figured it'd do for a quick little script that only needs to do one specific thing just for me.

[–] victorz@lemmy.world 1 points 2 hours ago

I would need to inspect every line of that shit before using it. I'd be too scared that it would delete my entire library, like that dude who got their entire drive erased by Google Antigravity...

[–] Nalivai@lemmy.world 1 points 12 hours ago (1 children)

It never seconds. The first three versions will don't do what you want (or not work at all), so you will end up arguing with this shit for significant amount of time without realising it

[–] victorz@lemmy.world 1 points 2 hours ago

😅