I have 3 questions, and I'm coming from a heavily AI-skeptic position, but am open:
Do you believe that providing all that context, describing the existing patterns, creating an implementation plan, etc., lets the AI write better code, and write it faster, than if you just did it yourself? To me it seems like you have to re-write your technical documentation in prose each time you want to do something. You say this is better than 'Do XYZ', but how much twiddling of your existing codebase do you need to do before an AI can understand the business context of it? I don't currently do development on an existing codebase, but every time I try to get these tools to do something fairly simple from scratch, they just flail. Maybe I'm just not spending the hours to build my AI-parsable functional spec. Every time I've tried, asking for something as simple as (paraphrased for brevity) "write an Asteroids clone using JavaScript and HTML5 Canvas" results in complete failure, even with multiple retries chasing errors. I wrote something like that a few years ago to learn JavaScript, and it took me a day-ish to get something that mostly worked.
Speaking of that context: are you running your models locally, or do you have some cloud service? If you hand your entire codebase to a third party as context, how much of your company's secret sauce have you disclosed? I'd imagine most sane companies are doing something to keep their models local, but we see regular news articles about ChatGPT training on user input and leaking sensitive data if you ask it nicely, and I can't imagine all the pro-AI CEOs are aware of the risks here.
How much pen-testing time are you spending on this code: error handling, edge cases, race conditions, data sanitization? An experienced dev understands these things innately, having fixed these kinds of issues in the past, and knows the anti-patterns and how to avoid them. In all seriousness, I think this is going to be the thing that actually kills AI vibe coding, but it won't be fast enough. There will be tons of new exploits in what used to be solidly safe places. Your new web front-end? It has a really simple SQL injection vulnerability. Your phone app? You can tell it your username is admin'joe@google.com and it'll let you order stuff for free since you're an admin.
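To make that concrete, here's the kind of hole I mean (a made-up sketch; `db.query` just stands in for whatever database driver the generated code happens to use, with node-postgres-style placeholders in the fixed version):

```javascript
// Vulnerable: user input is concatenated straight into the SQL string.
// A username like  admin'--  comments out the rest of the WHERE clause.
async function findUserUnsafe(db, username) {
  return db.query(`SELECT * FROM users WHERE name = '${username}'`);
}

// Safer: a parameterized query keeps the input as data, never as SQL.
async function findUser(db, username) {
  return db.query("SELECT * FROM users WHERE name = $1", [username]);
}
```

The unsafe version reads fine in a quick review, which is exactly why I worry about it.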
I see a place for AI-generated code: small one-off functions that sit somewhere between simple and complex. "Hey Claude, write a function to take a string and split it at the end of every sentence containing an uppercase A." I had to write weird functions like that constantly as a sysadmin, and transforming data seems like a thing an AI could help me accelerate. I just don't see that working on a larger scale, though, or trusting an AI enough to allow it to integrate a new function like that into an existing codebase.
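For reference, this is roughly what I'd expect back for that prompt (my own hand-written sketch, with deliberately naive sentence detection):

```javascript
// Split a string after every sentence that contains an uppercase "A".
// Sentence detection is naive: a sentence ends at ".", "!" or "?", and
// any text after the last terminator is treated as a final sentence.
function splitAfterSentencesWithA(text) {
  const sentences = text.match(/[^.!?]+[.!?]+|[^.!?]+$/g) || [];
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    current += sentence;
    if (sentence.includes("A")) {
      // This sentence contains an uppercase A, so cut here.
      chunks.push(current.trim());
      current = "";
    }
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

// splitAfterSentencesWithA("Alice went home. Bob stayed. And then it rained.")
// -> ["Alice went home.", "Bob stayed. And then it rained."]
```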
Thank you for reading my comment. I'm on the train headed to work and I'll try to answer completely. I love talking about this stuff.
For my work, absolutely. My work is a lot of tickets that were set up from multiple stories and multiple epics. It would be like asking me if I'm really framing a house faster with a nail gun and compressor. If I were just hanging a picture or a few pictures in the hallway, it's probably faster to use a hammer than to set up the compressor and nail gun, plus cleanup.
However, a lot of that documentation already exists by the time it gets to me. All of the Design Documents and Product Requirements Documents have already been written, discussed, and approved by our architecture team and team leads. Imagine if you already had this documentation for the Asteroids game; how much better do you think your LLM would do? Maybe this is the benefit of using LLMs for development at an established company. Btw, a lot of those documents were also created with the assistance of AI by the Product Team, Architects, and Principal/Staff/Leads anyway.
With the help of our existing documents and codebase(s), I feel I don't have any issues with the model knowing what we're doing. I do have to set up my own context for how I want it to be done. To me this is like explaining to a Junior Engineer what I need them to help me with. If you're familiar with "Know when to Direct, when to Delegate, or when to Develop", I would say it lands in between Direct and Delegate. I have markdown files with my rules and guidelines and provide those as context. I use Augment Code, which is pretty good with codebase context.
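For what it's worth, a stripped-down, made-up excerpt of one of those rules/guidelines files looks something like this (the real ones are longer and project-specific):

```markdown
# Working agreement for AI-assisted changes (excerpt)
- Follow the existing controller/service/repository layering; do not invent new layers.
- Match the surrounding code style; do not introduce new formatting conventions.
- Every new function needs unit tests; anything touching the database needs an integration test.
- Do not add new dependencies without calling them out in the plan first.
- Always produce a step-by-step plan and wait for approval before writing code.
```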
I would try: "Let's plan out the steps needed to write an Asteroids game using JavaScript and HTML5. Identify and explain each step of the development plan. The game must build with no errors, be playable, and pass all tests. Do not write code until our plan is approved." Then once it comes back with the initial steps, I would guide it further if needed. Finally I would approve the plan and tell it to execute while tracking its steps (Augment Code uses a task log).
We are required to use the frontier models that my employer has contracts with and are forbidden from using local models. In our enterprise contracts we have negotiated for no training on our data. I imagine we pay for that. I'm not involved in that level of interaction on the accounts.
We have other teams that handle a lot of these tasks. Those teams are also using AI tools to get the job done. In addition, we have static analysis tools on our repo, like CodeRabbit and another one I can't remember the name of that looks specifically for security concerns. They comment on the PR directly, and the merge is blocked until the findings are handled. Code coverage has to be at 85% or it blocks the merge, and we have a full QA department of Analysts and SDETs. On top of that, we still have human approvals required (2 devs + Sr+). All of these people are still using AI tools to help them at each step.
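As a purely illustrative example of the coverage gate (our pipeline uses its own tooling, but the idea is the same), an 85% threshold in a Jest config looks roughly like this; the build fails, and therefore the merge is blocked, if coverage drops below it:

```javascript
// jest.config.js (illustrative only)
module.exports = {
  collectCoverage: true,
  coverageThreshold: {
    global: {
      branches: 85,
      functions: 85,
      lines: 85,
      statements: 85,
    },
  },
};
```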
I hope that answers your questions and gives you some insight into how I've found success in my experience with it. I will say that on my personal projects I don't go this far with process and I don't experience the same AI output that I do at work.
Thanks for your reply, and I can still see how it might work.
I'm curious if you have any resources that do some end-to-end examples. This is where I struggle. If I have an atomic piece of code I need, I can maybe get it started with an LLM and finish it by hand, but anything larger seems to just always fail. So far the best video I found for a start-to-finish demo is this: https://www.youtube.com/watch?v=8AWEPx5cHWQ
He spends plenty of time describing the tools and how to use them, but when we get to the actual work, we spend 20 minutes telling the LLM that it's doing stuff wrong. There's eventually a prototype, but to get there he had to alternate between 'I still can't jump' and 'here's the new error.' He eventually modified code himself, so even getting a 'Mario clone' running required an actual developer, and the final result was underwhelming at best.
For me, a 'game' is a tiny product that could be a viable unit. It doesn't need to talk to other services; it just needs to react to user input. I want to see a speed-run of someone using LLMs to make a game that is playable. It doesn't need to be "fun", but the video above only got to the 'player can jump and gets game over if hitting an enemy' stage. How much extra effort would it take to make the background not flat blue? Is there a win condition? How do you refactor it so the level is not hard-coded? Multiple enemy types? A fireball that bounces? Power-ups? And does doing any of those break the jump functionality again? How much time do I have to spend telling the LLM that the fireball still goes through the floor and doesn't kill an enemy when it hits them?
I could imagine that if the LLM were handed a well-described design document and technical spec, it could do better, but I have yet to see that demonstrated. Given what it produces for people publishing tutorials online, I would never let it handle anything business-critical.
The video is an hour long and spends about 20 minutes in the middle actually working on the project. I probably couldn't do better, since I've mostly forgotten my JavaScript and HTML Canvas. If kaboom.js were my focus, though, I imagine I could knock out what he did in well under 20 minutes and have a better-architected design that handled the questions above.
Luckily, I haven't yet been mandated to embed AI into my pseudo-developer role, but they are asking.