this post was submitted on 29 Apr 2026
705 points (98.9% liked)
Microblog Memes
you are viewing a single comment's thread

Not to defend AI, but this is really foolish thinking. Configuration to make it useful proves it is not useful?
Because imagine spending billions on training it specifically to produce useful answers and then not even trusting it to not randomly start answering with something completely unrelated.
What matters is the outcome, not how it is achieved.
And is the outcome good? Eh, sometimes.
If that were true, then anyone with any sense would have recognized a long time ago that deterministically incorrect is a lot more valuable than nondeterministically correct occasionally, and given up on all this language model nonsense.
A deterministic system that produces wrong output can be fixed. A nondeterministic system that produces wrong output cannot be fixed in any way that can be demonstrated conclusively.
Nondeterministic software is basically worthless in any case where accuracy or reliability are required.
"Worthless" is going a bit far.
I play Go, and AI tools have allowed computers to leave humans completely in the dust, while more deterministic approaches had gotten nowhere close to top level play.
Go has an extremely large number of variations which overwhelms the straightforward, traditional approach. Machine learning allows the computer to get better through experience, by having a bunch of games in its training data that it can pull from to evaluate possible board positions. It also benefits from the fact that, unlike language, every game has a definitive win-lose outcome. This allows AI to get stronger by playing games against itself, even starting from purely random moves.
"So what, I don't play Go," sure, but it's the principle. Given a sufficiently large "probability space" and an objective "win condition" to evaluate itself against, ML algorithms can and do outperform traditional, deterministic algorithms.
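To make the principle concrete with a toy (nowhere near Go's scale: single-pile Nim, and plain Monte Carlo rollouts instead of a trained network), an objective win-lose outcome is all you need to evaluate moves by having the computer play random games against itself and count wins. This is a hedged sketch, not any real Go engine:

```python
import random

def playout(stones, rng):
    """Play out single-pile Nim (take 1-3 stones per turn; taking the
    last stone wins) with BOTH players moving at random. Returns True
    if the player to move at the start of the playout wins."""
    turn = 0  # 0 = the player whose turn it is at the start
    winner_is_starter = False
    while stones > 0:
        take = rng.randint(1, min(3, stones))
        stones -= take
        if stones == 0:
            winner_is_starter = (turn == 0)  # whoever took the last stone wins
        turn ^= 1
    return winner_is_starter

def best_move(stones, playouts=2000, rng=None):
    """Estimate each legal move's win rate by random self-play and
    return (move, estimated_win_rate) for the best one."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    scores = {}
    for take in range(1, min(3, stones) + 1):
        wins = 0
        for _ in range(playouts):
            remaining = stones - take
            if remaining == 0:
                wins += 1  # we took the last stone: immediate win
            elif not playout(remaining, rng):
                wins += 1  # the opponent (now to move) lost the playout
        scores[take] = wins / playouts
    move = max(scores, key=scores.get)
    return move, scores[move]
```

From a pile of 5, the rollouts find that taking 1 (leaving a multiple of 4) is the strongest move, with no game knowledge beyond "who took the last stone". Replace random rollouts with a learned policy and value network and you have the broad shape of how the Go engines self-improve.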
The fact that people are trying to put AI on your toaster and shit doesn't make it completely worthless. But it is massively over hyped and not applicable to most of the applications people are trying to shove it into.
I think using chess and Go as analogies rather misses the point. They're not trying to get a system to automate playing a game, not really.
They are trying to get it to make intelligent decisions about complex real-world problems. Go has a very simple set of rules that are always true, never change, and are always in play. None of the complexities of real life are replicated. So its ability to play Go or chess, or even a more complicated game like a first-person shooter, is not a demonstration of its ability in the domains AI is being advertised for.
I think a far better test of whether a system is actually useful is what it does if it is given no input at all. Does it just sit there forever, or does it actually start doing things? Currently, every single AI system in existence would just stay idle in that scenario.
> i open a repl
> i type in nothing
> nothing happens
> shocked_pikachu.jpg
> i open a window
> i click nothing
> nothing happens
> shocked_pikachu.png
> i buy a computer
> i do not turn it on
> it does nothing
> shocked_pikachu.jxl
I am absolutely not claiming that AI is useful "in the domains in which it's being advertised for." I'm saying that it's not entirely useless. Despite being overhyped, there are a handful of useful applications.
What? That's not true at all. My toaster doesn't go out and do things on its own initiative but it's still very useful for making toast when I tell it to.
Maybe instead of usefulness, you mean like consciousness or actual intelligence? But that's pure hype and bullshit. Anyone claiming that a word generator is conscious is either trying to scam you or is being scammed.
Just because someone says (as they do), "This oil will allow you to unlock the hidden power of the 90% of your brain you don't use, thanks to our new quantum formula, now only $300 a bottle" that doesn't mean that quantum mechanics isn't also a real thing that has actual applications. Machine learning is the same way. It attracts all the snake oil salesmen who spout complete and utter bullshit about it, but it is a real technology that has legitimate uses despite all that.
Non-deterministic software is fine and we've been using it for ages. It's usable when:
That rules out several applications of current LLMs, but it rules in several others.
If I have to verify the output of an AI, then it's only useful if I can do the verification in 30 seconds while the work itself would somehow have taken me hours. I can't think of many scenarios in which verification is fast but the work itself is slow.
This can be the case for coding. A good example is when the change is simple but involves a library you're unfamiliar with. You can set it off and not have to read any docs, and it will be easy to check if it got the API right.
Elsewhere I gave the example of copyediting. It's a lot quicker to check the output than to refine it yourself.
Easy-to-verify tasks are everywhere, I think. Not at the scale of seconds versus hours, but seconds versus minutes.
Comment would seem to make a lotta sense so perhaps the VC money was the wildcard…
Inflection point may have hit for some though? It’s been out just long enough and has been good just long enough (kinda garbage before December 2025) that people we all respect are on board.
Head Linux dude Linus
Wolfram Alpha’s founder Stephen Wolfram
Many others now but big caveat is these folks presumably Do It Right unlike, have to guess, a huge majority of users. Plenty will experience skill atrophy - dangerous for society at large.
Technically all LLMs are somewhat non-deterministic, because token fuzzing (sampling at a nonzero temperature) is basically required to prevent mode collapse, though this is tuned so that you should get the same general "answer" even if it isn't verbatim every run.
this is the most damning fucking part of it. Oh, it's kinda ok sometimes. Fucking hell.
It could be a shitload better, but that would mean sourcing accurate data instead of scraping everything off GitHub and Stack Overflow and letting it fuckin rip, bud. This fucking problem has existed since the LITERAL dawn of computing: garbage in, garbage out.
https://en.wikipedia.org/wiki/Garbage_in,_garbage_out#History
Pray tell, Mr Altman, if you were to feed the AI incorrect information, will the AI generate correct results?
There is no magically reliable source of data that will make everything in an LLM consistently accurate, because their underlying design requires some randomization to reflect human conversation.
Dedicated models for specific purposes, where terminology is defined and the models are designed to be deterministic, would be a lot better for actual use. We have had those models for years, just without the pretending-to-be-conversational crap, and they were constantly improving and actually useful.
That's just false. Although the first step of creating an LLM from scratch is to initialize the weights from a Gaussian distribution, which is randomized, those matrices get overwritten multiple times throughout pre-training and fine-tuning, as the parametric weights are finely adjusted based on the training data.
During inferencing, tokens pass through various layers along specific embedded vectors weighted for relevance. It's not random at all. It's non-deterministic, but that's not the same thing as random.
If the training data all came from JSTOR or DevDocs or even Wikipedia, it's going to make much more accurate inferences than if it was trained on Reddit, Quora, and Yahoo Answers.
I'm not defending AI here, but lets keep our criticisms factual.
Except if you make the output token temperature too cold, it has a higher tendency to get stuck in loops and the like. A little bit of actual randomness is important.
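Here's a minimal sketch of what that temperature knob does, assuming plain softmax sampling over raw logits (illustrative only, not any particular vendor's implementation):

```python
import math, random

def sample_token(logits, temperature=1.0, rng=None):
    """Sample a token index from raw logits.

    temperature -> 0 collapses to greedy argmax (fully deterministic,
    and prone to repetition loops); higher temperatures flatten the
    distribution, trading consistency for variety."""
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])  # greedy
    rng = rng or random
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    r = rng.random()
    acc = 0.0
    for i, e in enumerate(exps):
        acc += e / total
        if r < acc:
            return i
    return len(logits) - 1  # guard against floating-point round-off
```

At temperature 0 the same logits always yield the same token, which is exactly the loop-prone determinism described above; any nonzero temperature makes repeated runs diverge.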
That's just adding noise, it's not unique to AI. It's also used in audio and visual design, and even cryptography.
It's not unique to AI, no, but no one said it was. My point is that the noise is important to the functioning of the AI - and makes it even less deterministic, which also makes it poorly suited to automation in critical systems.
If generating the outcome burns the resources needed to power a small town, then even if the outcome is good, it's still bad
It's about 10x more power intensive than a Google search. It's not trivial, but it doesn't take megawatts to power a single person's query.
Ok, but then explain why I would care about a technology that's 10 times less efficient than an existing, 25-year-old technology.
I'm not really here to tell you why you should care - you're free to care about whatever you want to care about. But to explain why other people might care, it's because it can do things a Google search can't do. Google search can't copy-edit your CV or cover letter. Google search can't synthesise a bunch of different Stackoverflow answers and fit them to the exact scenario you're talking about. LLMs can and do.
And those are two examples where the cost of an error is low: if your CV comes out with made up shit in it, you can just read through it and check (but you may not have the ability to re-write it better). If the code example doesn't work, you're going to run it and check anyway. (It may have a subtle bug, but so can Stackoverflow answers, and that never stopped people from using them)
If you don't have the ability to write it better what would make you think someone would have the ability to recognize and fix the errors in their CV?
What does an error in a CV look like, to you?
Could be anything. The point is: if I don't have the skill to write my own CV well, then I also don't have the skill to determine if an AI-generated CV is written well.
So I don't think that is true. It's possible to recognise that a book is well written even if you can't write that well.
I think the problems from LLM use in that area are more about hallucination, if it inserts a false job or something, which is easily checked. OTOH if it just edits it and it looks no better to your eyes, you're probably ok to go with your initial version.
I think you can certainly enjoy a book, or think it's subjectively a good piece of art, without being a skilled writer, of course. But you wouldn't be a very good EDITOR without understanding anything about writing, which I think is a more accurate picture of what we're describing doing here.
Enjoying something or forming an opinion on it as a piece of art is a different activity and skillset from knowing whether it's in a fit state to be published and, if it's not, being able to recognize and fix the errors.
But I think this is exactly the distinction I'm getting at...
No, controlling the behavior by providing a hand-tuned list of no-nos shows that we have no idea how to make an AI stay on task. AI accuracy drops dramatically as context size increases, and every word in the system prompt pollutes that context.
It's also concerning because prompt hacking is an inherently reactive measure. It's not fixing the fundamental focus problem in the architecture, leaving any number of other potential behavioral quirks wide open.
Effectively what I’m trying to say is that this is not a scalable way to guide an LLM into the correct behavior, and it will backfire if companies keep relying on it.
because the system prompt is not configuration, it's input. it has the same priority as whatever the user types in, and it takes up valuable space in the context window.
to add onto what pennomi is saying, this also shows that openai doesn't understand language models. the only actual functionality the llm has is still "given the previous text, what is the most likely character/phoneme/token?", so rather than (to use an analogy) changing the font setting in their word document, they add a sentence in the middle of the document that says "everything from here is in comic sans".
but it's not surprising that they'd do this. if we've learned anything from the claude frontend leak earlier, where their "sentiment analysis" tool for input text was a regex (you literally have a language model! that's like the only thing it's good at!), i think it's pretty clear most of the big players in the llm space have gotten high on their own supply and can't be expected to actually reason about the operations the system is actually performing.
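"given the previous text, what is the most likely token" is the whole trick, and you can see the shape of it in a toy word-level bigram model (counts instead of a neural net; purely illustrative):

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """count word-level bigrams: model[w] maps each word that ever
    followed w to how many times it did so."""
    words = text.split()
    model = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += 1
    return model

def most_likely_next(model, word):
    """the core llm operation in miniature: given the previous
    token, return the most likely next token (or None)."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]
```

everything in the context, "system prompt" included, is just more previous text fed into this one function; there is no separate control channel.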
But because the system prompt is part of the context, it figures into the estimation of the most likely next token. So in general putting this kind of stuff in the system prompt does change how well it works.
of course. but the larger the context grows, the less it affects the output. there are some ways around this, like moving the system prompt to the end of the context before every answer, but the very existence of the system prompt is a hack to begin with. what's really needed is a functional rules-based pre- and post-filtering system for a chatbot to be safe. personally i think the chatbot "style" has played out its role and is living on as a gimmick. actual tooling built with language models is stuff like LSP servers and accessibility software, and that needs rigid configuration.
I tend to agree.
The system prompt is configuration, and configuration is input. Semantics don't actually challenge my point.
configuration is things like temperature, output cutoff, and tool use. those are out-of-band. the system prompt, being in-band, can not be configuration. it's like calling an http request configuration for the response.
Hardcoding forbidden topics shouldn't be necessary if AI were indeed almost at the same level as a human academic. At most, put in "avoid where possible talking about things that might disturb the other person" and similar rules of conduct that humans learn when growing up.
The issue is how many guardrails are required just to keep the output from being completely useless. This suggests that at its core, the model is mostly worthless, and provides not-insane output only under extreme containment. This does not mean that the resulting output is reliable or trustworthy, only that it is not obviously insane.
If you need to define everything that isn't relevant to a conversation with a list of keywords, generalize it to all conversations, and then carve out exceptions where a keyword is relevant after all, then you're fighting a losing battle, you're gonna have an ass product, and you're certainly not building anything with the potential for consciousness to emerge, as they love to claim with all this "AGI" talk.
There is that expectation because companies are touting these as its capabilities. Glad I could clear that up
Corporations overmarketed a product?? Wow, no way! Must be the first time ever.
You asked the question man
Annnnnnnnnd another reply has pointed out the fact that this post is satire and not actually a real thing.
Ok?