And an LLM that you could run locally from a flash drive will do most of what it can do.
Probably not a flash drive but you can get decent mileage out of 7b models that run on any old laptop for tasks like text generation, shortening or summarizing.
What do you use your usb drive llm for?
Help me out here. What designates the “response” type? Someone asking it to make a picture? Write a 20 page paper? Code a small app?
Response type is decided by ChatGPT's new routing function based on your input. So yeah, asking it to "think long and hard", which I have seen people advocating recently to get better results, will trigger the thinking model and waste more resources.
So instead of just saying "thank you" I now have to say "think long and hard about how much this means to me"?
The team measured GPT-5’s power consumption by combining two key factors: how long the model took to respond to a given request, and the estimated average power draw of the hardware [they believe is] running it.
Fucking Doc Brown could power a goddamn time machine with this many jiggawatts, fuck I hate being stuck in this timeline.
I have an extreme dislike for OpenAI, Altman, and people like him, but the reasoning behind this article is just stuff some guy has pulled from his backside. There are no facts here, it's just "I believe XYZ" with nothing to back it up.
We don't need to make up nonsense about the LLM bubble. There's plenty of valid enough criticisms as is.
By circulating a dumb figure like this, all you're doing is granting OpenAI the power to come out and say "actually, it only uses X amount of power. We're so great!", where X is a figure that on its own would seem bad, but compared to this inflated figure sounds great. Don't hand these shitty companies a marketing win.
that's a lot. remember to add "-noai" to your google searches.
Or just use any other better search like Bing or duckduckgo. googol sucks and was never any good. Quit pushing ignorant garbage.
duckduckgo yes, but ... bing?
ddg is bing
Bing is for porn.
I'm just going to ignore the AI recommendations, let them burn money.
i don't judge you for that. honestly it matters fuck all at this point
I don't care how rough the estimate is, LLMs are using insane amounts of power, and the message I'm getting here is that the newest incarnation uses even more.
BTW a lot of it seems to be just inefficient coding as Deepseek has shown.
And water usage which will also increase as fires increase and people have trouble getting access to clean water
https://techhq.com/news/ai-water-footprint-suggests-that-large-language-models-are-thirsty/
It would only take one regulation to fix that:
Datacenters that use liquid cooling must use closed loop systems.
The reason they don't, and why they set up in the desert, is that water is incredibly cheap and the energy to cool a closed loop system is expensive. So they use evaporative open loop systems.
Unfortunately, I wonder whether it's more expensive to set up a closed loop system or to buy lawmakers who will vote against bills requiring one. It's a tale as old as time.
Politicians are cheap
Yeah sorry forgot my /s there
BTW a lot of it seems to be just inefficient coding as Deepseek has shown.
Kind of? Inefficient coding is definitely a part of it. But a large part is also just the iterative nature of how these algorithms operate. We might be able to improve that via code optimization a little bit. But without radically changing how these engines operate it won't make a big difference.
The scope of the data being used and trained on is probably a bigger issue. Which is why there's been a push by some to move from LLMs to SLMs. We don't need the model to be cluttered with information on geology, ancient history, cooking, software development, sports trivia, etc if it's only going to be used for looking up stuff on music and musicians.
But either way, there's a big 'diminishing returns' factor to this right now that isn't being appreciated. Typical human nature: give me that tiny boost in performance regardless of the cost, because I don't have to deal with it. It's the same short-sighted shit that got us into this looming environmental crisis.
Coordinated SLM governors that can redirect queries to the appropriate SLM seems like a good solution.
That basically just sounds like Mixture of Experts
Basically, but with MCP and SLMs interacting rather than a singular model, with the coordinator model only doing the work to figure out who to field the question to, and then continuously provide context to other SLMs in the case of more complex queries
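A rough sketch of the coordinator idea described above, assuming the simplest possible routing rule (keyword overlap). Everything here is hypothetical: the `Specialist` class, the `route` function, the keyword sets and the stub answers are made up for illustration, and a real coordinator would presumably be a small model itself rather than a keyword match.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Specialist:
    """A stand-in for one small language model (hypothetical)."""
    name: str
    keywords: set[str]
    answer: Callable[[str], str]


def route(query: str, specialists: list[Specialist], fallback: Specialist) -> Specialist:
    """Pick the specialist whose keywords overlap the query the most.

    Falls back to a general model when nothing matches.
    """
    words = set(re.findall(r"[a-z]+", query.lower()))
    best = max(specialists, key=lambda s: len(words & s.keywords), default=fallback)
    if not words & best.keywords:
        return fallback
    return best


# Stub "models" that only echo which one handled the query.
MUSIC = Specialist("music", {"song", "album", "band"}, lambda q: f"[music] {q}")
CODE = Specialist("code", {"python", "bug", "compile"}, lambda q: f"[code] {q}")
GENERAL = Specialist("general", set(), lambda q: f"[general] {q}")

SPECIALISTS = [MUSIC, CODE]

if __name__ == "__main__":
    for q in ["Who wrote this song?", "How do I fix this Python bug?", "What is the weather?"]:
        chosen = route(q, SPECIALISTS, GENERAL)
        print(chosen.answer(q))
```

The point of the design is that only the cheap routing step sees every query; each specialist only runs for the queries sent to it.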
My guess would be that using a desktop computer to make the queries and read the results consumes more power than the LLM, at least in the case of quickly answering models.
The expensive part is training a model but usage is most likely not sold at a loss, so it can't use an unreasonable amount of energy.
Instead of this ridiculous energy argument, we should focus on the fact that AI (and other products that money is thrown at) aren't actually that useful but companies control the narrative. AI is particularly successful here with every CEO wanting in on it and people afraid it is so good it will end the world.
I think AI power usage has an upside. No amount of hype can pay the light bill.
AI is either going to be the most valuable tech in history, or it's going to be a giant pile of ash that used to be VC capital.
It will not go away at this point. Too many daily users already, who use it for study, work, chatting, looking things up.
If not OpenAI, it will be another service.
Those users are not paying a sustainable price, they're using chatbots because they're kept artificially cheap to increase use rates.
Force them to pay enough to make these bots profitable and I guarantee they'll stop.
Those same things were said about hundreds of other technologies that no longer exist in any meaningful sense. Current usage of a technology, which in this specific case I would argue is largely frivolous anyway, is not an accurate indicator of future usage.
Tech hasn't improved that much in the last decade. All that's happened is that more cores have been added. The single-thread speed of a CPU is stagnant.
My home PC consumes more power than my Pentium 3 consumed 25 years ago. All efficiency gains are lost to scaling for more processing power. All improvements in processing power are lost to shitty, bloated code.
We don't have the tech for AI. We're just scaling up to the electrical demand of a small country and pretending we have the tech for AI.
Not even the ai tech itself is enough for ai
It's the muscle car era: can't make things more efficient to compete with Asia? MAKE IT BIGGER AND CONSUME MORE
This is nonsense, an M1 runs many multiples faster and at much lower wattage.
That's alright. When they've got a generation of people who can't even hold a conversation without it, let alone do a job, that price increase will drop that energy use pretty rapidly.
Bit of clickbait. We can't really say that without more info.
But it's important to point out that the lab's test methodology is far from ideal.
The team measured GPT-5’s power consumption by combining two key factors: how long the model took to respond to a given request, and the estimated average power draw of the hardware running it.
What we do know is that the price went down. So this could be a strong indication the model is, in fact, more energy efficient. At least a stronger indicator than response time.
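For what it's worth, the method quoted above boils down to energy = response time × assumed average power draw. A minimal sketch of that arithmetic follows; the function name and the numbers are made up for illustration and are not the lab's figures.

```python
def estimate_energy_wh(response_seconds: float, avg_power_watts: float) -> float:
    """Estimate energy per response in watt-hours.

    energy (Wh) = power (W) * time (s) / 3600 (s per hour)
    """
    return avg_power_watts * response_seconds / 3600.0


if __name__ == "__main__":
    # Hypothetical inputs: a 10 second response on hardware assumed to draw 3600 W.
    base = estimate_energy_wh(10, 3600)
    # Same response time, but the hardware guess is twice as high.
    doubled = estimate_energy_wh(10, 7200)
    print(base, doubled)
```

The estimate scales linearly with the assumed power draw, so if the guess about the hardware is off by a factor of two, so is the result.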