IBM has a surprisingly sane approach to LLMs:
- Small models
- Economically trained
- Apache licensed open weights
- Geared for tool usage/RAG
- Well documented
- Legal, licensed training data
- Open experiment artifacts, including A/B tests
See: https://huggingface.co/ibm-granite
No nebulous promises, no existential hype, no scorching the environment, no underbaked user-facing disasters. Just plain, locally runnable tools.
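To give a sense of how "locally runnable" looks in practice, here's a minimal sketch using Hugging Face transformers. The exact checkpoint name is an assumption on my part; pick whichever Granite model fits your hardware from the link above.

```python
# Minimal local-inference sketch for an IBM Granite instruct model.
# The model ID below is an assumed example; see https://huggingface.co/ibm-granite
# for the actual list of checkpoints.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-3.1-2b-instruct"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a chat-formatted prompt and generate a short reply.
messages = [{"role": "user", "content": "Summarize what RAG is in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

A few gigabytes of VRAM (or patience on CPU) is enough for the smaller checkpoints, which is the point.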
It’s so sensible it hurts. This is what all “AI” companies should be doing, albeit with a little more budget and more modern architectures than dense GQA.