Disrupting THE BOX

Why Small Language Models Are the Better Fit for Real-World AI

Small language models offer faster, cheaper, and more efficient AI solutions. As scaling hits limits, are SLMs the future of real-world AI?

Neil Sahota
Apr 08, 2025


Image source: Freepik.com

Why Small Language Models Are the Smart Choice for Practical AI

AI models keep getting bigger, with companies building $10 billion data centers to support them. But is scaling up the only path forward? Pre-training as we know it is reaching its limits, and the focus is shifting toward refining existing methods and improving algorithms instead of just making models larger.

Small language models (SLMs) with up to 10 billion parameters are emerging as a strong alternative. More companies are adopting this approach, recognizing that efficiency and optimization matter just as much as scale.

However, the choice between SLMs and large language models (LLMs) isn’t either-or – it’s about knowing when each is the right tool for the job. Both have their place, but SLMs offer lower costs and a tighter fit for task-specific work. They can be deployed in real-world applications without massive infrastructure, making them the practical AI choice for most use cases.
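
To make that deployment point concrete, here is a minimal sketch of running a small open model on a single machine with the Hugging Face transformers library. The specific model (roughly 1.5 billion parameters) and prompt are illustrative assumptions, not recommendations from this post.

```python
# Minimal sketch: serving a small open model (~1.5B parameters) on one machine.
# The model choice is an illustrative assumption, not the author's pick.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-1.5B-Instruct"  # example SLM well under 10B parameters

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps the weights around 3 GB
    device_map="auto",          # single GPU if present, else CPU (needs accelerate)
)

prompt = "List three advantages of small language models for on-device use."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Nothing here needs a cluster, a serving framework, or specialized hardware – which is the practical advantage the rest of this post builds on.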

Why AI Scaling Alone Is No Longer Sustainable

AI models are consuming more energy than ever, with training alone requiring power on the scale of entire countries. Training GPT-3, for example, consumed more than 1,200 megawatt-hours of electricity, more than many smaller models use in a year.

Recent reviews project that, driven by the widespread use of AI, US data centers will need between 325 and 580 terawatt-hours (TWh) of electricity annually by 2028. That would account for 6.7% to 12% of total US electricity consumption, putting the long-term viability of this trajectory in doubt.
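
A quick back-of-envelope comparison, using only the figures cited above (the article's numbers, not independently verified here), shows the scale those projections imply:

```python
# Back-of-envelope comparison using the figures cited above.
gpt3_training_mwh = 1_200                         # one GPT-3-scale training run
projected_low_twh, projected_high_twh = 325, 580  # US data-center demand by 2028

for twh in (projected_low_twh, projected_high_twh):
    mwh = twh * 1_000_000  # 1 TWh = 1,000,000 MWh
    print(f"{twh} TWh/year is roughly {mwh / gpt3_training_mwh:,.0f} GPT-3-scale training runs")
```

Even the low end of that range works out to hundreds of thousands of GPT-3-scale training runs per year, which is why efficiency, not just capacity, has become the pressing question.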

Environmental concerns are heightened further by data centers’ reliance on limited resources and heavy water use for cooling. Infrastructure is another constraint: projects like Elon Musk’s xAI supercomputer in Memphis need roughly 150 megawatts of power and over a million gallons of water daily. Meanwhile, performance gains from scaling are slowing, making the cost of ever-larger models harder to justify.
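
For perspective on the Memphis figure, converting the reported 150-megawatt draw into annual energy – assuming near-continuous operation, which is an illustrative assumption rather than a reported number – looks like this:

```python
# Rough conversion of a 150 MW power draw into annual energy, assuming
# near-continuous operation (an illustrative assumption, not a reported figure).
site_power_mw = 150
hours_per_year = 24 * 365

annual_energy_mwh = site_power_mw * hours_per_year  # ~1,314,000 MWh
annual_energy_twh = annual_energy_mwh / 1_000_000   # ~1.3 TWh

gpt3_training_mwh = 1_200  # training-run figure cited earlier
print(f"~{annual_energy_twh:.1f} TWh per year, about "
      f"{annual_energy_mwh / gpt3_training_mwh:,.0f}x a GPT-3-scale training run")
```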

The shift is now toward smaller, more efficient models. For example, DeepSeek R1 (LINK) delivers strong performance at a fraction of traditional training costs. Instead of building ever-larger systems, the focus is moving to smarter, resource-efficient AI that meets real-world needs without overwhelming infrastructure.

Video source: YouTube/WALLPIE Space and Tech
