As artificial intelligence adoption accelerates across Indian enterprises, the conversation is rapidly shifting from experimentation to execution. From GPU shortages and spiralling infrastructure costs to data sovereignty and energy efficiency, organisations and their channel partners are being forced to make far more nuanced decisions about AI infrastructure.
In an interaction with Vinay Chhabra, Co-founder and Managing Director of AceCloud (Real Time Data Services), the focus is clear: AI success will not come from chasing the biggest GPUs or the newest architectures, but from choosing the right infrastructure for the right workload and preparing for a future shaped by agentic AI and inference-led demand.
Right-Sizing GPUs: Performance Without Overpaying
One of the most pressing challenges for AI adoption today is balancing performance, availability, and cost amid ongoing GPU constraints. Chhabra argues that the industry has fallen into the trap of assuming that newer or larger GPUs are always better.
“You don’t need a crane to lift a pen. Similarly, you don’t need the biggest and the best GPUs; you need the right GPU for the right workload,” he explains.
AceCloud works closely with partners and customers to map workloads to appropriate GPU classes. For example, L4 GPUs, despite being lower in VRAM, are highly effective for media streaming because of their encoding and decoding capabilities. “For SLMs, small language models, we have customers running three to four models on the same 24GB GPU,” he notes.
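As a rough illustration of the sizing logic behind running several small models on one 24GB card, the sketch below checks whether a set of models fits in a GPU's memory. The per-model footprints and the 2GB headroom are hypothetical values chosen for the example, not figures from AceCloud, and the function name is an assumption.

```python
def fits_on_gpu(vram_gb, model_sizes_gb, reserve_gb=2.0):
    """Return True if the models' combined footprint fits in usable VRAM.

    reserve_gb is an assumed safety margin for runtime overhead
    (hypothetical value; real headroom depends on the serving stack).
    """
    usable_gb = vram_gb - reserve_gb
    return sum(model_sizes_gb) <= usable_gb


# Four hypothetical small language models at about 5GB each on a 24GB GPU.
print(fits_on_gpu(24, [5.0, 5.0, 5.0, 5.0]))

# Four hypothetical 7GB models would exceed the same card.
print(fits_on_gpu(24, [7.0, 7.0, 7.0, 7.0]))
```

In practice the footprint of a model depends on precision, context length, and batch size, so a check like this is only a first filter before testing on the actual GPU.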
Legacy GPUs also continue to play a role. “A100 is still relevant because many customers have existing code and don’t want to change it,” he says, adding that even smaller GPUs like A2 remain meaningful for inference workloads where models are compact and predictable.
Beyond hardware selection, optimisation techniques such as GPU slicing further reduce cost. “An A100 with 80GB VRAM can be sliced into smaller units, maybe 10GB each. That’s another way of optimising cost,” he adds.
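The cost effect of slicing comes down to simple arithmetic, sketched below. The hourly price is a made-up number, and the function and parameter names are assumptions for illustration. The optional cap reflects that the partitioning scheme, not just memory division, sets the real slice count: NVIDIA's Multi-Instance GPU feature, for instance, allows at most seven instances on an A100.

```python
def slice_plan(total_vram_gb, slice_gb, hourly_price, max_instances=None):
    """Return (number_of_slices, price_per_slice_per_hour).

    max_instances is an optional cap imposed by the partitioning scheme
    (for example, 7 when an A100 is partitioned with NVIDIA MIG).
    """
    slices = total_vram_gb // slice_gb
    if max_instances is not None:
        slices = min(slices, max_instances)
    if slices < 1:
        raise ValueError("slice size is larger than the GPU's VRAM")
    return slices, hourly_price / slices


# Naive memory division of an 80GB card into 10GB units,
# at a hypothetical price of 8.0 per GPU-hour.
print(slice_plan(80, 10, 8.0))

# The same card with a seven-instance cap and a hypothetical price of 7.0.
print(slice_plan(80, 10, 7.0, max_instances=7))
```

Each tenant then pays for a fraction of the card rather than the whole GPU, which is the optimisation the quote describes.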