Nvidia has introduced Blackwell Ultra, a new version of its flagship AI chip platform, and says it can deliver up to 30% more inference throughput for large language models than the prior generation. The company said the performance gain comes with roughly similar power consumption, a key selling point as data centers face rising energy costs.
The announcement is aimed at extending Nvidia’s lead in the AI hardware market, where cloud providers and enterprise customers are racing to expand model deployment and cut response times. Inference, the stage where trained AI systems generate answers or predictions, has become a major battleground as companies move from model development to large-scale use.
Nvidia has positioned Blackwell Ultra as part of a broader effort to improve efficiency across AI infrastructure, including chips, networking, and software tools. The company is betting that stronger performance per watt will help persuade buyers to upgrade fleets even as competitors push alternative accelerators into the market.
The new platform arrives as demand for AI compute remains high, but customers are also scrutinizing costs, power use, and supply constraints more closely. Nvidia did not release full independent benchmark results in the announcement, so its performance claims should be treated as vendor-provided until confirmed by broader testing.