AMD Seeks to Outpace NVIDIA's CUDA with Performance-Enhancing ROCm 7 Software

Inference performance is up 3.5x and training performance 3x over the previous-generation software, claims House of Zen.

, and Administrator

2025 September 25, 12:46 AM

AMD Unveils ROCm 7.0: A Leap Forward in AI Performance

AMD has officially released its ROCm 7.0 software platform. First announced at the company's "Advancing AI 2025" event in mid-June, the platform reached its final release, with accompanying documentation, around mid-September 2025. The update promises major improvements in inference and training performance for AMD's MI300-series parts and the MI355X.

The MI355X, launched this spring, is AMD's latest GPU, designed to close the performance gap with Nvidia's Blackwell accelerators. In inference workloads, AMD claims the MI355X holds a 1.3x edge over Nvidia's B200 when running DeepSeek R1 in SGLang.

One of the key features of ROCm 7.0 is the introduction of AMD's AI Tensor Engine (AITER), which is aimed at maximising GenAI performance. Applied to models like DeepSeek R1, AITER can boost throughput by more than 2x. For inference, it also speeds up multi-head latent attention (MLA) decode operations by 17x and multi-head attention (MHA) prefill operations by 14x.

Compared with ROCm 6, ROCm 7.0 offers a roughly 3.5x uplift in inference performance and a 3x boost in effective floating-point performance in model training. Support for OCP's microscaling (MX) datatypes cuts memory requirements by a factor of 2 to 4.
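To see where a 2x to 4x memory saving could come from, the sketch below estimates weight storage for a model in BF16 versus the MX formats. It assumes the OCP MX layout of 32-element blocks that share one 8-bit scale, and it takes BF16 as the baseline, which the article does not state. The 70-billion-parameter model and the helper names are hypothetical, chosen only for illustration.

```python
# Rough weight-memory estimate for OCP microscaling (MX) formats.
# Assumptions (not from the article): 32-element blocks sharing one 8-bit
# scale, a BF16 baseline, and a hypothetical 70B-parameter model.
MX_BLOCK_SIZE = 32   # elements that share one scale factor
MX_SCALE_BITS = 8    # bits used by the shared scale

ELEMENT_BITS = {"BF16": 16, "MXFP8": 8, "MXFP6": 6, "MXFP4": 4}


def bits_per_weight(fmt: str) -> float:
    """Average storage per weight, including the amortised block scale."""
    bits = float(ELEMENT_BITS[fmt])
    if fmt.startswith("MX"):
        bits += MX_SCALE_BITS / MX_BLOCK_SIZE
    return bits


def weight_gigabytes(num_params: float, fmt: str) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight(fmt) / 8 / 1e9


if __name__ == "__main__":
    params = 70e9  # hypothetical model size
    baseline = weight_gigabytes(params, "BF16")
    for fmt in ELEMENT_BITS:
        gb = weight_gigabytes(params, fmt)
        print(f"{fmt:>5}: {gb:6.1f} GB ({baseline / gb:.2f}x smaller than BF16)")
```

Under these assumptions MXFP8 and MXFP4 come out at about 1.9x and 3.8x smaller than BF16, slightly below the nominal 2x and 4x because of the shared scales. Activations and KV cache are not counted here.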

The MI355X also carries 108 GB more HBM3e than Nvidia's B200, a significant advantage when handling larger and more complex AI models.

To make these improvements accessible, AMD is rolling out a pair of new dashboards. The Resource Manager is designed for managing large clusters of GPUs, while the AI Workbench streamlines training or fine-tuning popular foundation models.

ROCm 7.0 adds native support for PyTorch 2.7 and 2.9, TensorFlow 2.19.1, and JAX 0.6. AITER and the MXFP4 datatype have also been merged into popular inference serving engines like vLLM and SGLang, where enabling them is as simple as installing dependencies and setting environment variables.
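As an illustration of the environment-variable step, here is a minimal sketch that prepares an environment for a vLLM process with AITER switched on. The variable name VLLM_ROCM_USE_AITER is an assumption based on vLLM's ROCm backend and is not given in the article; check the documentation for your installed version. The helper aiter_env is hypothetical.

```python
import os

# Assumption: this is the switch vLLM's ROCm backend reads to enable AITER.
# Verify the exact name against the vLLM docs for your version.
AITER_FLAG = "VLLM_ROCM_USE_AITER"


def aiter_env(base=None, enabled=True):
    """Return a copy of the environment with the AITER switch set."""
    env = dict(os.environ if base is None else base)
    env[AITER_FLAG] = "1" if enabled else "0"
    return env


if __name__ == "__main__":
    env = aiter_env()
    print(f"{AITER_FLAG}={env[AITER_FLAG]}")
    # The serving engine could then be launched with this environment, e.g.:
    # subprocess.run(["vllm", "serve", "<model>"], env=env, check=True)
```

Copying the environment instead of mutating os.environ keeps the setting scoped to the child process that needs it.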

ROCm 7.0 also broadens support for low-precision datatypes, and AMD's Quark quantization framework is now production ready. The platform is available for download from AMD's support site and in pre-baked container images on Docker Hub.

The MI350 series is AMD's first generation of GPUs to offer hardware acceleration for OCP's microscaling datatypes. The MI355X's main competitor is Nvidia's B300, which packs 288 GB of HBM3e.

In conclusion, AMD's ROCm 7.0 software platform offers significant improvements in AI performance, making it an attractive option for developers and researchers working on AI projects.
