Discussion about this post

Schrödinger's Cat

"554.roms is the worst offender, and makes X925 execute more than twice as many instructions compared to Zen 5."

It would be interesting to know how heavily that test uses SVE2 on X925 and AVX on Zen 5. The main reason X925 needs so many more instructions could simply be its narrower vector width.

I'd be curious whether the same test would report a different instruction count on a Graviton 3 CPU, which has Neoverse V1 cores with a 256-bit wide SVE implementation. Maybe ARM just can't keep trying to get by with 128-bit vectors, at a time when even Intel is going back to 512-bit.
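As a rough illustration of the vector-width argument, here is a minimal sketch that counts how many vector operations a fully vectorized loop over FP64 elements would need at different vector widths. The function name, the element count, and the assumption of perfect vectorization are all hypothetical. Real instruction counts also include scalar code, loop overhead, and compiler differences, so this only bounds the effect and does not explain the measured 554.roms result.

```python
def vector_ops_needed(num_elements: int, vector_bits: int, element_bits: int = 64) -> int:
    """Vector operations needed to touch every element once.

    Assumes (hypothetically) a perfectly vectorized loop with no
    scalar remainder handling beyond one extra partial vector.
    """
    lanes = vector_bits // element_bits  # elements processed per vector op
    return -(-num_elements // lanes)     # ceiling division


if __name__ == "__main__":
    n = 1_000_000  # arbitrary element count chosen for illustration
    for width in (128, 256, 512):
        print(f"{width}-bit vectors: {vector_ops_needed(n, width)} vector ops")
```

Under these assumptions, halving the vector width doubles the vector operation count, so a 128-bit implementation would need up to four times as many vector operations as a 512-bit one on purely vectorized code.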

c3dtops
5d (edited)

I thought GB10 was using MediaTek's (Taiwan) micro-architecture for the CPU?

So Nvidia is paying ARM Holdings both the ARM architectural license fees and micro-architecture design fees?

https://www.mediatek.com/press-room/newly-launched-nvidia-dgx-spark-features-gb10-superchip-co-designed-by-mediatek

"The GB10 Grace Blackwell Superchip leverages MediaTek’s experience in designing power-efficient and high-performance CPU, memory subsystem, and high-speed interfaces to power the Grace 20-core Arm CPU. Combined with the latest generation Blackwell GPU and 128GB of unified memory, GB10 delivers up to 1 PFLOP of AI performance to accelerate model tuning and real-time inferencing."
