8 Comments
c3dtops's avatar
1d, edited

I thought GB10 was using MediaTek's (Taiwan) micro-architecture for the CPU?

So Nvidia is paying ARM Holdings both the ARM Architectural License fees and also micro-architecture design fees?

https://www.mediatek.com/press-room/newly-launched-nvidia-dgx-spark-features-gb10-superchip-co-designed-by-mediatek

The GB10 Grace Blackwell Superchip leverages MediaTek’s experience in designing power-efficient and high-performance CPU, memory subsystem, and high-speed interfaces to power the Grace 20-core Arm CPU. Combined with the latest generation Blackwell GPU and 128GB of unified memory, GB10 delivers up to 1 PFLOP of AI performance to accelerate model tuning and real-time inferencing.

Schrödinger's Cat's avatar

> I thought GB10 was using MediaTek's micro-architecture for the CPU?

I'm not aware of MediaTek ever designing their own CPU or GPU microarchitecture. I'm nearly certain they've always licensed that IP from others.

I presume this particular partnership got going with MediaTek effectively trying to license GPU IP from Nvidia. The collaboration came to light a couple of years after Samsung licensed RDNA from AMD. So, maybe MediaTek worried that it needed to counter with an iGPU a bit more powerful than it could get from ARM or Imagination Technologies.

c3dtops's avatar

I was under the impression that MediaTek SoCs in smartphones (those Chinese OEM phones) with Dimensity cores were using their own micro-architecture?

https://www.mediatek.com/products/smartphones/dimensity-5g

So in a way, that strategy from MediaTek is/was slowly encroaching on the traditional stronghold of Qualcomm's mobile SoC business.

Maybe I'm wrong then. Tyvm for sharing.

Schrödinger's Cat's avatar

> I was under the impression that MediaTek SOC on smartphone with Dimensity cores were using their own micro-architecture?

No, I don't recall any of them using in-house cores. The better phone sites, like gsmarena, tend to have pretty good coverage of the SoCs, as well. Just search for an SoC on there, and you'll probably find a lot more details than whatever press release the manufacturer puts out about it. https://www.gsmarena.com/mediatek_announces_dimensity_9500_flagship_chipset-news-69618.php

NotebookCheck is probably another good resource. https://www.notebookcheck.net/MediaTek-Dimensity-9500-Processor-Benchmarks-and-Specs.957550.0.html

With the Qualcomm SoCs, they even tend to say which cores the SoCs actually have. Before Oryon, Qualcomm tried to obscure which IP they licensed by calling everything Kryo, but the better sites would tell you which ARM IP cores they really were. Qualcomm did have their own in-house cores, but from like 2016 until the last couple of years, all of their SoCs used IP cores licensed from ARM.

Almost none of the phone SoC makers design their own cores. Right now, I think it's just Apple, Qualcomm, and HiSilicon/Huawei. Samsung used to be in that club, but they ended their in-house ARM core design efforts about 7 years ago.

Schrödinger's Cat's avatar

"That said, getting a high performance core is only one piece of the puzzle. Gaming workloads are very important in the consumer space, and benefit more from a strong memory subsystem than high core throughput."

Yes, and you compared a system with LPDDR5X against two desktop CPUs with regular DDR5 memory. LPDDR has an extra latency penalty, compared to regular DDR memory, in part because it multiplexes command and address bits over a narrow shared CA bus. This makes the GB10's rate-1 performance even more impressive, because it's paying the LPDDR latency penalty without getting any real benefits from the 256-bit data path.

Schrödinger's Cat's avatar

"554.roms is the worst offender, and makes X925 execute more than twice as many instructions compared to Zen 5."

It would be interesting to know how heavily that test is utilizing SVE2 and AVX. It seems like the main reason it needs so many more instructions could be simply due to its narrower vector width.

It would be interesting to perform the same measurements on a Graviton 3 CPU, which has Neoverse V1 cores with a 256-bit wide SVE implementation. Maybe ARM just can't keep trying to get by with 128-bit vectors, at a time when even Intel is going back to 512-bit.
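
To put a rough number on the vector width argument, here's a back-of-the-envelope sketch. It assumes a perfectly vectorized loop over 64-bit elements and counts only the vector arithmetic instructions; the function name, the element size, and the loop length of 1024 are all my own illustrative choices, not anything measured from 554.roms.

```python
import math


def vector_ops_needed(num_elements: int, vector_bits: int, element_bits: int = 64) -> int:
    """Count vector instructions needed to touch every element once.

    Assumes an idealized, fully vectorized loop (hypothetical model, no
    scalar overhead, no loads/stores, no loop control).
    """
    lanes = vector_bits // element_bits  # elements processed per instruction
    return math.ceil(num_elements / lanes)


n = 1024  # arbitrary illustrative loop length
ops_128 = vector_ops_needed(n, 128)  # 128-bit SVE2/NEON-style width
ops_256 = vector_ops_needed(n, 256)  # 256-bit width, as on Neoverse V1
ops_512 = vector_ops_needed(n, 512)  # 512-bit width, as with AVX-512

print(f"128-bit: {ops_128}, 256-bit: {ops_256}, 512-bit: {ops_512}")
print(f"128-bit vs 512-bit instruction ratio: {ops_128 / ops_512:.1f}x")
```

Real code has plenty of scalar and memory instructions mixed in, so the overall instruction count ratio would land well below this idealized upper bound.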

Schrödinger's Cat's avatar

BTW, this article's SPEC2017int rate-1 scores mostly align with David Huang's, but the 9800X3D is somewhat of an outlier. He gets 13.8, whereas this article claims just 10.8.

https://blog.hjc.im/spec-cpu-2017

However, this article's score of ~11.6 for the 9900X is better aligned with his score of 12.6 for the 9950X. If you scale 11.6 by the relative clock speed differential, the expected score of the 9950X would be 11.8, which is only about 6.3% below Huang's. So, that makes the case of the 9800X3D rather odd.

Obviously, given different OS, RAM, and compiler versions, some differences in score are to be expected. What I found surprising was that the Zen 5 CPUs didn't all scale somewhat proportionately between the two sets of benchmarks.

FWIW, he benchmarked the GB10's X925 at 12, which is much closer to this article's score of ~11.8.
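
For anyone who wants to check the arithmetic above, here's a quick sketch. The scores are the ones quoted in this comment; the boost clocks (5.6 GHz for the 9900X, 5.7 GHz for the 9950X) are my assumption taken from AMD's spec sheets, and the helper names are just made up for illustration.

```python
def scale_by_clock(score: float, from_ghz: float, to_ghz: float) -> float:
    """Scale a score linearly with clock speed (a simplifying assumption)."""
    return score * to_ghz / from_ghz


def percent_below(value: float, reference: float) -> float:
    """How far `value` falls below `reference`, as a percentage of `reference`."""
    return (reference - value) / reference * 100


# Assumed max boost clocks: 9900X = 5.6 GHz, 9950X = 5.7 GHz.
expected_9950x = scale_by_clock(11.6, from_ghz=5.6, to_ghz=5.7)

gap_9950x = percent_below(expected_9950x, 12.6)  # vs. Huang's 9950X score
gap_9800x3d = percent_below(10.8, 13.8)          # this article vs. Huang, 9800X3D
gap_gb10 = percent_below(11.8, 12.0)             # this article vs. Huang, GB10 X925

print(f"Expected 9950X score: {expected_9950x:.1f}")
print(f"9950X gap: {gap_9950x:.1f}%")
print(f"9800X3D gap: {gap_9800x3d:.1f}%")
print(f"GB10 gap: {gap_gb10:.1f}%")
```

Linear scaling with clock speed is optimistic, since memory-bound subtests won't scale that way, but it's good enough to show how far the 9800X3D gap sits from the other two.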

Schrödinger's Cat's avatar

"Matching the best from Intel and AMD must have been a distant dream in 2012, when Arm launched their first 64-bit core, the Cortex A57."

I think it's slightly generous to credit ARM for launching it in 2012. As far as I can tell, the first SoCs using it didn't ship until 2014.