10 Comments
Farfolomew's avatar

Qualcomm calling their efficiency cores "Performance" cores, and shipping only six of them versus 12 "Prime" cores, is not confusing at all!

Qualcomm "Performance" cores = Every other chipmaker's efficiency cores

Qualcomm "Prime" cores = Every other chipmaker's performance cores.

Also, there are spelling mistakes in the Qualcomm slides. Very professional.

J. G. W. Livingston's avatar

How you guys comprehend this stuff amazes me.

Neural Foundry's avatar

The shift to 18 cores with that hybrid approach is fascinating, especially with the Performance cluster optimized for sub-2W operation. Qualcomm seems to be learning from the mobile space, where heterogeneous architectures really shine. The 80 TOPS NPU with FP8 support could be a game changer if developers actually tap into it. Battery life on laptops could finally match what we see on ARM tablets.

Peter W.'s avatar

The problem with the Hexagon NPU is that things like the functions of the Tensor cores in it aren't exactly well documented, making it difficult to program for them.

butterflies's avatar

The Hexagon Kernel Library, which is pretty new (https://softwarecenter.qualcomm.com/catalog/item/Hexagon_KL), solves part of the issue on that side (and qhl_hmx also existed before).

It's not a full replacement for actually documenting HMX, though.

Peter W.'s avatar

Thanks for replying, and yes, a lot better than -404-. I still don't understand why it's taken Qualcomm that long to realize that there are crucial intermediate steps to "if you build it, they will come": "they" still have to know where "it" is, and how to get there.

Peter W.'s avatar

Firstly, thanks George! Maybe Snapdragon will take off this time! I actually liked that Qualcomm was more realistic in its comparisons, like the ones in the graph on power draws.

Maybe I missed it in your write-up and the slides, but I wonder how many PCIe lanes the SoCs have, and whether laptops with the new SDs could at least theoretically support a dGPU? I believe the lack of expansion options on laptops with the SD Elite SoCs was an issue for some potential customers.

Did Qualcomm have any news on better x86 emulation for programs that aren't available for Windows-on-ARM, or on productivity software that now is available for it? It would really help with uptake in the market.

I also believe that Qualcomm would do well to cater a bit more to the Linux community, including working with their OEM partners on Snapdragon laptops with Linux and drivers preinstalled. Yes, that's still "niche", but it's a growing one.

Lastly, that beefy NPU needs one or two "killer apps" that really require a powerful NPU to be a differentiator. The effect of that absence is not restricted to QC's Snapdragons; any of the "AI" enabled SoCs (including Ryzen and Lunar Lake) are also suffering from it. As long as a regular CPU and iGPU can run any software I have or need, and that software wouldn't run much better with a 40+ TOPS NPU, large NPUs make for a poor value proposition.

One reason why those aren't out there yet might be the difficulty of developing for the Hexagon NPU in Snapdragon Elites. @Babbage here on Substack posted a good write-up related to that, including the rather sparse (read: almost absent) documentation of Hexagon's tensor core functions in QC's programming manuals.

Schrödinger's Cat's avatar

Definitely needs better Linux support. Got Ubuntu 25.10 working on a 1st gen laptop, but it's not as simple as just downloading the ARM64 ISO and installing it, like you'd do for an x86 machine. That's unfortunate, as there are some great deals to be found on several gen 1 Snapdragon X Plus laptop models.

Blake Pelton's avatar

A 2560x1440 (4 bytes per pixel) render target can fit into 21 MB of AHPM (wow). Should it still be called "tiled rendering" if there is only 1 tile?

Schrödinger's Cat's avatar

> A 2560x1440 (4 bytes per pixel) render target can fit into 21 MB of AHPM

Don't overlook that it's divided up between the slices. Within each slice, it's partitioned between cache and local memory. You also need to account for Z-buffer and stencil buffer. Not to mention shaders' other cache & local memory needs.

> Should it still be called "tiled rendering" if there is only 1 tile?

Load balancing would be poor, if you statically partition a frame across the slices (which would technically be 4 tiles, already). Also, the renderer needs to scale up to higher resolutions and down to fewer slices.
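
For anyone who wants to check the numbers in this exchange, here is a rough back-of-the-envelope sketch in Python. The resolution, 4 bytes per pixel, and the 21 MB AHPM figure come from the comments above. Everything else is an assumption made for illustration: that "MB" means decimal megabytes, that the depth/stencil buffer is a packed 4-byte-per-pixel format, that it would live in AHPM alongside the color target, and that there are 4 slices with the memory split evenly between them.

```python
def render_target_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Size of an uncompressed render target in bytes."""
    return width * height * bytes_per_pixel


MB = 1000 ** 2            # assumption: "21 MB" is decimal megabytes
AHPM_BYTES = 21 * MB      # figure quoted in the comment
NUM_SLICES = 4            # assumption inferred from the "4 tiles" remark

WIDTH, HEIGHT = 2560, 1440

# Color target at 4 bytes per pixel, as stated in the comment.
color = render_target_bytes(WIDTH, HEIGHT, 4)

# Assumption: a packed 24-bit depth + 8-bit stencil buffer, 4 bytes per pixel.
depth_stencil = render_target_bytes(WIDTH, HEIGHT, 4)

total = color + depth_stencil

# Assumption: AHPM is split evenly across slices, ignoring the cache share.
ahpm_per_slice = AHPM_BYTES // NUM_SLICES
color_per_slice = color // NUM_SLICES

print(f"color target:        {color / MB:.2f} MB")
print(f"color + depth:       {total / MB:.2f} MB")
print(f"color fits in AHPM:  {color <= AHPM_BYTES}")
print(f"both fit in AHPM:    {total <= AHPM_BYTES}")
print(f"per-slice AHPM:      {ahpm_per_slice / MB:.2f} MB")
print(f"per-slice color:     {color_per_slice / MB:.2f} MB")
```

Under these assumptions the color target alone comes to about 14.75 MB and fits, but adding a same-sized depth/stencil buffer pushes the total past 21 MB, which is consistent with the point that a single tile is not the whole story.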
