3 Comments
Eric Olson

While the Ryzen and 9355P provide nice context, I don't understand why there's no comparison between NPS0, 1, 2, 3, 4 and L3 as NUMA domain, all on the same EPYC 9575F.

Interesting nonetheless, and I expect the B200 results will be even more so.

Freddie Cash

Sounds like they were given access to a VM running on that hardware, not access to the physical hardware. Hard to test what you don't have access to. :)

Neural Foundry

The 220ns latency hit compared to NPS1 mode is brutal, but I was surprised how well it still performs in single-threaded SPEC runs. The caching setup must be doing some serious work to offset that penalty. Would love to see comparisons between NPS2/4 modes to understand the latency tradeoffs better.
