<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Chips and Cheese]]></title><description><![CDATA[The Devil is in the Details! Deep dives into computer hardware and software and the wider industry...]]></description><link>https://chipsandcheese.com</link><image><url>https://substackcdn.com/image/fetch/$s_!2heg!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F579f5b72-95e6-43b5-b330-3ddb303bed9a_1280x1280.png</url><title>Chips and Cheese</title><link>https://chipsandcheese.com</link></image><generator>Substack</generator><lastBuildDate>Sun, 05 Apr 2026 09:40:31 GMT</lastBuildDate><atom:link href="https://chipsandcheese.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Chips and Cheese]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[articles@chipsandcheese.com]]></webMaster><itunes:owner><itunes:email><![CDATA[articles@chipsandcheese.com]]></itunes:email><itunes:name><![CDATA[Chips and Cheese]]></itunes:name></itunes:owner><itunes:author><![CDATA[Chips and Cheese]]></itunes:author><googleplay:owner><![CDATA[articles@chipsandcheese.com]]></googleplay:owner><googleplay:email><![CDATA[articles@chipsandcheese.com]]></googleplay:email><googleplay:author><![CDATA[Chips and Cheese]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Embracing AI with Claude's C Compiler]]></title><description><![CDATA[Down with the old! In with the new!]]></description><link>https://chipsandcheese.com/p/embracing-ai-with-claudes-c-compiler</link><guid isPermaLink="false">https://chipsandcheese.com/p/embracing-ai-with-claudes-c-compiler</guid><dc:creator><![CDATA[Chester Lam]]></dc:creator><pubDate>Wed, 01 Apr 2026 07:02:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!hMsM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02b8aec-dda2-40d6-a3f0-64b48f03584b_967x1173.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Update (4-2-2026)</strong>: After careful consideration of changing conditions, especially the change in current date, the transition to Claude&#8217;s C Compiler has been rescinded. The original article remains below.</p><p>Our society has gone all-in on AI. Software and related technologies represent one of the biggest sectors of the US economy, and technology companies have all but bet the farm on LLMs being able to do everything under the sun. By the law of sunk costs*, AI investments will increase without bound until that objective is achieved. AGI is therefore inevitable, and everyone must fully commit to ensuring AI&#8217;s success because things will get more painful the longer it takes. When bank money runs out, the government will bail them out. When government funds are exhausted, aliens will get involved, followed by various deities. Please don&#8217;t ask what happens after that. Anyway, it&#8217;s in everyone&#8217;s best interest to embrace the AI revolution with absolute conviction. Therefore, I&#8217;m excited to announce that Chips and Cheese will stand at the vanguard of the AI revolution by adopting Claude&#8217;s AI-generated compiler, CCC.</p><p>CCC, or Claudes&#8217; C Compiler, is a &#8220;from-scratch optimizing compiler with no dependencies [&#8230;] able to compile the Linux kernel.&#8221; Anthropic used their Claude large language model (LLM) to write the entire compiler. Human effort was relegated to prompting and maintaining AI coding agents, and standing up a very robust test suite to keep those coding agents on track. Past Chips and Cheese articles used microbenchmarks and benchmarks built using GCC. GCC was developed by humans, which is backwards and reactionary by the standards of many cadres today. Comparing GCC and CCC should be a powerful showcase of revolutionary software development practices.</p><h1>Compiling a Microbenchmark</h1><p>Microbenchmarking is a useful tool for understanding low level hardware characteristics like cache and memory latency. My first shot at testing load latency used a simple loop of dependent array accesses in C:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z_05!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe938811a-c424-4471-9670-0ced8fd6561f_372x61.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z_05!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe938811a-c424-4471-9670-0ced8fd6561f_372x61.png 424w, https://substackcdn.com/image/fetch/$s_!z_05!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe938811a-c424-4471-9670-0ced8fd6561f_372x61.png 848w, https://substackcdn.com/image/fetch/$s_!z_05!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe938811a-c424-4471-9670-0ced8fd6561f_372x61.png 1272w, https://substackcdn.com/image/fetch/$s_!z_05!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe938811a-c424-4471-9670-0ced8fd6561f_372x61.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z_05!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe938811a-c424-4471-9670-0ced8fd6561f_372x61.png" width="372" height="61" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e938811a-c424-4471-9670-0ced8fd6561f_372x61.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:61,&quot;width&quot;:372,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2554,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/192605552?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe938811a-c424-4471-9670-0ced8fd6561f_372x61.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!z_05!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe938811a-c424-4471-9670-0ced8fd6561f_372x61.png 424w, https://substackcdn.com/image/fetch/$s_!z_05!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe938811a-c424-4471-9670-0ced8fd6561f_372x61.png 848w, https://substackcdn.com/image/fetch/$s_!z_05!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe938811a-c424-4471-9670-0ced8fd6561f_372x61.png 1272w, https://substackcdn.com/image/fetch/$s_!z_05!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe938811a-c424-4471-9670-0ced8fd6561f_372x61.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p><code>A = A[current] </code>creates a dependency chain where one array access result feeds into the next, preventing accesses from overlapping even on an out-of-order core. Dividing access time by iteration count gives latency. CCC and GCC compile the loop into the following assembly code:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qyd-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbe114ea-29d9-4ff7-8f5a-58e352b9f068_601x653.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qyd-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbe114ea-29d9-4ff7-8f5a-58e352b9f068_601x653.png 424w, https://substackcdn.com/image/fetch/$s_!qyd-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbe114ea-29d9-4ff7-8f5a-58e352b9f068_601x653.png 848w, https://substackcdn.com/image/fetch/$s_!qyd-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbe114ea-29d9-4ff7-8f5a-58e352b9f068_601x653.png 1272w, https://substackcdn.com/image/fetch/$s_!qyd-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbe114ea-29d9-4ff7-8f5a-58e352b9f068_601x653.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qyd-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbe114ea-29d9-4ff7-8f5a-58e352b9f068_601x653.png" width="601" height="653" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fbe114ea-29d9-4ff7-8f5a-58e352b9f068_601x653.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:653,&quot;width&quot;:601,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qyd-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbe114ea-29d9-4ff7-8f5a-58e352b9f068_601x653.png 424w, https://substackcdn.com/image/fetch/$s_!qyd-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbe114ea-29d9-4ff7-8f5a-58e352b9f068_601x653.png 848w, https://substackcdn.com/image/fetch/$s_!qyd-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbe114ea-29d9-4ff7-8f5a-58e352b9f068_601x653.png 1272w, https://substackcdn.com/image/fetch/$s_!qyd-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbe114ea-29d9-4ff7-8f5a-58e352b9f068_601x653.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>CCC generates more code from the same high level source than GCC does, and sometimes, more is better. CCC&#8217;s code is more ideologically correct from a RISC standpoint. GCC uses indexed addressing modes that express a dependent shift, add, and load sequence in a single instruction. RISC advocates for a small set of simple instructions and larger code sizes, and indexed addressing modes are quite complex. CCC follows the RISC philosophy, and splits the array access into separate shift, add, and load instructions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4Y7y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5f9403-4cb6-41b2-85a1-96a125a33c2d_908x305.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4Y7y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5f9403-4cb6-41b2-85a1-96a125a33c2d_908x305.png 424w, https://substackcdn.com/image/fetch/$s_!4Y7y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5f9403-4cb6-41b2-85a1-96a125a33c2d_908x305.png 848w, https://substackcdn.com/image/fetch/$s_!4Y7y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5f9403-4cb6-41b2-85a1-96a125a33c2d_908x305.png 1272w, https://substackcdn.com/image/fetch/$s_!4Y7y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5f9403-4cb6-41b2-85a1-96a125a33c2d_908x305.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4Y7y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5f9403-4cb6-41b2-85a1-96a125a33c2d_908x305.png" width="908" height="305" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a5f9403-4cb6-41b2-85a1-96a125a33c2d_908x305.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:305,&quot;width&quot;:908,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4Y7y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5f9403-4cb6-41b2-85a1-96a125a33c2d_908x305.png 424w, https://substackcdn.com/image/fetch/$s_!4Y7y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5f9403-4cb6-41b2-85a1-96a125a33c2d_908x305.png 848w, https://substackcdn.com/image/fetch/$s_!4Y7y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5f9403-4cb6-41b2-85a1-96a125a33c2d_908x305.png 1272w, https://substackcdn.com/image/fetch/$s_!4Y7y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5f9403-4cb6-41b2-85a1-96a125a33c2d_908x305.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">x86-64. Maximum optimization enabled for both compilers</figcaption></figure></div><p>CCC doesn&#8217;t stop there and exceeds quotas by decomposing <code>current = A[current] </code>into a nine instruction dependency chain. On x86-64, it shuffles the index through several registers and sends it on a trip through the stack. It does the same on aarch64, but at least keeps the index in registers. On both architectures, CCC brings the array base address in from the stack, and sends it on another round trip through the stack for good measure before using it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3MgX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07fdf11e-601f-4b04-971c-d318eefc5ba1_903x282.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3MgX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07fdf11e-601f-4b04-971c-d318eefc5ba1_903x282.png 424w, https://substackcdn.com/image/fetch/$s_!3MgX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07fdf11e-601f-4b04-971c-d318eefc5ba1_903x282.png 848w, https://substackcdn.com/image/fetch/$s_!3MgX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07fdf11e-601f-4b04-971c-d318eefc5ba1_903x282.png 1272w, https://substackcdn.com/image/fetch/$s_!3MgX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07fdf11e-601f-4b04-971c-d318eefc5ba1_903x282.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3MgX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07fdf11e-601f-4b04-971c-d318eefc5ba1_903x282.png" width="903" height="282" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/07fdf11e-601f-4b04-971c-d318eefc5ba1_903x282.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:282,&quot;width&quot;:903,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3MgX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07fdf11e-601f-4b04-971c-d318eefc5ba1_903x282.png 424w, https://substackcdn.com/image/fetch/$s_!3MgX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07fdf11e-601f-4b04-971c-d318eefc5ba1_903x282.png 848w, https://substackcdn.com/image/fetch/$s_!3MgX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07fdf11e-601f-4b04-971c-d318eefc5ba1_903x282.png 1272w, https://substackcdn.com/image/fetch/$s_!3MgX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07fdf11e-601f-4b04-971c-d318eefc5ba1_903x282.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">aarch64. Maximum optimization enabled for both compilers</figcaption></figure></div><p>Compiling the microbenchmark with CCC results in higher observed array access latency. Zen 5 and Lion Cove need two extra cycles, likely from the shift + add dependency chain. Exceptionally strong move elimination and zero-latency store forwarding seem to let Intel and AMD collapse the nine-instruction dependency chain. After that, their out-of-order core width absorbs the additional instructions. Arm&#8217;s recent cores have less robust move elimination and can&#8217;t do zero latency store forwarding. They suffer a 6-7 cycle penalty.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PLNT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc24ad24b-e45f-4a91-bdb7-ea56cb532a5b_1175x634.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PLNT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc24ad24b-e45f-4a91-bdb7-ea56cb532a5b_1175x634.png 424w, https://substackcdn.com/image/fetch/$s_!PLNT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc24ad24b-e45f-4a91-bdb7-ea56cb532a5b_1175x634.png 848w, https://substackcdn.com/image/fetch/$s_!PLNT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc24ad24b-e45f-4a91-bdb7-ea56cb532a5b_1175x634.png 1272w, https://substackcdn.com/image/fetch/$s_!PLNT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc24ad24b-e45f-4a91-bdb7-ea56cb532a5b_1175x634.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PLNT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc24ad24b-e45f-4a91-bdb7-ea56cb532a5b_1175x634.png" width="1175" height="634" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c24ad24b-e45f-4a91-bdb7-ea56cb532a5b_1175x634.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:634,&quot;width&quot;:1175,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PLNT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc24ad24b-e45f-4a91-bdb7-ea56cb532a5b_1175x634.png 424w, https://substackcdn.com/image/fetch/$s_!PLNT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc24ad24b-e45f-4a91-bdb7-ea56cb532a5b_1175x634.png 848w, https://substackcdn.com/image/fetch/$s_!PLNT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc24ad24b-e45f-4a91-bdb7-ea56cb532a5b_1175x634.png 1272w, https://substackcdn.com/image/fetch/$s_!PLNT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc24ad24b-e45f-4a91-bdb7-ea56cb532a5b_1175x634.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Smaller cores with narrower execution engines struggle with CCC-compiled code, likely because they don&#8217;t have enough core throughput to absorb loop overhead and the sum += current sink statement. CCC also compiles those into very long assembly sequences, placing pressure on core throughput.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dCoI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbf1844-87c5-4c3e-a21a-05c28b04daec_800x264.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dCoI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbf1844-87c5-4c3e-a21a-05c28b04daec_800x264.png 424w, https://substackcdn.com/image/fetch/$s_!dCoI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbf1844-87c5-4c3e-a21a-05c28b04daec_800x264.png 848w, https://substackcdn.com/image/fetch/$s_!dCoI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbf1844-87c5-4c3e-a21a-05c28b04daec_800x264.png 1272w, https://substackcdn.com/image/fetch/$s_!dCoI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbf1844-87c5-4c3e-a21a-05c28b04daec_800x264.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dCoI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbf1844-87c5-4c3e-a21a-05c28b04daec_800x264.png" width="800" height="264" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2bbf1844-87c5-4c3e-a21a-05c28b04daec_800x264.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:264,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dCoI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbf1844-87c5-4c3e-a21a-05c28b04daec_800x264.png 424w, https://substackcdn.com/image/fetch/$s_!dCoI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbf1844-87c5-4c3e-a21a-05c28b04daec_800x264.png 848w, https://substackcdn.com/image/fetch/$s_!dCoI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbf1844-87c5-4c3e-a21a-05c28b04daec_800x264.png 1272w, https://substackcdn.com/image/fetch/$s_!dCoI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bbf1844-87c5-4c3e-a21a-05c28b04daec_800x264.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Loop overhead on aarch64 when compiled with CCC. Would a human ever think of MOV-ing a register to itself? See, AI can do what humans can&#8217;t.</figcaption></figure></div><p>In-order cores suffer the largest penalties. They&#8217;re not embracing the AI revolution with adequate enthusiasm, and should seriously reconsider their stance or risk getting denounced.</p><h1>Challenging the Old Type Based Order</h1><p>Old, traditional compilers force the masses to follow a tight set of rules dreamt up by the bourgeoisie. If a programmer doesn&#8217;t follow these rules, the compiler will complain and may even refuse to complete the compilation process. Warnings and errors are of no use to the proletariat, who just want results. CCC boldly crushes type systems with revolutionary fervor, as if to shout &#8220;out with the old, in with the new!&#8221; Take the following code:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uvq4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b99d3ac-63d9-4b77-b5b3-34d5906e1028_306x239.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uvq4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b99d3ac-63d9-4b77-b5b3-34d5906e1028_306x239.png 424w, https://substackcdn.com/image/fetch/$s_!uvq4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b99d3ac-63d9-4b77-b5b3-34d5906e1028_306x239.png 848w, https://substackcdn.com/image/fetch/$s_!uvq4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b99d3ac-63d9-4b77-b5b3-34d5906e1028_306x239.png 1272w, https://substackcdn.com/image/fetch/$s_!uvq4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b99d3ac-63d9-4b77-b5b3-34d5906e1028_306x239.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uvq4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b99d3ac-63d9-4b77-b5b3-34d5906e1028_306x239.png" width="306" height="239" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4b99d3ac-63d9-4b77-b5b3-34d5906e1028_306x239.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:239,&quot;width&quot;:306,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uvq4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b99d3ac-63d9-4b77-b5b3-34d5906e1028_306x239.png 424w, https://substackcdn.com/image/fetch/$s_!uvq4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b99d3ac-63d9-4b77-b5b3-34d5906e1028_306x239.png 848w, https://substackcdn.com/image/fetch/$s_!uvq4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b99d3ac-63d9-4b77-b5b3-34d5906e1028_306x239.png 1272w, https://substackcdn.com/image/fetch/$s_!uvq4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b99d3ac-63d9-4b77-b5b3-34d5906e1028_306x239.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>GCC cowers behind error messages with weak excuses about incompatible types and fails to produce a binary. In doing so, GCC engages in counter-revolutionary wrecking by undermining the work of the masses, who have to waste time fixing errors.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!B8mo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518b0479-6853-41f2-b198-298d3105d41d_777x312.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!B8mo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518b0479-6853-41f2-b198-298d3105d41d_777x312.png 424w, https://substackcdn.com/image/fetch/$s_!B8mo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518b0479-6853-41f2-b198-298d3105d41d_777x312.png 848w, https://substackcdn.com/image/fetch/$s_!B8mo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518b0479-6853-41f2-b198-298d3105d41d_777x312.png 1272w, https://substackcdn.com/image/fetch/$s_!B8mo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518b0479-6853-41f2-b198-298d3105d41d_777x312.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!B8mo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518b0479-6853-41f2-b198-298d3105d41d_777x312.png" width="777" height="312" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/518b0479-6853-41f2-b198-298d3105d41d_777x312.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:312,&quot;width&quot;:777,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!B8mo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518b0479-6853-41f2-b198-298d3105d41d_777x312.png 424w, https://substackcdn.com/image/fetch/$s_!B8mo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518b0479-6853-41f2-b198-298d3105d41d_777x312.png 848w, https://substackcdn.com/image/fetch/$s_!B8mo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518b0479-6853-41f2-b198-298d3105d41d_777x312.png 1272w, https://substackcdn.com/image/fetch/$s_!B8mo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518b0479-6853-41f2-b198-298d3105d41d_777x312.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Pay no attention to the longer command line for ccc-arm. He who complains about needing to explicitly add include paths will be reeducated.</figcaption></figure></div><p>Rather than make excuses, CCC raises the revolutionary banner high and produces an executable.</p><h1>Performance: SPEC CPU2017</h1><p>Claude&#8217;s C Compiler only compiles C code. All other languages are therefore invalid. Eight of SPEC CPU2017&#8217;s workloads are C-only. I tested them on the following cores:</p><ul><li><p>Arm Cortex X925 in Nvidia&#8217;s GB10</p></li><li><p>AMD Zen 5 in the Ryzen 7 9800X3D, with boost disabled</p></li><li><p>Intel Lion Cove in the Core Ultra 9 285K</p></li></ul><p>I ran with boost disabled on the 9800X3D because running CCC-built workloads takes a lot of time. Extra time spent in high performance states mean more power consumption, and we should save every last bit of power possible to feed AI applications.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dazI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcd6f919-aeda-4c83-a94a-2fd33c012c37_1332x1098.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dazI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcd6f919-aeda-4c83-a94a-2fd33c012c37_1332x1098.png 424w, https://substackcdn.com/image/fetch/$s_!dazI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcd6f919-aeda-4c83-a94a-2fd33c012c37_1332x1098.png 848w, https://substackcdn.com/image/fetch/$s_!dazI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcd6f919-aeda-4c83-a94a-2fd33c012c37_1332x1098.png 1272w, https://substackcdn.com/image/fetch/$s_!dazI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcd6f919-aeda-4c83-a94a-2fd33c012c37_1332x1098.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dazI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcd6f919-aeda-4c83-a94a-2fd33c012c37_1332x1098.png" width="1332" height="1098" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bcd6f919-aeda-4c83-a94a-2fd33c012c37_1332x1098.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1098,&quot;width&quot;:1332,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dazI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcd6f919-aeda-4c83-a94a-2fd33c012c37_1332x1098.png 424w, https://substackcdn.com/image/fetch/$s_!dazI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcd6f919-aeda-4c83-a94a-2fd33c012c37_1332x1098.png 848w, https://substackcdn.com/image/fetch/$s_!dazI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcd6f919-aeda-4c83-a94a-2fd33c012c37_1332x1098.png 1272w, https://substackcdn.com/image/fetch/$s_!dazI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcd6f919-aeda-4c83-a94a-2fd33c012c37_1332x1098.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>CCC&#8217;s version of 502.gcc consistently crashed with a segmentation fault on Cortex X925, even though it managed to complete some runs on x86-64. When it ran to completion on x86-64, CCC&#8217;s 502.gcc delivered 23.6% and 27.1% of the performance of its GCC-compiled counterpart on Lion Cove and Zen 5, respectively. CCC hit its relative performance peak on 505.mcf, where it managed less than a 35% regression versus GCC. Across the eight tested SPEC CPU2017 workloads, using CCC on average regressed performance by somewhere above 70%.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fMKF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25049ddb-39c0-4dd6-8da4-be55c59e3202_1240x739.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fMKF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25049ddb-39c0-4dd6-8da4-be55c59e3202_1240x739.png 424w, https://substackcdn.com/image/fetch/$s_!fMKF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25049ddb-39c0-4dd6-8da4-be55c59e3202_1240x739.png 848w, https://substackcdn.com/image/fetch/$s_!fMKF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25049ddb-39c0-4dd6-8da4-be55c59e3202_1240x739.png 1272w, https://substackcdn.com/image/fetch/$s_!fMKF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25049ddb-39c0-4dd6-8da4-be55c59e3202_1240x739.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fMKF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25049ddb-39c0-4dd6-8da4-be55c59e3202_1240x739.png" width="1240" height="739" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/25049ddb-39c0-4dd6-8da4-be55c59e3202_1240x739.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:739,&quot;width&quot;:1240,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fMKF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25049ddb-39c0-4dd6-8da4-be55c59e3202_1240x739.png 424w, https://substackcdn.com/image/fetch/$s_!fMKF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25049ddb-39c0-4dd6-8da4-be55c59e3202_1240x739.png 848w, https://substackcdn.com/image/fetch/$s_!fMKF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25049ddb-39c0-4dd6-8da4-be55c59e3202_1240x739.png 1272w, https://substackcdn.com/image/fetch/$s_!fMKF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25049ddb-39c0-4dd6-8da4-be55c59e3202_1240x739.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Arm&#8217;s Cortex X925 suffers hardest from the CCC switch. With GCC, Arm&#8217;s core was very much a match for Zen 5 and Lion Cove. Now, it loses in every workload except for 500.perlbench, where Zen 5 takes a dive to let X925 slip ahead. But X925 still can&#8217;t compare to Lion Cove in that workload.</p><p>In turn, Lion Cove can&#8217;t compare to Sun&#8217;s UltraSPARC IV+. SPEC CPU2017 scores are speedup ratios relative to a reference machine, which is &#8220;a historical Sun Microsystems server, the Sun Fire V490 with 2100 MHz UltraSPARC-IV+&#8221;. Lion Cove may have the best 500.perlbench score of the three cores tested here at 0.76, but that&#8217;s below 1. A vibe-coded compiler is the product of the latest software development techniques and can&#8217;t be at fault. It therefore follows that Sun&#8217;s UltraSPARC-IV+ is amazing. The UltraSPARC-IV+ is a dual core version of the UltraSPARC III, which uses a 4-wide in-order pipeline. Lion Cove is an 8-wide out-of-order design with massive reordering capacity and much higher clock speeds, but clearly none of that is necessary to win.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!52zC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d569e72-4f9f-4069-9588-accc8f1c5bcf_1515x1107.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!52zC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d569e72-4f9f-4069-9588-accc8f1c5bcf_1515x1107.png 424w, https://substackcdn.com/image/fetch/$s_!52zC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d569e72-4f9f-4069-9588-accc8f1c5bcf_1515x1107.png 848w, https://substackcdn.com/image/fetch/$s_!52zC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d569e72-4f9f-4069-9588-accc8f1c5bcf_1515x1107.png 1272w, https://substackcdn.com/image/fetch/$s_!52zC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d569e72-4f9f-4069-9588-accc8f1c5bcf_1515x1107.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!52zC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d569e72-4f9f-4069-9588-accc8f1c5bcf_1515x1107.png" width="1456" height="1064" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3d569e72-4f9f-4069-9588-accc8f1c5bcf_1515x1107.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1064,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!52zC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d569e72-4f9f-4069-9588-accc8f1c5bcf_1515x1107.png 424w, https://substackcdn.com/image/fetch/$s_!52zC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d569e72-4f9f-4069-9588-accc8f1c5bcf_1515x1107.png 848w, https://substackcdn.com/image/fetch/$s_!52zC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d569e72-4f9f-4069-9588-accc8f1c5bcf_1515x1107.png 1272w, https://substackcdn.com/image/fetch/$s_!52zC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d569e72-4f9f-4069-9588-accc8f1c5bcf_1515x1107.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Performance is overrated. Analysts today agree that IPC is more important than performance, and CCC code achieves excellent IPC. On 525.x264, Arrow Lake averages 6.09 IPC when running CCC code. That&#8217;s the highest IPC average across all SPEC CPU2017 runs above. GCC&#8217;s best IPC average tops out at a wimpy 5.5. CCC&#8217;s IPC advantage isn&#8217;t universal though, and GCC does manage higher IPC on 538.imagick.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LaN-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e27327c-6851-42fe-ae3e-f0e857a4006b_1363x740.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LaN-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e27327c-6851-42fe-ae3e-f0e857a4006b_1363x740.png 424w, https://substackcdn.com/image/fetch/$s_!LaN-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e27327c-6851-42fe-ae3e-f0e857a4006b_1363x740.png 848w, https://substackcdn.com/image/fetch/$s_!LaN-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e27327c-6851-42fe-ae3e-f0e857a4006b_1363x740.png 1272w, https://substackcdn.com/image/fetch/$s_!LaN-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e27327c-6851-42fe-ae3e-f0e857a4006b_1363x740.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LaN-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e27327c-6851-42fe-ae3e-f0e857a4006b_1363x740.png" width="1363" height="740" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e27327c-6851-42fe-ae3e-f0e857a4006b_1363x740.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:740,&quot;width&quot;:1363,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LaN-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e27327c-6851-42fe-ae3e-f0e857a4006b_1363x740.png 424w, https://substackcdn.com/image/fetch/$s_!LaN-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e27327c-6851-42fe-ae3e-f0e857a4006b_1363x740.png 848w, https://substackcdn.com/image/fetch/$s_!LaN-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e27327c-6851-42fe-ae3e-f0e857a4006b_1363x740.png 1272w, https://substackcdn.com/image/fetch/$s_!LaN-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e27327c-6851-42fe-ae3e-f0e857a4006b_1363x740.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Instructions for CCC-compiled 502.gcc may be undercounted on aarch64, because the test didn&#8217;t complete.</figcaption></figure></div><p>As foreshadowed with the cache and memory latency microbenchmark, CCC takes a great leap forward in instruction counts across all eight SPEC CPU2017 workloads. Some workloads execute more than 10x as many instructions.</p><h1>A Top-Down View</h1><p>Hardware performance monitoring reveals a different set of challenges for the CPU pipeline when comparing GCC- and CCC-compiled code. Branch prediction is less important because branch count didn&#8217;t grow proportionately to total instruction count. SPEC CPU2017&#8217;s three C-only floating point workloads become very core-bound when built with CCC. On Zen 5, that means retirement is blocked by an instruction that doesn&#8217;t read from memory. That indicates performance is limited by how fast the core can execute instructions, rather than how fast the memory subsystem can feed the core with data. The five integer workloads become less frontend bound, and face steeper backend challenges. CCC workloads across the board achieve higher core utilization than their GCC-compiled counterparts, as the IPC figures from before suggest.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_LZa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F504b9a1b-bf92-4abe-80de-1b3f4e56d237_1600x1262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_LZa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F504b9a1b-bf92-4abe-80de-1b3f4e56d237_1600x1262.png 424w, https://substackcdn.com/image/fetch/$s_!_LZa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F504b9a1b-bf92-4abe-80de-1b3f4e56d237_1600x1262.png 848w, https://substackcdn.com/image/fetch/$s_!_LZa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F504b9a1b-bf92-4abe-80de-1b3f4e56d237_1600x1262.png 1272w, https://substackcdn.com/image/fetch/$s_!_LZa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F504b9a1b-bf92-4abe-80de-1b3f4e56d237_1600x1262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_LZa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F504b9a1b-bf92-4abe-80de-1b3f4e56d237_1600x1262.png" width="1456" height="1148" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/504b9a1b-bf92-4abe-80de-1b3f4e56d237_1600x1262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1148,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_LZa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F504b9a1b-bf92-4abe-80de-1b3f4e56d237_1600x1262.png 424w, https://substackcdn.com/image/fetch/$s_!_LZa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F504b9a1b-bf92-4abe-80de-1b3f4e56d237_1600x1262.png 848w, https://substackcdn.com/image/fetch/$s_!_LZa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F504b9a1b-bf92-4abe-80de-1b3f4e56d237_1600x1262.png 1272w, https://substackcdn.com/image/fetch/$s_!_LZa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F504b9a1b-bf92-4abe-80de-1b3f4e56d237_1600x1262.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>500.perlbench is a catastrophically bad case for Zen 5 when compiled with CCC because of a much lower op cache hitrate. Zen 5 primarily feeds itself with a 6K entry, 16-way op cache, which can fully feed the 8-wide rename/allocate stage. 4-wide, per-thread decoders handle larger instruction footprints. Typically applications with poor code locality suffer from other IPC limiters besides frontend throughput, especially when the core isn&#8217;t running a second thread that can hide latency. 500.perlbench has higher IPC potential if not for the per-thread decode throughput limitation, putting Zen 5 behind X925 and very far behind Lion Cove. It&#8217;s the first case I&#8217;ve seen outside a microbenchmark where Zen 5&#8217;s decoder arrangement comes out to bite. Zen 5&#8217;s op cache manages high coverage in other workloads, so 500.perlbench is an exceptional case.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KmW9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b5a3caa-1634-4a15-a629-8b86ba62a6b5_1062x656.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KmW9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b5a3caa-1634-4a15-a629-8b86ba62a6b5_1062x656.png 424w, https://substackcdn.com/image/fetch/$s_!KmW9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b5a3caa-1634-4a15-a629-8b86ba62a6b5_1062x656.png 848w, https://substackcdn.com/image/fetch/$s_!KmW9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b5a3caa-1634-4a15-a629-8b86ba62a6b5_1062x656.png 1272w, https://substackcdn.com/image/fetch/$s_!KmW9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b5a3caa-1634-4a15-a629-8b86ba62a6b5_1062x656.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KmW9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b5a3caa-1634-4a15-a629-8b86ba62a6b5_1062x656.png" width="1062" height="656" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7b5a3caa-1634-4a15-a629-8b86ba62a6b5_1062x656.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:656,&quot;width&quot;:1062,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KmW9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b5a3caa-1634-4a15-a629-8b86ba62a6b5_1062x656.png 424w, https://substackcdn.com/image/fetch/$s_!KmW9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b5a3caa-1634-4a15-a629-8b86ba62a6b5_1062x656.png 848w, https://substackcdn.com/image/fetch/$s_!KmW9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b5a3caa-1634-4a15-a629-8b86ba62a6b5_1062x656.png 1272w, https://substackcdn.com/image/fetch/$s_!KmW9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b5a3caa-1634-4a15-a629-8b86ba62a6b5_1062x656.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>CCC challenges Lion Cove&#8217;s pipeline in a similar fashion, though Lion Cove is generally more backend bound than Zen 5. This could be because Lion Cove sits atop a higher latency uncore. Or, the difference could be down to accounting methodology. For example, Intel might be counting cycles where any instruction is delayed by a load as &#8220;memory bound&#8221;, which could lead to higher counts than AMD&#8217;s approach. Lion Cove&#8217;s 8-wide decoder excels in 500.perlbench, where frontend limitations are minor compared to what they were in Zen 5.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ofi7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ca94d31-2163-42f8-8405-31d2db673c66_1600x1339.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ofi7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ca94d31-2163-42f8-8405-31d2db673c66_1600x1339.png 424w, https://substackcdn.com/image/fetch/$s_!Ofi7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ca94d31-2163-42f8-8405-31d2db673c66_1600x1339.png 848w, https://substackcdn.com/image/fetch/$s_!Ofi7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ca94d31-2163-42f8-8405-31d2db673c66_1600x1339.png 1272w, https://substackcdn.com/image/fetch/$s_!Ofi7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ca94d31-2163-42f8-8405-31d2db673c66_1600x1339.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ofi7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ca94d31-2163-42f8-8405-31d2db673c66_1600x1339.png" width="1456" height="1218" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0ca94d31-2163-42f8-8405-31d2db673c66_1600x1339.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1218,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ofi7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ca94d31-2163-42f8-8405-31d2db673c66_1600x1339.png 424w, https://substackcdn.com/image/fetch/$s_!Ofi7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ca94d31-2163-42f8-8405-31d2db673c66_1600x1339.png 848w, https://substackcdn.com/image/fetch/$s_!Ofi7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ca94d31-2163-42f8-8405-31d2db673c66_1600x1339.png 1272w, https://substackcdn.com/image/fetch/$s_!Ofi7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ca94d31-2163-42f8-8405-31d2db673c66_1600x1339.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Arm&#8217;s Cortex X925 is very core-bound when running CCC code, and not just in the three FP workloads. I&#8217;m distinguishing between core and memory bound backend bound slots by looking at the proportion of backend bound cycles covered by the STALL_BACKEND_MEMBOUND event, which counts when there&#8217;s a backend stall and the &#8220;backend interface to memory is busy or stalled&#8221;.</p><p>I partially attribute the core-bound slot increase to Cortex X925 not having the robust move elimination and zero-latency store forwarding that Lion Cove and Zen 5 do. Long chains of register-to-register MOVs or frequent trips through the stack are rare in typical code. As evidence, Intel's Ice Lake only<a href="https://www.realworldtech.com/forum/?threadid=200635&amp;curpostid=200798"> took a 1-3% performance loss</a> with move elimination disabled. Move elimination therefore acts as a cherry on top in most applications, with most performance coming from out-of-order execution basics and an efficient cache hierarchy to feed the core. Of course, that changes with CCC.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5a54!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9949c1f-be69-41f3-91fe-63cdf200efc2_1600x1271.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5a54!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9949c1f-be69-41f3-91fe-63cdf200efc2_1600x1271.png 424w, https://substackcdn.com/image/fetch/$s_!5a54!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9949c1f-be69-41f3-91fe-63cdf200efc2_1600x1271.png 848w, https://substackcdn.com/image/fetch/$s_!5a54!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9949c1f-be69-41f3-91fe-63cdf200efc2_1600x1271.png 1272w, https://substackcdn.com/image/fetch/$s_!5a54!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9949c1f-be69-41f3-91fe-63cdf200efc2_1600x1271.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5a54!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9949c1f-be69-41f3-91fe-63cdf200efc2_1600x1271.png" width="1456" height="1157" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c9949c1f-be69-41f3-91fe-63cdf200efc2_1600x1271.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1157,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5a54!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9949c1f-be69-41f3-91fe-63cdf200efc2_1600x1271.png 424w, https://substackcdn.com/image/fetch/$s_!5a54!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9949c1f-be69-41f3-91fe-63cdf200efc2_1600x1271.png 848w, https://substackcdn.com/image/fetch/$s_!5a54!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9949c1f-be69-41f3-91fe-63cdf200efc2_1600x1271.png 1272w, https://substackcdn.com/image/fetch/$s_!5a54!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9949c1f-be69-41f3-91fe-63cdf200efc2_1600x1271.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>519.lbm&#8217;s large increase in backend memory bound slots looks related to store forwarding latency. A hot section in CCC-compiled 519.lbm repeatedly spills register x0 to the stack, only to reload it several instructions later, making sure the memory system contributes to all phases of code execution. Store forwarding latency forms part of a long dependency chain, and X925 can&#8217;t do zero-latency forwarding the way Zen 5 and Lion Cove can.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hMsM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02b8aec-dda2-40d6-a3f0-64b48f03584b_967x1173.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hMsM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02b8aec-dda2-40d6-a3f0-64b48f03584b_967x1173.png 424w, https://substackcdn.com/image/fetch/$s_!hMsM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02b8aec-dda2-40d6-a3f0-64b48f03584b_967x1173.png 848w, https://substackcdn.com/image/fetch/$s_!hMsM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02b8aec-dda2-40d6-a3f0-64b48f03584b_967x1173.png 1272w, https://substackcdn.com/image/fetch/$s_!hMsM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02b8aec-dda2-40d6-a3f0-64b48f03584b_967x1173.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hMsM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02b8aec-dda2-40d6-a3f0-64b48f03584b_967x1173.png" width="967" height="1173" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c02b8aec-dda2-40d6-a3f0-64b48f03584b_967x1173.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1173,&quot;width&quot;:967,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hMsM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02b8aec-dda2-40d6-a3f0-64b48f03584b_967x1173.png 424w, https://substackcdn.com/image/fetch/$s_!hMsM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02b8aec-dda2-40d6-a3f0-64b48f03584b_967x1173.png 848w, https://substackcdn.com/image/fetch/$s_!hMsM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02b8aec-dda2-40d6-a3f0-64b48f03584b_967x1173.png 1272w, https://substackcdn.com/image/fetch/$s_!hMsM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc02b8aec-dda2-40d6-a3f0-64b48f03584b_967x1173.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">519.lbm code generated from CCC, broken into with GDB</figcaption></figure></div><h1>Final Words</h1><p>Compilers are crucial to modern software development, and performance of compiled code <a href="https://compilers.iecc.com/comparch/article/97-10-017">is key to making high level languages practical</a>. Switching to CCC causes a 70%+ performance degradation across C-only SPEC CPU2017 workloads. Most consumers will readily accept this, because a fast core like Intel&#8217;s Lion Cove still does well when running CCC code. As evidence, AMD&#8217;s Bulldozer architecture fascinated enthusiasts about 15 years ago. Bulldozer is 36.8% faster than Lion Cove, the fastest CPU tested here for running CCC code. Simply overclocking Lion Cove to 8 GHz could close much of the performance gap.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y2nW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5789f40f-c509-48bc-8ef1-47768356a302_1101x672.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y2nW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5789f40f-c509-48bc-8ef1-47768356a302_1101x672.png 424w, https://substackcdn.com/image/fetch/$s_!y2nW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5789f40f-c509-48bc-8ef1-47768356a302_1101x672.png 848w, https://substackcdn.com/image/fetch/$s_!y2nW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5789f40f-c509-48bc-8ef1-47768356a302_1101x672.png 1272w, https://substackcdn.com/image/fetch/$s_!y2nW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5789f40f-c509-48bc-8ef1-47768356a302_1101x672.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y2nW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5789f40f-c509-48bc-8ef1-47768356a302_1101x672.png" width="1101" height="672" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5789f40f-c509-48bc-8ef1-47768356a302_1101x672.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:672,&quot;width&quot;:1101,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y2nW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5789f40f-c509-48bc-8ef1-47768356a302_1101x672.png 424w, https://substackcdn.com/image/fetch/$s_!y2nW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5789f40f-c509-48bc-8ef1-47768356a302_1101x672.png 848w, https://substackcdn.com/image/fetch/$s_!y2nW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5789f40f-c509-48bc-8ef1-47768356a302_1101x672.png 1272w, https://substackcdn.com/image/fetch/$s_!y2nW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5789f40f-c509-48bc-8ef1-47768356a302_1101x672.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Of course, some provocateurs may <a href="https://valid.x86.fr/c5a5te">only reach 7.5 GHz</a>, undoubtedly due to a combination of personal skill issues and insufficient enthusiasm for the AI revolution. Severe agitators may theorize that Bulldozer&#8217;s level of single threaded performance is not top-notch in some scenarios, and demand ever more performance. To prevent these people from discrediting the revolution, the stock performance problem must be addressed.</p><p>Performance problems can be addressed from either hardware or software, because hardware and software optimization go hand-in-hand. Well optimized software performs well across a wide variety of hardware. Likewise, well designed hardware delivers good performance across a wide range of software. CCC-generated code dramatically shifts what software demands from hardware, upsetting the existing social contract between software and hardware. Faulting the software side would undermine the crowning achievements of the AI revolution. Therefore, hardware must boldly tackle the performance question.</p><p>Completely addressing the performance gap would be easy by my estimate. With 20-30 GHz clock speeds and several generations worth of architectural improvements, CCC code could reach the performance achieved by GCC-compiled binaries on current generation hardware. These targets might have been dismissed in the past. But they&#8217;re achievable now because the AI revolution isn&#8217;t just any revolution. Rather, it is a cultural revolution that upends views regarding power consumption. Infinite power can make many previously unthinkable designs a reality. Some hardware designers may see red flags in this approach and persistently complain. But proper hardware engineers will rally around those flags and chant slogans. By correctly following the path of the revolution, they&#8217;ll be sure to succeed and may even exceed their quotas. May the odds be ever in their favor.</p><h1>Footnotes</h1><p>* The proof is trivial and similar to the proof for the gambler&#8217;s law (99% of gamblers quit just before they win big)</p>]]></content:encoded></item><item><title><![CDATA[Microbenchmarking Chipsets for Giggles]]></title><description><![CDATA[Who cares, it's fun]]></description><link>https://chipsandcheese.com/p/microbenchmarking-chipsets-for-giggles</link><guid isPermaLink="false">https://chipsandcheese.com/p/microbenchmarking-chipsets-for-giggles</guid><dc:creator><![CDATA[Chester Lam]]></dc:creator><pubDate>Sun, 22 Mar 2026 05:46:34 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!KrbO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5b2e24-e74a-4ac6-9a5c-6d1a59028c1c_1600x1065.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Motherboard chipsets have taken on fewer and fewer performance critical functions over time. In the Athlon 64 era, integrated memory controllers took the chipset out of the DRAM access path. In the early 2010s, CPUs like Intel&#8217;s Sandy Bridge brought PCIe lanes onto the CPU, letting it interface with a GPU without chipset involvement. Chipsets today still host a lot of important IO capabilities, but they&#8217;ve become a footnote in the performance equation. Benchmarking a chipset won&#8217;t be useful, but a topic doesn&#8217;t have to be useful in order to be fun. And I do think it&#8217;ll be fun to test chipset latency.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KrbO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5b2e24-e74a-4ac6-9a5c-6d1a59028c1c_1600x1065.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KrbO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5b2e24-e74a-4ac6-9a5c-6d1a59028c1c_1600x1065.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KrbO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5b2e24-e74a-4ac6-9a5c-6d1a59028c1c_1600x1065.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KrbO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5b2e24-e74a-4ac6-9a5c-6d1a59028c1c_1600x1065.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KrbO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5b2e24-e74a-4ac6-9a5c-6d1a59028c1c_1600x1065.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KrbO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5b2e24-e74a-4ac6-9a5c-6d1a59028c1c_1600x1065.jpeg" width="1456" height="969" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a5b2e24-e74a-4ac6-9a5c-6d1a59028c1c_1600x1065.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:969,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KrbO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5b2e24-e74a-4ac6-9a5c-6d1a59028c1c_1600x1065.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KrbO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5b2e24-e74a-4ac6-9a5c-6d1a59028c1c_1600x1065.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KrbO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5b2e24-e74a-4ac6-9a5c-6d1a59028c1c_1600x1065.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KrbO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5b2e24-e74a-4ac6-9a5c-6d1a59028c1c_1600x1065.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Heatsink over the SB950 southbridge on the ASUS M5A99X EVO R2.0 board</figcaption></figure></div><h1>Testing the Chipset</h1><p>I&#8217;m using Nemes&#8217;s Vulkan-based GPU benchmark suite here, but with some modifications that let me allocate memory with the <code>VK_MEMORY_PROPERTY_HOST_COHERENT_BIT </code>set, and the <code>VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT </code>unset. That gives me a buffer allocated out of host memory. Then, I can run the Vulkan latency test against host memory, giving me GPU memory access latency over PCIe. The GPU in play doesn&#8217;t matter here, because I&#8217;m concerned with latency differences between using CPU PCIe slots and the southbridge ones. I&#8217;m using Nvidia&#8217;s T1000 because it&#8217;s a single slot card and can fit into the bottom PCIe slot in my various systems. All of the systems below run Windows 10, except for the FX-8350 system, which I have running Windows Server 2016. I&#8217;m using the latest drivers available for those operating systems, which would be 553.50 on Windows 10 and 475.14 on Windows Server 2016.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pHHP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3246e05-5816-47b7-b79c-840d801ab38d_1600x605.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pHHP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3246e05-5816-47b7-b79c-840d801ab38d_1600x605.png 424w, https://substackcdn.com/image/fetch/$s_!pHHP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3246e05-5816-47b7-b79c-840d801ab38d_1600x605.png 848w, https://substackcdn.com/image/fetch/$s_!pHHP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3246e05-5816-47b7-b79c-840d801ab38d_1600x605.png 1272w, https://substackcdn.com/image/fetch/$s_!pHHP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3246e05-5816-47b7-b79c-840d801ab38d_1600x605.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pHHP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3246e05-5816-47b7-b79c-840d801ab38d_1600x605.png" width="1456" height="551" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c3246e05-5816-47b7-b79c-840d801ab38d_1600x605.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:551,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pHHP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3246e05-5816-47b7-b79c-840d801ab38d_1600x605.png 424w, https://substackcdn.com/image/fetch/$s_!pHHP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3246e05-5816-47b7-b79c-840d801ab38d_1600x605.png 848w, https://substackcdn.com/image/fetch/$s_!pHHP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3246e05-5816-47b7-b79c-840d801ab38d_1600x605.png 1272w, https://substackcdn.com/image/fetch/$s_!pHHP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3246e05-5816-47b7-b79c-840d801ab38d_1600x605.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>I also tried testing in the other direction by having the CPU access GPU device memory. Nemes actually does that for her Vulkan uplink latency test. However, I hit unexplainable inflection points when testing across a range of test sizes. Windows performance metrics indicated a ton of page faults at larger test sizes, perhaps hinting at driver intervention. Investigating that is a rabbit hole I don&#8217;t want to go down, and the CPU to VRAM access behavior I observed with the T1000 didn&#8217;t apply in the same way across different GPUs. I&#8217;ll therefore be focusing on GPU to host memory access latency, though I&#8217;ll leave a footnote at the end for CPU-side results.</p><h2>Cache Bandwidth and Probes</h2><p>Setting the <code>VK_MEMORY_PROPERTY_HOST_COHERENT_BIT </code>means writes can be made visible between the CPU and GPU without explicit cache flush and invalidate commands. With the T1000, this has the strange side effect of causing a lot of probe traffic even when the GPU hits its own caches. GPU cache hit bandwidth can be constrained depending on platform configuration, possibly depending on probe throughput. AMD&#8217;s old Piledriver CPU has northbridge performance events that can track probe results. These events show a flood of probes when the test goes through small sizes that fit within the T1000&#8217;s caches. Probe activity dips when the test reaches larger sizes and bandwidth is constrained by the PCIe link.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ITpg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c89e779-da00-4216-a7d7-6166bc655574_903x519.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ITpg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c89e779-da00-4216-a7d7-6166bc655574_903x519.png 424w, https://substackcdn.com/image/fetch/$s_!ITpg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c89e779-da00-4216-a7d7-6166bc655574_903x519.png 848w, https://substackcdn.com/image/fetch/$s_!ITpg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c89e779-da00-4216-a7d7-6166bc655574_903x519.png 1272w, https://substackcdn.com/image/fetch/$s_!ITpg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c89e779-da00-4216-a7d7-6166bc655574_903x519.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ITpg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c89e779-da00-4216-a7d7-6166bc655574_903x519.png" width="903" height="519" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1c89e779-da00-4216-a7d7-6166bc655574_903x519.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:519,&quot;width&quot;:903,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ITpg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c89e779-da00-4216-a7d7-6166bc655574_903x519.png 424w, https://substackcdn.com/image/fetch/$s_!ITpg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c89e779-da00-4216-a7d7-6166bc655574_903x519.png 848w, https://substackcdn.com/image/fetch/$s_!ITpg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c89e779-da00-4216-a7d7-6166bc655574_903x519.png 1272w, https://substackcdn.com/image/fetch/$s_!ITpg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c89e779-da00-4216-a7d7-6166bc655574_903x519.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">By northbridge here, I mean the northbridge integrated into the CPU, not the chipset&#8217;s northbridge.</figcaption></figure></div><p>There&#8217;s a lot of mysteries here. Probes work at cacheline granularity, and AMD&#8217;s 15h BKDG makes that clear. But probe rate suggests the T1000 is initiating a probe for every 512 bytes of cache hits, not every 64. Another mystery is why cache hit latency appears unaffected.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nOp3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681da4e3-0b4b-47b4-aa25-c7eab3c9efd1_1150x247.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nOp3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681da4e3-0b4b-47b4-aa25-c7eab3c9efd1_1150x247.png 424w, https://substackcdn.com/image/fetch/$s_!nOp3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681da4e3-0b4b-47b4-aa25-c7eab3c9efd1_1150x247.png 848w, https://substackcdn.com/image/fetch/$s_!nOp3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681da4e3-0b4b-47b4-aa25-c7eab3c9efd1_1150x247.png 1272w, https://substackcdn.com/image/fetch/$s_!nOp3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681da4e3-0b4b-47b4-aa25-c7eab3c9efd1_1150x247.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nOp3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681da4e3-0b4b-47b4-aa25-c7eab3c9efd1_1150x247.png" width="1150" height="247" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/681da4e3-0b4b-47b4-aa25-c7eab3c9efd1_1150x247.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:247,&quot;width&quot;:1150,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nOp3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681da4e3-0b4b-47b4-aa25-c7eab3c9efd1_1150x247.png 424w, https://substackcdn.com/image/fetch/$s_!nOp3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681da4e3-0b4b-47b4-aa25-c7eab3c9efd1_1150x247.png 848w, https://substackcdn.com/image/fetch/$s_!nOp3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681da4e3-0b4b-47b4-aa25-c7eab3c9efd1_1150x247.png 1272w, https://substackcdn.com/image/fetch/$s_!nOp3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681da4e3-0b4b-47b4-aa25-c7eab3c9efd1_1150x247.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Documentation for the probe result event from AMD&#8217;s 15h BKDG (BIOS and Kernel Developer&#8217;s Guide)</figcaption></figure></div><p>Mysteries aside, I&#8217;ll be dropping a note when GPU cache hit bandwidth appears to be affected by the platform. But I won&#8217;t comment on GPU cache miss bandwidth, which will obviously be limited by PCIe bandwidth. The T1000 has a x16 PCIe Gen 3 interface to the host, and the newest platforms I test here can support PCIe Gen 4 or Gen 5. I don&#8217;t have a single slot card with PCIe Gen 4 or Gen 5 support.</p><h1>AMD&#8217;s AM5 Platform</h1><p>AMD&#8217;s latest desktop platform is built around the PROM21 chip. PROM21 has a PCIe 4.0 x4 uplink, and exposes a variety of downstream IO including more PCIe 4.0 links. AMD&#8217;s X670E chipset chains two PROM21 chips together, basically giving the platform two southbridge chips for better connectivity. Gigabyte&#8217;s X670E Aorus Master has three PCIe slots. One connects to the CPU, while the two others connect to the lower PROM21 chip. The PROM21 chip connected to the CPU isn&#8217;t hooked up to any PCIe slots. Instead, it&#8217;s only used to provide M.2 slots and some USB ports. The B650 chipset, tested here on the ASRock B650 PG Lightning, only uses one PROM21 chip and connects it to PCIe slots.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9IZA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0fdb0f4-99ac-4520-8421-ad06ed762922_1600x1005.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9IZA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0fdb0f4-99ac-4520-8421-ad06ed762922_1600x1005.png 424w, https://substackcdn.com/image/fetch/$s_!9IZA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0fdb0f4-99ac-4520-8421-ad06ed762922_1600x1005.png 848w, https://substackcdn.com/image/fetch/$s_!9IZA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0fdb0f4-99ac-4520-8421-ad06ed762922_1600x1005.png 1272w, https://substackcdn.com/image/fetch/$s_!9IZA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0fdb0f4-99ac-4520-8421-ad06ed762922_1600x1005.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9IZA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0fdb0f4-99ac-4520-8421-ad06ed762922_1600x1005.png" width="1456" height="915" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e0fdb0f4-99ac-4520-8421-ad06ed762922_1600x1005.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:915,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9IZA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0fdb0f4-99ac-4520-8421-ad06ed762922_1600x1005.png 424w, https://substackcdn.com/image/fetch/$s_!9IZA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0fdb0f4-99ac-4520-8421-ad06ed762922_1600x1005.png 848w, https://substackcdn.com/image/fetch/$s_!9IZA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0fdb0f4-99ac-4520-8421-ad06ed762922_1600x1005.png 1272w, https://substackcdn.com/image/fetch/$s_!9IZA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0fdb0f4-99ac-4520-8421-ad06ed762922_1600x1005.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Block diagrams of the two tested AM5 boards from their respective manuals, with green boxes drawn over tested PCIe slots</figcaption></figure></div><p>Using the CPU&#8217;s PCIe lanes gives about 650 ns of baseline latency. Going through one PROM21 chip on the B650 chipset brings latency to 1221 ns, or 569.7 ns over the baseline. On X670E, hopping over two PROM21 chips brings latency to 921.3 ns over the baseline. Going over the chipset therefore comes with a hefty latency penalty. Going through an extra PROM21 chip further increases latency.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_CSo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a2c1aa-b7dd-4243-962f-7d822dc293dc_914x657.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_CSo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a2c1aa-b7dd-4243-962f-7d822dc293dc_914x657.png 424w, https://substackcdn.com/image/fetch/$s_!_CSo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a2c1aa-b7dd-4243-962f-7d822dc293dc_914x657.png 848w, https://substackcdn.com/image/fetch/$s_!_CSo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a2c1aa-b7dd-4243-962f-7d822dc293dc_914x657.png 1272w, https://substackcdn.com/image/fetch/$s_!_CSo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a2c1aa-b7dd-4243-962f-7d822dc293dc_914x657.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_CSo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a2c1aa-b7dd-4243-962f-7d822dc293dc_914x657.png" width="914" height="657" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/57a2c1aa-b7dd-4243-962f-7d822dc293dc_914x657.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:657,&quot;width&quot;:914,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_CSo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a2c1aa-b7dd-4243-962f-7d822dc293dc_914x657.png 424w, https://substackcdn.com/image/fetch/$s_!_CSo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a2c1aa-b7dd-4243-962f-7d822dc293dc_914x657.png 848w, https://substackcdn.com/image/fetch/$s_!_CSo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a2c1aa-b7dd-4243-962f-7d822dc293dc_914x657.png 1272w, https://substackcdn.com/image/fetch/$s_!_CSo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57a2c1aa-b7dd-4243-962f-7d822dc293dc_914x657.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I saw no difference in GPU cache hit bandwidth with the T1000 connected to Zen 5&#8217;s CPU lanes. Switching the GPU to the chipset lanes dropped cache hit bandwidth to just above 25 GB/s, and that figure remains similar regardless of whether one or two PROM21 chips sat between the CPU and GPU.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gP_q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5549f663-acec-48a0-b5e5-040807bc8d08_1600x726.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gP_q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5549f663-acec-48a0-b5e5-040807bc8d08_1600x726.png 424w, https://substackcdn.com/image/fetch/$s_!gP_q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5549f663-acec-48a0-b5e5-040807bc8d08_1600x726.png 848w, https://substackcdn.com/image/fetch/$s_!gP_q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5549f663-acec-48a0-b5e5-040807bc8d08_1600x726.png 1272w, https://substackcdn.com/image/fetch/$s_!gP_q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5549f663-acec-48a0-b5e5-040807bc8d08_1600x726.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gP_q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5549f663-acec-48a0-b5e5-040807bc8d08_1600x726.png" width="1456" height="661" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5549f663-acec-48a0-b5e5-040807bc8d08_1600x726.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:661,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gP_q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5549f663-acec-48a0-b5e5-040807bc8d08_1600x726.png 424w, https://substackcdn.com/image/fetch/$s_!gP_q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5549f663-acec-48a0-b5e5-040807bc8d08_1600x726.png 848w, https://substackcdn.com/image/fetch/$s_!gP_q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5549f663-acec-48a0-b5e5-040807bc8d08_1600x726.png 1272w, https://substackcdn.com/image/fetch/$s_!gP_q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5549f663-acec-48a0-b5e5-040807bc8d08_1600x726.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><h1>Intel&#8217;s Arrow Lake Platform</h1><p>Arrow Lake chipsets follow the same topology as Intel chipsets going back to Sandy Bridge. A Platform Controller Hub (PCH) serves the southbridge role. The PCH connects to the CPU via a Direct Media Interface (DMI), which is a PCIe-like interface. Z890 uses DMI Gen 4, which can use transfer speeds of up to 16 GT/s with eight lanes. I&#8217;m testing on the MSI PRO Z890-A WIFI, which provides three PCIe slots. The first is a x16 slot connected to the CPU, while the two others are x4 slots connected to the PCH.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0g5U!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F448f24e9-3efa-480d-8918-1d71e85763b9_1271x1428.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0g5U!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F448f24e9-3efa-480d-8918-1d71e85763b9_1271x1428.png 424w, https://substackcdn.com/image/fetch/$s_!0g5U!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F448f24e9-3efa-480d-8918-1d71e85763b9_1271x1428.png 848w, https://substackcdn.com/image/fetch/$s_!0g5U!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F448f24e9-3efa-480d-8918-1d71e85763b9_1271x1428.png 1272w, https://substackcdn.com/image/fetch/$s_!0g5U!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F448f24e9-3efa-480d-8918-1d71e85763b9_1271x1428.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0g5U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F448f24e9-3efa-480d-8918-1d71e85763b9_1271x1428.png" width="1271" height="1428" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/448f24e9-3efa-480d-8918-1d71e85763b9_1271x1428.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1428,&quot;width&quot;:1271,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0g5U!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F448f24e9-3efa-480d-8918-1d71e85763b9_1271x1428.png 424w, https://substackcdn.com/image/fetch/$s_!0g5U!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F448f24e9-3efa-480d-8918-1d71e85763b9_1271x1428.png 848w, https://substackcdn.com/image/fetch/$s_!0g5U!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F448f24e9-3efa-480d-8918-1d71e85763b9_1271x1428.png 1272w, https://substackcdn.com/image/fetch/$s_!0g5U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F448f24e9-3efa-480d-8918-1d71e85763b9_1271x1428.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Block diagram of Arrow Lake&#8217;s chipset from Intel&#8217;s Chipset brief document</figcaption></figure></div><p>Baseline latency on Arrow Lake is higher than on Zen 5, with the GPU seeing 785 ns of load-to-use latency from host memory. Going through the PCH adds approximately 550 ns of latency, which is in line with what one PROM21 chip adds on the AM5 platform.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KbyZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22d66306-c9b8-4ca6-a79e-0d82c0e90191_1575x686.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KbyZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22d66306-c9b8-4ca6-a79e-0d82c0e90191_1575x686.png 424w, https://substackcdn.com/image/fetch/$s_!KbyZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22d66306-c9b8-4ca6-a79e-0d82c0e90191_1575x686.png 848w, https://substackcdn.com/image/fetch/$s_!KbyZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22d66306-c9b8-4ca6-a79e-0d82c0e90191_1575x686.png 1272w, https://substackcdn.com/image/fetch/$s_!KbyZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22d66306-c9b8-4ca6-a79e-0d82c0e90191_1575x686.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KbyZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22d66306-c9b8-4ca6-a79e-0d82c0e90191_1575x686.png" width="1456" height="634" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/22d66306-c9b8-4ca6-a79e-0d82c0e90191_1575x686.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:634,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KbyZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22d66306-c9b8-4ca6-a79e-0d82c0e90191_1575x686.png 424w, https://substackcdn.com/image/fetch/$s_!KbyZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22d66306-c9b8-4ca6-a79e-0d82c0e90191_1575x686.png 848w, https://substackcdn.com/image/fetch/$s_!KbyZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22d66306-c9b8-4ca6-a79e-0d82c0e90191_1575x686.png 1272w, https://substackcdn.com/image/fetch/$s_!KbyZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22d66306-c9b8-4ca6-a79e-0d82c0e90191_1575x686.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>I&#8217;m going to skip commenting about the T1000&#8217;s cache hit bandwidth on Arrow Lake. It does not behave as expected, even when attached to the CPU&#8217;s PCIe lanes. The T1000&#8217;s cache hit behavior also differs from that of other GPUs. For example, Intel&#8217;s Arc B580 maintains full cache hit bandwidth when attached to the Arrow Lake platform, though curiously it can&#8217;t cache host memory in its L2.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8bEy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3192f3-a3b4-499e-b8ce-d4f04d21eb7f_1600x683.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8bEy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3192f3-a3b4-499e-b8ce-d4f04d21eb7f_1600x683.png 424w, https://substackcdn.com/image/fetch/$s_!8bEy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3192f3-a3b4-499e-b8ce-d4f04d21eb7f_1600x683.png 848w, https://substackcdn.com/image/fetch/$s_!8bEy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3192f3-a3b4-499e-b8ce-d4f04d21eb7f_1600x683.png 1272w, https://substackcdn.com/image/fetch/$s_!8bEy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3192f3-a3b4-499e-b8ce-d4f04d21eb7f_1600x683.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8bEy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3192f3-a3b4-499e-b8ce-d4f04d21eb7f_1600x683.png" width="1456" height="622" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f3192f3-a3b4-499e-b8ce-d4f04d21eb7f_1600x683.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:622,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8bEy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3192f3-a3b4-499e-b8ce-d4f04d21eb7f_1600x683.png 424w, https://substackcdn.com/image/fetch/$s_!8bEy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3192f3-a3b4-499e-b8ce-d4f04d21eb7f_1600x683.png 848w, https://substackcdn.com/image/fetch/$s_!8bEy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3192f3-a3b4-499e-b8ce-d4f04d21eb7f_1600x683.png 1272w, https://substackcdn.com/image/fetch/$s_!8bEy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f3192f3-a3b4-499e-b8ce-d4f04d21eb7f_1600x683.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><h1>Intel&#8217;s Skylake Platform</h1><p>Skylake&#8217;s Z170 chipset uses a familiar southbridge-only topology like Z890, but with less connectivity and lower bandwidth. I&#8217;m testing on the MSI Z170 Gaming Pro Carbon, which connects the top PCIe x16 slot to the CPU. A middle slot also connects to the CPU to allow splitting the CPU&#8217;s 16 PCIe lanes into two sets of eight. The bottom slot connects to the PCH.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-TvM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0afdf60-59c6-4b75-b423-63bcbae4da49_742x833.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-TvM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0afdf60-59c6-4b75-b423-63bcbae4da49_742x833.png 424w, https://substackcdn.com/image/fetch/$s_!-TvM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0afdf60-59c6-4b75-b423-63bcbae4da49_742x833.png 848w, https://substackcdn.com/image/fetch/$s_!-TvM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0afdf60-59c6-4b75-b423-63bcbae4da49_742x833.png 1272w, https://substackcdn.com/image/fetch/$s_!-TvM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0afdf60-59c6-4b75-b423-63bcbae4da49_742x833.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-TvM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0afdf60-59c6-4b75-b423-63bcbae4da49_742x833.png" width="742" height="833" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0afdf60-59c6-4b75-b423-63bcbae4da49_742x833.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:833,&quot;width&quot;:742,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-TvM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0afdf60-59c6-4b75-b423-63bcbae4da49_742x833.png 424w, https://substackcdn.com/image/fetch/$s_!-TvM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0afdf60-59c6-4b75-b423-63bcbae4da49_742x833.png 848w, https://substackcdn.com/image/fetch/$s_!-TvM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0afdf60-59c6-4b75-b423-63bcbae4da49_742x833.png 1272w, https://substackcdn.com/image/fetch/$s_!-TvM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0afdf60-59c6-4b75-b423-63bcbae4da49_742x833.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Block diagram of the Z170 chipset from the MSI Z170 Gaming Pro Carbon&#8217;s manual, with the tested PCIe slots boxed in green.</figcaption></figure></div><p>Skylake&#8217;s CPU PCIe lanes provide excellent baseline latency at 535.59 ns, while going through the Z170 PCH adds 338 ns of latency. That&#8217;s high compared to regular DRAM access latency, but compares well against current generation desktop platforms.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tIb4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3ceaffb-683e-409c-8dd5-ff2545d930e8_386x528.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tIb4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3ceaffb-683e-409c-8dd5-ff2545d930e8_386x528.png 424w, https://substackcdn.com/image/fetch/$s_!tIb4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3ceaffb-683e-409c-8dd5-ff2545d930e8_386x528.png 848w, https://substackcdn.com/image/fetch/$s_!tIb4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3ceaffb-683e-409c-8dd5-ff2545d930e8_386x528.png 1272w, https://substackcdn.com/image/fetch/$s_!tIb4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3ceaffb-683e-409c-8dd5-ff2545d930e8_386x528.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tIb4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3ceaffb-683e-409c-8dd5-ff2545d930e8_386x528.png" width="386" height="528" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c3ceaffb-683e-409c-8dd5-ff2545d930e8_386x528.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:528,&quot;width&quot;:386,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!tIb4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3ceaffb-683e-409c-8dd5-ff2545d930e8_386x528.png 424w, https://substackcdn.com/image/fetch/$s_!tIb4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3ceaffb-683e-409c-8dd5-ff2545d930e8_386x528.png 848w, https://substackcdn.com/image/fetch/$s_!tIb4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3ceaffb-683e-409c-8dd5-ff2545d930e8_386x528.png 1272w, https://substackcdn.com/image/fetch/$s_!tIb4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3ceaffb-683e-409c-8dd5-ff2545d930e8_386x528.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>L1 bandwidth on the T1000 isn&#8217;t affected when the GPU is attached to CPU PCIe lanes, though L2 hit bandwidth is strangely lower (though still well above VRAM bandwidth). Switching the card to the Z170&#8217;s PCH lanes reduces cache hit bandwidth to just over 51 GB/s.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0ihd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373c1926-727d-46e1-a3b5-d41cf2d0da31_1600x682.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0ihd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373c1926-727d-46e1-a3b5-d41cf2d0da31_1600x682.png 424w, https://substackcdn.com/image/fetch/$s_!0ihd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373c1926-727d-46e1-a3b5-d41cf2d0da31_1600x682.png 848w, https://substackcdn.com/image/fetch/$s_!0ihd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373c1926-727d-46e1-a3b5-d41cf2d0da31_1600x682.png 1272w, https://substackcdn.com/image/fetch/$s_!0ihd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373c1926-727d-46e1-a3b5-d41cf2d0da31_1600x682.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0ihd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373c1926-727d-46e1-a3b5-d41cf2d0da31_1600x682.png" width="1456" height="621" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/373c1926-727d-46e1-a3b5-d41cf2d0da31_1600x682.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:621,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0ihd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373c1926-727d-46e1-a3b5-d41cf2d0da31_1600x682.png 424w, https://substackcdn.com/image/fetch/$s_!0ihd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373c1926-727d-46e1-a3b5-d41cf2d0da31_1600x682.png 848w, https://substackcdn.com/image/fetch/$s_!0ihd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373c1926-727d-46e1-a3b5-d41cf2d0da31_1600x682.png 1272w, https://substackcdn.com/image/fetch/$s_!0ihd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F373c1926-727d-46e1-a3b5-d41cf2d0da31_1600x682.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>AMD&#8217;s AM3+ Platform</h1><p>AMD&#8217;s Bulldozer and Piledriver chipsets use an older setup where the CPU has an integrated memory controller, but the chipset is still responsible for all PCIe connectivity. I&#8217;m testing on the ASUS M5A99X EVO R2.0, which implements the 990X chipset. The RD980 (990X) northbridge connects to the CPU via a 5.3 GT/s HyperTransport link, which I assume is running in 16-bit wide &#8220;ganged&#8221; mode. Two PCIe slots on the M5A99X connect to the RD980, and can run in either single x16 or 2x8 mode. A x4 A-Link Express III interface connects the northbridge to the SB950 southbridge, which provides more PCIe lanes and slower IO. A-Link Express is based on PCIe Gen 2, and runs at 5 GT/s.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2Rhz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a2e96f5-97fd-46ac-9b3a-7a54047acf63_694x897.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2Rhz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a2e96f5-97fd-46ac-9b3a-7a54047acf63_694x897.png 424w, https://substackcdn.com/image/fetch/$s_!2Rhz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a2e96f5-97fd-46ac-9b3a-7a54047acf63_694x897.png 848w, https://substackcdn.com/image/fetch/$s_!2Rhz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a2e96f5-97fd-46ac-9b3a-7a54047acf63_694x897.png 1272w, https://substackcdn.com/image/fetch/$s_!2Rhz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a2e96f5-97fd-46ac-9b3a-7a54047acf63_694x897.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2Rhz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a2e96f5-97fd-46ac-9b3a-7a54047acf63_694x897.png" width="694" height="897" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9a2e96f5-97fd-46ac-9b3a-7a54047acf63_694x897.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:897,&quot;width&quot;:694,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2Rhz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a2e96f5-97fd-46ac-9b3a-7a54047acf63_694x897.png 424w, https://substackcdn.com/image/fetch/$s_!2Rhz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a2e96f5-97fd-46ac-9b3a-7a54047acf63_694x897.png 848w, https://substackcdn.com/image/fetch/$s_!2Rhz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a2e96f5-97fd-46ac-9b3a-7a54047acf63_694x897.png 1272w, https://substackcdn.com/image/fetch/$s_!2Rhz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a2e96f5-97fd-46ac-9b3a-7a54047acf63_694x897.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Diagram of the 990X (RD980) northbridge from AMD&#8217;s datasheet</figcaption></figure></div><p>Baseline latency is very acceptable at 769.67 ns when using the 990X northbridge&#8217;s PCIe lanes, which puts it between AMD&#8217;s Zen 5 and Intel&#8217;s Arrow Lake platforms. That&#8217;s a very good latency figure considering the AM3+ platform&#8217;s fastest PCIe lanes don&#8217;t come directly from the CPU like on modern platforms.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ougb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09e2bfaa-8136-4f9b-be61-6613d435b1eb_450x544.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ougb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09e2bfaa-8136-4f9b-be61-6613d435b1eb_450x544.png 424w, https://substackcdn.com/image/fetch/$s_!ougb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09e2bfaa-8136-4f9b-be61-6613d435b1eb_450x544.png 848w, https://substackcdn.com/image/fetch/$s_!ougb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09e2bfaa-8136-4f9b-be61-6613d435b1eb_450x544.png 1272w, https://substackcdn.com/image/fetch/$s_!ougb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09e2bfaa-8136-4f9b-be61-6613d435b1eb_450x544.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ougb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09e2bfaa-8136-4f9b-be61-6613d435b1eb_450x544.png" width="450" height="544" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/09e2bfaa-8136-4f9b-be61-6613d435b1eb_450x544.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:544,&quot;width&quot;:450,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ougb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09e2bfaa-8136-4f9b-be61-6613d435b1eb_450x544.png 424w, https://substackcdn.com/image/fetch/$s_!ougb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09e2bfaa-8136-4f9b-be61-6613d435b1eb_450x544.png 848w, https://substackcdn.com/image/fetch/$s_!ougb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09e2bfaa-8136-4f9b-be61-6613d435b1eb_450x544.png 1272w, https://substackcdn.com/image/fetch/$s_!ougb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09e2bfaa-8136-4f9b-be61-6613d435b1eb_450x544.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Going through the SB950 southbridge increases latency by 602 ns. It&#8217;s a higher latency increase compared to the PCH on Arrow Lake or one PROM21 chip on B650. However, southbridge latency is still better than going through two PROM21 chips on X670E.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MN5u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52119be8-69ab-47bc-a61e-076142436620_1600x644.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MN5u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52119be8-69ab-47bc-a61e-076142436620_1600x644.png 424w, https://substackcdn.com/image/fetch/$s_!MN5u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52119be8-69ab-47bc-a61e-076142436620_1600x644.png 848w, https://substackcdn.com/image/fetch/$s_!MN5u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52119be8-69ab-47bc-a61e-076142436620_1600x644.png 1272w, https://substackcdn.com/image/fetch/$s_!MN5u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52119be8-69ab-47bc-a61e-076142436620_1600x644.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MN5u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52119be8-69ab-47bc-a61e-076142436620_1600x644.png" width="1456" height="586" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/52119be8-69ab-47bc-a61e-076142436620_1600x644.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:586,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MN5u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52119be8-69ab-47bc-a61e-076142436620_1600x644.png 424w, https://substackcdn.com/image/fetch/$s_!MN5u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52119be8-69ab-47bc-a61e-076142436620_1600x644.png 848w, https://substackcdn.com/image/fetch/$s_!MN5u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52119be8-69ab-47bc-a61e-076142436620_1600x644.png 1272w, https://substackcdn.com/image/fetch/$s_!MN5u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52119be8-69ab-47bc-a61e-076142436620_1600x644.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>On the 990X platform, the T1000&#8217;s cache hit bandwidth is capped at around 132 GB/s when connected using the northbridge PCIe lanes. With the southbridge PCIe lanes, cache hit bandwidth drops to just over 20 GB/s. Both figures are well in excess of the platform&#8217;s IO bandwidth. The 990X&#8217;s HyperTransport lanes can only deliver 10.5 GB/s in each direction, and cache hit bandwidth far exceeds that even when the T100 is attached to the southbridge. If the T1000&#8217;s cache hit bandwidth is limited by probe throughput, then the 990X chipset has 10x more probe throughput than is necessary to saturate its IO bandwidth. Of course, that&#8217;s nowhere near enough to cover GPU cache hit bandwidth.</p><h1>Sidebar: Latency from the CPU side</h1><p>As mentioned before, having the CPU access GPU VRAM is another way to measure PCIe latency. Host-visible device memory allocated on the T1000 is uncacheable from the CPU side, likely to ensure coherency. If caches aren&#8217;t in play, then small test sizes provide the best estimate of baseline latency because they don&#8217;t run into address translation penalties or page faults. Here, I&#8217;m plotting CPU-to-VRAM results at the 4 KB test size next to the GPU-to-system-memory results above. There&#8217;s some difference between the two measurement methods, but not enough to paint a different picture.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kAHI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb82b361-bbe7-4df7-b621-508754ddbe29_1582x762.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kAHI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb82b361-bbe7-4df7-b621-508754ddbe29_1582x762.png 424w, https://substackcdn.com/image/fetch/$s_!kAHI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb82b361-bbe7-4df7-b621-508754ddbe29_1582x762.png 848w, https://substackcdn.com/image/fetch/$s_!kAHI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb82b361-bbe7-4df7-b621-508754ddbe29_1582x762.png 1272w, https://substackcdn.com/image/fetch/$s_!kAHI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb82b361-bbe7-4df7-b621-508754ddbe29_1582x762.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kAHI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb82b361-bbe7-4df7-b621-508754ddbe29_1582x762.png" width="1456" height="701" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb82b361-bbe7-4df7-b621-508754ddbe29_1582x762.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:701,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kAHI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb82b361-bbe7-4df7-b621-508754ddbe29_1582x762.png 424w, https://substackcdn.com/image/fetch/$s_!kAHI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb82b361-bbe7-4df7-b621-508754ddbe29_1582x762.png 848w, https://substackcdn.com/image/fetch/$s_!kAHI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb82b361-bbe7-4df7-b621-508754ddbe29_1582x762.png 1272w, https://substackcdn.com/image/fetch/$s_!kAHI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb82b361-bbe7-4df7-b621-508754ddbe29_1582x762.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><h1>Final Words</h1><p>Chipsets today don&#8217;t take on performance critical roles, and accordingly aren&#8217;t optimized with low latency in mind. Going over chipset PCIe lanes adds hundreds of nanoseconds of latency, and imposes bandwidth restrictions. Chipsets can affect cache hit bandwidth too, possibly due to probe throughput constraints.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3LN3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91117566-86b0-4b58-a3cb-59533a501073_1600x1065.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3LN3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91117566-86b0-4b58-a3cb-59533a501073_1600x1065.jpeg 424w, https://substackcdn.com/image/fetch/$s_!3LN3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91117566-86b0-4b58-a3cb-59533a501073_1600x1065.jpeg 848w, https://substackcdn.com/image/fetch/$s_!3LN3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91117566-86b0-4b58-a3cb-59533a501073_1600x1065.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!3LN3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91117566-86b0-4b58-a3cb-59533a501073_1600x1065.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3LN3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91117566-86b0-4b58-a3cb-59533a501073_1600x1065.jpeg" width="1456" height="969" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/91117566-86b0-4b58-a3cb-59533a501073_1600x1065.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:969,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3LN3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91117566-86b0-4b58-a3cb-59533a501073_1600x1065.jpeg 424w, https://substackcdn.com/image/fetch/$s_!3LN3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91117566-86b0-4b58-a3cb-59533a501073_1600x1065.jpeg 848w, https://substackcdn.com/image/fetch/$s_!3LN3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91117566-86b0-4b58-a3cb-59533a501073_1600x1065.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!3LN3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91117566-86b0-4b58-a3cb-59533a501073_1600x1065.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">PCIe connector contacts on the T1000</figcaption></figure></div><p>Older chipsets like AMD&#8217;s 990X show it&#8217;s possible to deliver reasonable PCIe latency even with an external northbridge. But I don&#8217;t see latency optimizations coming into play on future chipsets. With the decline of multi-GPU setups, chipsets today largely serve high latency IO like SSDs and network adapters. Those devices have latency in the microsecond range, if not milliseconds. A few hundred nanoseconds won&#8217;t make a difference. The same applies for probe performance. Accessing host-coherent memory is a corner case for GPUs, which expect to mostly work out of fast on-device memory. Needless to say, cache hits for a VRAM-backed buffer won&#8217;t require sending probes to the CPU.</p><p>I expect future chipset designs to optimize for reduced cost and better connectivity options, while keeping pace with higher bandwidth demands from newer SSDs. I don&#8217;t expect latency or probe path optimizations, though it will be interesting to keep an eye out to see if behavior changes.</p>]]></content:encoded></item><item><title><![CDATA[Analyzing Nvidia GB10's GPU]]></title><description><![CDATA[Looking at Nvidia's latest effort to make a big iGPU]]></description><link>https://chipsandcheese.com/p/analyzing-nvidia-gb10s-gpu</link><guid isPermaLink="false">https://chipsandcheese.com/p/analyzing-nvidia-gb10s-gpu</guid><dc:creator><![CDATA[Chester Lam]]></dc:creator><pubDate>Sat, 14 Mar 2026 20:45:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AYcE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c9d21c-ca4a-46c1-bb7f-8d11bbbaac91_1600x901.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>GB10&#8217;s integrated GPU is its headline feature, to the extent that GB10 can be seen as a vehicle for delivering Nvidia&#8217;s Blackwell architecture in an integrated implementation. The iGPU features 48 Streaming Multiprocessors running at up to 2.55 GHz, meaning it&#8217;s basically a RTX 5070 integrated. The RTX 5070 still has an advantage thanks to a higher power budget, more cache, and more memory bandwidth. Still, GB10 undeniably has a powerful iGPU, much like AMD&#8217;s Strix Halo.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AYcE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c9d21c-ca4a-46c1-bb7f-8d11bbbaac91_1600x901.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AYcE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c9d21c-ca4a-46c1-bb7f-8d11bbbaac91_1600x901.png 424w, https://substackcdn.com/image/fetch/$s_!AYcE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c9d21c-ca4a-46c1-bb7f-8d11bbbaac91_1600x901.png 848w, https://substackcdn.com/image/fetch/$s_!AYcE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c9d21c-ca4a-46c1-bb7f-8d11bbbaac91_1600x901.png 1272w, https://substackcdn.com/image/fetch/$s_!AYcE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c9d21c-ca4a-46c1-bb7f-8d11bbbaac91_1600x901.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AYcE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c9d21c-ca4a-46c1-bb7f-8d11bbbaac91_1600x901.png" width="1456" height="820" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d3c9d21c-ca4a-46c1-bb7f-8d11bbbaac91_1600x901.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:820,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AYcE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c9d21c-ca4a-46c1-bb7f-8d11bbbaac91_1600x901.png 424w, https://substackcdn.com/image/fetch/$s_!AYcE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c9d21c-ca4a-46c1-bb7f-8d11bbbaac91_1600x901.png 848w, https://substackcdn.com/image/fetch/$s_!AYcE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c9d21c-ca4a-46c1-bb7f-8d11bbbaac91_1600x901.png 1272w, https://substackcdn.com/image/fetch/$s_!AYcE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3c9d21c-ca4a-46c1-bb7f-8d11bbbaac91_1600x901.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Nvidia&#8217;s slide from Hot Chips 2025, emphasizing GB10&#8217;s focus on AI compute</figcaption></figure></div><p>Unlike AMD&#8217;s Strix Halo, Nvidia&#8217;s GB10 focuses on AI applications. Nvidia&#8217;s proprietary CUDA ecosystem is key, because CUDA is nearly the only name in the GPU compute game. GPU compute applications are optimized first and foremost for CUDA and Nvidia GPUs. Everything else is an afterthought, if a thought at all.Combining significant GPU compute power with iGPU advantages and CUDA support gives GB10 a lot of potential. Of course, hardware has to deliver high performance for those ecosystem advantages to show. Achieving high performance involves building a balanced memory hierarchy, as well as tailoring hardware architecture to do well in targeted applications.</p><h1>Cache and Memory Access</h1><p>GB10&#8217;s GPU-side memory subsystem brings few surprises. It&#8217;s basically a Blackwell GPU integrated, with a familiar two-level caching setup. The L2 serves as both a high capacity last level cache and the first stop for L1 misses. AMD in contrast uses more cache levels and gradually increasing cache capacity at each level, which mirrors their discrete GPUs too. In a cache and memory latency test, GB10 and Strix Halo trade blows depending on test size. Larger caches are hard to optimize for high performance, so AMD wins when it&#8217;s able to hit in its smaller 256 KB L1 or 2 MB L2 compared to Nvidia&#8217;s higher latency 24 MB L2. Nvidia wins when GB10 can keep accesses within L2 while AMD has to service them out of Strix Halo&#8217;s 32 MB memory side cache. Both Strix Halo and GB10 use LPDDR5X memory. Strix Halo&#8217;s iGPU enjoys better latency to LPDDR5X than Nvidia&#8217;s GB10, reversing the situation seen on the CPU side.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aUdO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1393d830-0d11-4d8c-bfd9-e7adf47f7752_1600x741.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aUdO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1393d830-0d11-4d8c-bfd9-e7adf47f7752_1600x741.png 424w, https://substackcdn.com/image/fetch/$s_!aUdO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1393d830-0d11-4d8c-bfd9-e7adf47f7752_1600x741.png 848w, https://substackcdn.com/image/fetch/$s_!aUdO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1393d830-0d11-4d8c-bfd9-e7adf47f7752_1600x741.png 1272w, https://substackcdn.com/image/fetch/$s_!aUdO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1393d830-0d11-4d8c-bfd9-e7adf47f7752_1600x741.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aUdO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1393d830-0d11-4d8c-bfd9-e7adf47f7752_1600x741.png" width="1456" height="674" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1393d830-0d11-4d8c-bfd9-e7adf47f7752_1600x741.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:674,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aUdO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1393d830-0d11-4d8c-bfd9-e7adf47f7752_1600x741.png 424w, https://substackcdn.com/image/fetch/$s_!aUdO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1393d830-0d11-4d8c-bfd9-e7adf47f7752_1600x741.png 848w, https://substackcdn.com/image/fetch/$s_!aUdO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1393d830-0d11-4d8c-bfd9-e7adf47f7752_1600x741.png 1272w, https://substackcdn.com/image/fetch/$s_!aUdO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1393d830-0d11-4d8c-bfd9-e7adf47f7752_1600x741.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Nvidia&#8217;s L1 cache offers an impressive combination of low latency and high capacity. With dependent array accesses, it can return data nearly as fast as AMD&#8217;s 16 KB scalar cache while offering more capacity than AMD&#8217;s first level scalar and vector caches combined. However, dependent array accesses don&#8217;t tell the whole story. Pointer dereferences indicate that AMD takes a lot of overhead from array addressing calculations. Cutting that out shows that RDNA3.5&#8217;s scalar cache offers very low latency. Nvidia already had efficient array address generation, so switching to pointer dereferences makes less of a difference on GB10. Regardless of access method, GB10&#8217;s L1 offers higher capacity and lower latency than RDNA3.5&#8217;s L0 vector cache.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XczT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a135ff2-98f4-4843-a616-fd089c1ddb67_596x85.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XczT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a135ff2-98f4-4843-a616-fd089c1ddb67_596x85.png 424w, https://substackcdn.com/image/fetch/$s_!XczT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a135ff2-98f4-4843-a616-fd089c1ddb67_596x85.png 848w, https://substackcdn.com/image/fetch/$s_!XczT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a135ff2-98f4-4843-a616-fd089c1ddb67_596x85.png 1272w, https://substackcdn.com/image/fetch/$s_!XczT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a135ff2-98f4-4843-a616-fd089c1ddb67_596x85.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XczT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a135ff2-98f4-4843-a616-fd089c1ddb67_596x85.png" width="596" height="85" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7a135ff2-98f4-4843-a616-fd089c1ddb67_596x85.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:85,&quot;width&quot;:596,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XczT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a135ff2-98f4-4843-a616-fd089c1ddb67_596x85.png 424w, https://substackcdn.com/image/fetch/$s_!XczT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a135ff2-98f4-4843-a616-fd089c1ddb67_596x85.png 848w, https://substackcdn.com/image/fetch/$s_!XczT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a135ff2-98f4-4843-a616-fd089c1ddb67_596x85.png 1272w, https://substackcdn.com/image/fetch/$s_!XczT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a135ff2-98f4-4843-a616-fd089c1ddb67_596x85.png 1456w" sizes="100vw"></picture><div></div></div></a><figcaption class="image-caption">Measurements taken at 4 KB</figcaption></figure></div><p>GB10&#8217;s system level cache is invisible in the latency plot above because it&#8217;s smaller than the GPU&#8217;s 24 MB L2. It also doesn&#8217;t seem to maintain a mostly exclusive relationship with the GPU L2, so its capacity isn&#8217;t additive on top of the L2. Nvidia&#8217;s slides say the SLC aims to enable "power efficient data-sharing between engines", implying that feeding compute isn't its primary purpose.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MdqV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe320f23e-a05b-4af7-86ec-4934668b3872_1596x810.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MdqV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe320f23e-a05b-4af7-86ec-4934668b3872_1596x810.png 424w, https://substackcdn.com/image/fetch/$s_!MdqV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe320f23e-a05b-4af7-86ec-4934668b3872_1596x810.png 848w, https://substackcdn.com/image/fetch/$s_!MdqV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe320f23e-a05b-4af7-86ec-4934668b3872_1596x810.png 1272w, https://substackcdn.com/image/fetch/$s_!MdqV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe320f23e-a05b-4af7-86ec-4934668b3872_1596x810.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MdqV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe320f23e-a05b-4af7-86ec-4934668b3872_1596x810.png" width="1456" height="739" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e320f23e-a05b-4af7-86ec-4934668b3872_1596x810.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:739,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MdqV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe320f23e-a05b-4af7-86ec-4934668b3872_1596x810.png 424w, https://substackcdn.com/image/fetch/$s_!MdqV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe320f23e-a05b-4af7-86ec-4934668b3872_1596x810.png 848w, https://substackcdn.com/image/fetch/$s_!MdqV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe320f23e-a05b-4af7-86ec-4934668b3872_1596x810.png 1272w, https://substackcdn.com/image/fetch/$s_!MdqV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe320f23e-a05b-4af7-86ec-4934668b3872_1596x810.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>Like many modern GPUs, GB10 supports OpenCL&#8217;s Shared Virtual Memory (SVM) feature. SVM allows seamless pointer sharing by mapping a buffer into the same virtual address space on both the CPU and GPU. GB10 supports &#8220;coarse grained&#8221; SVM, meaning the buffer cannot be simultaneously mapped on both the CPU and GPU. Some GPUs with only coarse grained SVM appear to copy the entire buffer under the hood, as can be seen with Qualcomm and Arm iGPUs. Thankfully, GB10 can handle this feature without requiring a full buffer copy under the hood.</p><p>Compared to Strix Halo, GB10&#8217;s memory subsystem strategy moves high capacity caching up a level. Strix Halo&#8217;s &#8220;Infinity Cache&#8221; setup was awkward because it placed a GPU-focused cache at the other end of a system level interconnect. That places high pressure on Infinity Fabric, which AMD built to handle nearly 1 TB/s of bandwidth. GB10 in contrast uses the 24 MB graphics L2 to filter out a large portion of GPU memory traffic, so the system level interconnect primarily deals with DRAM accesses. That said, using 16 MB of cache just to optimize data sharing feels awkward too. SRAM is one of the biggest area consumers in modern chips because of how far compute has outpaced DRAM performance. A hypothetical setup that concentrates last level caching capacity in the SLC could be more area efficient, and could boost CPU performance.</p><h2>Bandwidth Measurements</h2><p>48 SMs mean 48 L1 cache instances, giving GB10&#8217;s iGPU impressive cache hit bandwidth. With Nemes&#8217;s Vulkan based benchmark, GB10 easily outpaces Strix Halo. Strix Halo&#8217;s iGPU can achieve around 10 TB/s using a narrowly targeted OpenCL kernel that can&#8217;t work with arbitrary test sizes, but GB10 is still ahead even with that methodology. Nvidia&#8217;s advantage continues at L2, which offers both higher bandwidth and higher capacity than AMD&#8217;s 2 MB L2. Needless to say, Nvidia&#8217;s L2 continues to compare well as test sizes increase and Strix Halo has to use its 32 MB Infinity Cache. Finally, both iGPUs achieve similar bandwidth from system memory, with GB10 taking a slight lead with faster LPDDR5X.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!306E!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6291f7-e153-430c-8860-f8223db08440_1600x700.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!306E!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6291f7-e153-430c-8860-f8223db08440_1600x700.png 424w, https://substackcdn.com/image/fetch/$s_!306E!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6291f7-e153-430c-8860-f8223db08440_1600x700.png 848w, https://substackcdn.com/image/fetch/$s_!306E!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6291f7-e153-430c-8860-f8223db08440_1600x700.png 1272w, https://substackcdn.com/image/fetch/$s_!306E!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6291f7-e153-430c-8860-f8223db08440_1600x700.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!306E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6291f7-e153-430c-8860-f8223db08440_1600x700.png" width="1456" height="637" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e6291f7-e153-430c-8860-f8223db08440_1600x700.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:637,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!306E!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6291f7-e153-430c-8860-f8223db08440_1600x700.png 424w, https://substackcdn.com/image/fetch/$s_!306E!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6291f7-e153-430c-8860-f8223db08440_1600x700.png 848w, https://substackcdn.com/image/fetch/$s_!306E!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6291f7-e153-430c-8860-f8223db08440_1600x700.png 1272w, https://substackcdn.com/image/fetch/$s_!306E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e6291f7-e153-430c-8860-f8223db08440_1600x700.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Both GPUs use write-through L1 caches, so writes from compute kernels go straight to L2. GB10&#8217;s L2 handles writes at half rate, and provides just under 1 TB/s of write bandwidth. Write bandwidth on Strix Halo is somewhat lower at approximately 700 GB/s. Write behavior on both iGPUs<a href="https://clamtech.org/?dest=gpuwrite"> is similar to that of larger discrete GPUs</a>.</p><h1>L1 and Shared Memory Capacity</h1><p>Nvidia&#8217;s GPUs use a single block of SM-private memory for both L1 cache and Shared Memory. Shared Memory is a software managed scratchpad private to a group of threads. It fills an analogous role to AMD&#8217;s Local Data Share (LDS), letting code explicitly keep data close to compute when data movement is predictable. A larger L1/Shared Memory block can let each group of threads use more scratchpad space with lower impact to occupancy or L1 cache capacity. Nvidia&#8217;s datacenter GPUs get 256 KB of L1/Shared Memory per SM. GB10 does not.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hhep!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53554a71-875b-4123-8776-c7d0bced12e2_1600x649.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hhep!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53554a71-875b-4123-8776-c7d0bced12e2_1600x649.png 424w, https://substackcdn.com/image/fetch/$s_!hhep!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53554a71-875b-4123-8776-c7d0bced12e2_1600x649.png 848w, https://substackcdn.com/image/fetch/$s_!hhep!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53554a71-875b-4123-8776-c7d0bced12e2_1600x649.png 1272w, https://substackcdn.com/image/fetch/$s_!hhep!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53554a71-875b-4123-8776-c7d0bced12e2_1600x649.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hhep!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53554a71-875b-4123-8776-c7d0bced12e2_1600x649.png" width="1456" height="591" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/53554a71-875b-4123-8776-c7d0bced12e2_1600x649.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:591,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hhep!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53554a71-875b-4123-8776-c7d0bced12e2_1600x649.png 424w, https://substackcdn.com/image/fetch/$s_!hhep!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53554a71-875b-4123-8776-c7d0bced12e2_1600x649.png 848w, https://substackcdn.com/image/fetch/$s_!hhep!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53554a71-875b-4123-8776-c7d0bced12e2_1600x649.png 1272w, https://substackcdn.com/image/fetch/$s_!hhep!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53554a71-875b-4123-8776-c7d0bced12e2_1600x649.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Instead, GB10 behaves much like Nvidia&#8217;s consumer Blackwell architecture and has 128 KB of L1/Shared Memory. In exchange for lower L1/Shared Memory capacity, GB10 achieves 13 ns of latency in Shared Memory using dependent array accesses. That&#8217;s slightly better than 14.84 ns for B200.</p><h1>Instruction Caching</h1><p>Instruction fetch behaves as expected for any Blackwell GPU. Each SM sub-partition has a 32 KB L0 instruction cache, which can fully feed a partition&#8217;s maximum throughput of one instruction per cycle. Blackwell uses fixed length 16-byte instructions, so the A 128 KB L1 instruction cache feeds all four partitions, and can also deliver one instruction per cycle. Bandwidth from the L1I is shared by all four partitions, and per-partition throughput drops to 0.5 IPC when two partitions contend for L1I access. GPU shader programs are typically small and simple compared to CPU programs, so I suspect most code fetches will hit in the 32 KB L0 instruction caches.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AJyO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a2543b-217b-4490-bc83-9424ff2e1a2b_1031x522.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AJyO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a2543b-217b-4490-bc83-9424ff2e1a2b_1031x522.png 424w, https://substackcdn.com/image/fetch/$s_!AJyO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a2543b-217b-4490-bc83-9424ff2e1a2b_1031x522.png 848w, https://substackcdn.com/image/fetch/$s_!AJyO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a2543b-217b-4490-bc83-9424ff2e1a2b_1031x522.png 1272w, https://substackcdn.com/image/fetch/$s_!AJyO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a2543b-217b-4490-bc83-9424ff2e1a2b_1031x522.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AJyO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a2543b-217b-4490-bc83-9424ff2e1a2b_1031x522.png" width="1031" height="522" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04a2543b-217b-4490-bc83-9424ff2e1a2b_1031x522.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:522,&quot;width&quot;:1031,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AJyO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a2543b-217b-4490-bc83-9424ff2e1a2b_1031x522.png 424w, https://substackcdn.com/image/fetch/$s_!AJyO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a2543b-217b-4490-bc83-9424ff2e1a2b_1031x522.png 848w, https://substackcdn.com/image/fetch/$s_!AJyO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a2543b-217b-4490-bc83-9424ff2e1a2b_1031x522.png 1272w, https://substackcdn.com/image/fetch/$s_!AJyO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a2543b-217b-4490-bc83-9424ff2e1a2b_1031x522.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>RDNA3.5 Workgroup Processors use a simpler instruction caching strategy, with one 32 KB instruction cache servicing four SIMDs. Bringing a second wave into play does not cause bandwidth contention at the L1I, though two waves executing different code sections do contend for L1I capacity. Running code from L2 leads to poor instruction throughput on both architectures.</p><h1>Compute Performance</h1><p>If Strix Halo had high compute throughput for an integrated GPU, GB10 takes things a step further. Strix Halo&#8217;s iGPU has 20 Workgroup Processors (WGPs) to GB10&#8217;s 48 SMs. These basic building blocks are the closest analogy to cores on a CPU. RDNA3.5 WGPs do have doubled execution units for basic operations and higher clocks. But even in the best case, Strix Halo&#8217;s iGPU lands a bit behind GB10&#8217;s.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LceQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1fe6a60-a0f2-4ff9-9f1a-a5fb0b0b4981_1201x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LceQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1fe6a60-a0f2-4ff9-9f1a-a5fb0b0b4981_1201x768.png 424w, https://substackcdn.com/image/fetch/$s_!LceQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1fe6a60-a0f2-4ff9-9f1a-a5fb0b0b4981_1201x768.png 848w, https://substackcdn.com/image/fetch/$s_!LceQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1fe6a60-a0f2-4ff9-9f1a-a5fb0b0b4981_1201x768.png 1272w, https://substackcdn.com/image/fetch/$s_!LceQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1fe6a60-a0f2-4ff9-9f1a-a5fb0b0b4981_1201x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LceQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1fe6a60-a0f2-4ff9-9f1a-a5fb0b0b4981_1201x768.png" width="1201" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a1fe6a60-a0f2-4ff9-9f1a-a5fb0b0b4981_1201x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1201,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LceQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1fe6a60-a0f2-4ff9-9f1a-a5fb0b0b4981_1201x768.png 424w, https://substackcdn.com/image/fetch/$s_!LceQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1fe6a60-a0f2-4ff9-9f1a-a5fb0b0b4981_1201x768.png 848w, https://substackcdn.com/image/fetch/$s_!LceQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1fe6a60-a0f2-4ff9-9f1a-a5fb0b0b4981_1201x768.png 1272w, https://substackcdn.com/image/fetch/$s_!LceQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1fe6a60-a0f2-4ff9-9f1a-a5fb0b0b4981_1201x768.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Both GPUs have low FP64 throughput, another characteristic that sets them apart from datacenter ones. GB10 has a 1:64 FP64 to FP32 ratio, while Strix Halo has a 1:32 ratio.</p><h1>Which Blackwell?</h1><p>Nvidia&#8217;s GB10 presentation at Hot Chips 2025 started by talking about scaling their Blackwell architecture from server racks, to single GB300 GPUs, to the GB10 chip in DGX Spark. GB300, like B200, uses Nvidia&#8217;s B100 datacenter Blackwell architecture. The datacenter distinction is important. Ever since Pascal, Nvidia&#8217;s datacenter and consumer GPU architectures look quite different even when they use the same name. Cache setup and FP64 performance are often differentiating factors.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Vl00!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61a449d0-3ce4-4fb6-81e5-fcc89f8498de_1600x898.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Vl00!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61a449d0-3ce4-4fb6-81e5-fcc89f8498de_1600x898.png 424w, https://substackcdn.com/image/fetch/$s_!Vl00!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61a449d0-3ce4-4fb6-81e5-fcc89f8498de_1600x898.png 848w, https://substackcdn.com/image/fetch/$s_!Vl00!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61a449d0-3ce4-4fb6-81e5-fcc89f8498de_1600x898.png 1272w, https://substackcdn.com/image/fetch/$s_!Vl00!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61a449d0-3ce4-4fb6-81e5-fcc89f8498de_1600x898.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Vl00!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61a449d0-3ce4-4fb6-81e5-fcc89f8498de_1600x898.png" width="1456" height="817" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/61a449d0-3ce4-4fb6-81e5-fcc89f8498de_1600x898.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:817,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Vl00!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61a449d0-3ce4-4fb6-81e5-fcc89f8498de_1600x898.png 424w, https://substackcdn.com/image/fetch/$s_!Vl00!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61a449d0-3ce4-4fb6-81e5-fcc89f8498de_1600x898.png 848w, https://substackcdn.com/image/fetch/$s_!Vl00!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61a449d0-3ce4-4fb6-81e5-fcc89f8498de_1600x898.png 1272w, https://substackcdn.com/image/fetch/$s_!Vl00!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61a449d0-3ce4-4fb6-81e5-fcc89f8498de_1600x898.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">One of the opening slides in Nvidia&#8217;s presentation on GB10 at Hot Chips 2025</figcaption></figure></div><p>GB10&#8217;s GPU uses Blackwell&#8217;s consumer variant. It supports compute capability 12.1, putting under the same column that describes Nvidia&#8217;s RTX 5xxx consumer GPUs in <a href="https://docs.nvidia.com/cuda/cuda-programming-guide/05-appendices/compute-capabilities.html#family2">Nvidia's chart</a>. B200 in contrast<a href="https://developer.nvidia.com/cuda/gpus"> has compute capability 10.0</a>. Compared to GB10&#8217;s SMs, datacenter variants can keep more work in flight, have more L1/Shared Memory, and more FP64 units. 5th generation Tensor Core features are also limited to the datacenter variant.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wlTv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd95409-5010-4619-b2d3-530e8db354ce_599x108.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wlTv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd95409-5010-4619-b2d3-530e8db354ce_599x108.png 424w, https://substackcdn.com/image/fetch/$s_!wlTv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd95409-5010-4619-b2d3-530e8db354ce_599x108.png 848w, https://substackcdn.com/image/fetch/$s_!wlTv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd95409-5010-4619-b2d3-530e8db354ce_599x108.png 1272w, https://substackcdn.com/image/fetch/$s_!wlTv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd95409-5010-4619-b2d3-530e8db354ce_599x108.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wlTv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd95409-5010-4619-b2d3-530e8db354ce_599x108.png" width="599" height="108" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8cd95409-5010-4619-b2d3-530e8db354ce_599x108.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:108,&quot;width&quot;:599,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9293,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/190912586?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd95409-5010-4619-b2d3-530e8db354ce_599x108.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wlTv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd95409-5010-4619-b2d3-530e8db354ce_599x108.png 424w, https://substackcdn.com/image/fetch/$s_!wlTv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd95409-5010-4619-b2d3-530e8db354ce_599x108.png 848w, https://substackcdn.com/image/fetch/$s_!wlTv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd95409-5010-4619-b2d3-530e8db354ce_599x108.png 1272w, https://substackcdn.com/image/fetch/$s_!wlTv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cd95409-5010-4619-b2d3-530e8db354ce_599x108.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p><p>These differences mean that GB10 doesn&#8217;t act like a smaller version of Nvidia&#8217;s Blackwell datacenter GPUs. Presenting GB10 as having the same architecture creates confusion, as can be seen in <a href="https://forums.developer.nvidia.com/t/sm121-cutlass-kernel-optimization-results-nvfp4-356-tflops-moe-grouped-gemm-on-dgx-spark/359960">forum posts</a> and<a href="https://github.com/NVIDIA/dgx-spark-playbooks/issues/22"> GitHub issues</a>, where kernels written for Nvidia datacenter GPUs can't run on GB10. Even if code doesn&#8217;t use any features specific to datacenter Blackwell, GB10 presents a different optimization target. Saying GB10 uses the same architecture as GB300 would be like saying Strix Halo&#8217;s RDNA3.5 iGPU uses the same architecture as MI300X. It&#8217;s great for investors, but not useful for a more technical audience where accuracy is important. I feel it would have been better to present GB10&#8217;s GPU as a member of Nvidia&#8217;s consumer RTX lineup. Consumer GPU architectures are perfectly capable of handling a wide variety of compute applications. Nvidia could still position GB10 as a compute-focused GPU.</p><p>Nvidia&#8217;s consumer Blackwell variant has advantages too. It has hardware raytracing acceleration, which can be useful in workloads like Blender with Optix. Consumer SMs clock higher than their datacenter counterparts, which can reduce per-wave latency on code that doesn&#8217;t benefit from any of datacenter Blackwell&#8217;s advantages. Finally, consumer Blackwell SMs take less area, which translates to more vector compute throughput with a given die area budget.</p><h1>Some Light Benchmarking</h1><p>FluidX3D simulates fluid behavior using the lattice Boltzmann method. It carries out computations using FP32, but can store data in 16-bit floating point formats to reduce pressure on memory bandwidth and capacity. FP16S mode stores values in the IEEE-754 standard 16-bit floating point format, and benefits from hardware instructions when converting between FP32 and FP16. FP16C uses a custom 16-bit floating point format to minimize accuracy loss. Conversions between the custom FP16 format and FP32 have to be done in software, increasing the compute to bandwidth ratio.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TSai!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F330d1be1-5c82-4c55-b5e9-4387b416ab2c_1176x637.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TSai!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F330d1be1-5c82-4c55-b5e9-4387b416ab2c_1176x637.png 424w, https://substackcdn.com/image/fetch/$s_!TSai!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F330d1be1-5c82-4c55-b5e9-4387b416ab2c_1176x637.png 848w, https://substackcdn.com/image/fetch/$s_!TSai!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F330d1be1-5c82-4c55-b5e9-4387b416ab2c_1176x637.png 1272w, https://substackcdn.com/image/fetch/$s_!TSai!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F330d1be1-5c82-4c55-b5e9-4387b416ab2c_1176x637.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TSai!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F330d1be1-5c82-4c55-b5e9-4387b416ab2c_1176x637.png" width="1176" height="637" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/330d1be1-5c82-4c55-b5e9-4387b416ab2c_1176x637.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:637,&quot;width&quot;:1176,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TSai!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F330d1be1-5c82-4c55-b5e9-4387b416ab2c_1176x637.png 424w, https://substackcdn.com/image/fetch/$s_!TSai!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F330d1be1-5c82-4c55-b5e9-4387b416ab2c_1176x637.png 848w, https://substackcdn.com/image/fetch/$s_!TSai!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F330d1be1-5c82-4c55-b5e9-4387b416ab2c_1176x637.png 1272w, https://substackcdn.com/image/fetch/$s_!TSai!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F330d1be1-5c82-4c55-b5e9-4387b416ab2c_1176x637.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Not using the default benchmark because it seems to run a grid with zero velocity in all cells. B580 gets ridiculously high scores. <a href="https://clamtech.org/?dest=b580comp">Profiling the workload suggests it&#8217;s using some form of memory compression or access elimination.</a></figcaption></figure></div><p>GB10 leads AMD&#8217;s Strix Halo when running FluidX3D&#8217;s delta wing simulation example with FP32, and continues to have an edge in FP16S. However, it falls behind in FP16C mode. Intel&#8217;s Arc B580 nominally sits in the same performance segment as Strix Halo, but manages a huge performance lead over both iGPUs. VRAM bandwidth is likely a factor. FluidX3D has poor memory access locality and therefore tends to be bound by VRAM bandwidth. Besides reporting simulation speed, FluidX3D outputs memory bandwidth figures that account for accesses made from its algorithm&#8217;s perspective.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qzjr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8991dce2-aa90-4dd6-97d7-ae2dad53feaa_1169x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qzjr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8991dce2-aa90-4dd6-97d7-ae2dad53feaa_1169x630.png 424w, https://substackcdn.com/image/fetch/$s_!qzjr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8991dce2-aa90-4dd6-97d7-ae2dad53feaa_1169x630.png 848w, https://substackcdn.com/image/fetch/$s_!qzjr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8991dce2-aa90-4dd6-97d7-ae2dad53feaa_1169x630.png 1272w, https://substackcdn.com/image/fetch/$s_!qzjr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8991dce2-aa90-4dd6-97d7-ae2dad53feaa_1169x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qzjr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8991dce2-aa90-4dd6-97d7-ae2dad53feaa_1169x630.png" width="1169" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8991dce2-aa90-4dd6-97d7-ae2dad53feaa_1169x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1169,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qzjr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8991dce2-aa90-4dd6-97d7-ae2dad53feaa_1169x630.png 424w, https://substackcdn.com/image/fetch/$s_!qzjr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8991dce2-aa90-4dd6-97d7-ae2dad53feaa_1169x630.png 848w, https://substackcdn.com/image/fetch/$s_!qzjr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8991dce2-aa90-4dd6-97d7-ae2dad53feaa_1169x630.png 1272w, https://substackcdn.com/image/fetch/$s_!qzjr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8991dce2-aa90-4dd6-97d7-ae2dad53feaa_1169x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>These accesses can include cache hits, which are transparent to user code. Profiling FluidX3D on Intel&#8217;s Arc B580 however shows that FluidX3D&#8217;s reported bandwidth aligns well with VRAM bandwidth usage from performance monitoring metrics.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T_k3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11a47bf0-d5a4-4607-a105-2dc5f07bf63f_767x624.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T_k3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11a47bf0-d5a4-4607-a105-2dc5f07bf63f_767x624.png 424w, https://substackcdn.com/image/fetch/$s_!T_k3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11a47bf0-d5a4-4607-a105-2dc5f07bf63f_767x624.png 848w, https://substackcdn.com/image/fetch/$s_!T_k3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11a47bf0-d5a4-4607-a105-2dc5f07bf63f_767x624.png 1272w, https://substackcdn.com/image/fetch/$s_!T_k3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11a47bf0-d5a4-4607-a105-2dc5f07bf63f_767x624.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T_k3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11a47bf0-d5a4-4607-a105-2dc5f07bf63f_767x624.png" width="767" height="624" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/11a47bf0-d5a4-4607-a105-2dc5f07bf63f_767x624.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:624,&quot;width&quot;:767,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T_k3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11a47bf0-d5a4-4607-a105-2dc5f07bf63f_767x624.png 424w, https://substackcdn.com/image/fetch/$s_!T_k3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11a47bf0-d5a4-4607-a105-2dc5f07bf63f_767x624.png 848w, https://substackcdn.com/image/fetch/$s_!T_k3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11a47bf0-d5a4-4607-a105-2dc5f07bf63f_767x624.png 1272w, https://substackcdn.com/image/fetch/$s_!T_k3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11a47bf0-d5a4-4607-a105-2dc5f07bf63f_767x624.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Profiling data from VTune on the Arc B580</figcaption></figure></div><p>GB10 and Strix Halo use 8533 and 8000 MT/s LPDDR5X over a 256-bit bus, giving 273 and 256 GB/s of theoretical bandwidth, respectively. That&#8217;s very high bandwidth for a consumer platform, but can&#8217;t compare to the 456 GB/s that the Arc B580 gets from GDDR6 over a 192-bit bus. Lack of memory bandwidth mean both iGPUs lean harder on caching than their discrete counterparts, and could suffer in workloads with poor access locality.</p><p>VkFFT implements Fast Fourier Transforms in multiple APIs. Here, I&#8217;m running its Vulkan benchmarks. GB10 leads Strix Halo, and turns in the most consistent performance across different test configurations. Strix Halo&#8217;s performance is more varied, and never beats GB10&#8217;s. Intel&#8217;s B580 is all over the place. It ends up with a higher average score than GB10, but takes huge losses in certain test configurations.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NVuT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead9163c-0a65-4e3e-9c92-4c54208df6cf_1385x1479.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NVuT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead9163c-0a65-4e3e-9c92-4c54208df6cf_1385x1479.png 424w, https://substackcdn.com/image/fetch/$s_!NVuT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead9163c-0a65-4e3e-9c92-4c54208df6cf_1385x1479.png 848w, https://substackcdn.com/image/fetch/$s_!NVuT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead9163c-0a65-4e3e-9c92-4c54208df6cf_1385x1479.png 1272w, https://substackcdn.com/image/fetch/$s_!NVuT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead9163c-0a65-4e3e-9c92-4c54208df6cf_1385x1479.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NVuT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead9163c-0a65-4e3e-9c92-4c54208df6cf_1385x1479.png" width="1385" height="1479" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ead9163c-0a65-4e3e-9c92-4c54208df6cf_1385x1479.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1479,&quot;width&quot;:1385,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NVuT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead9163c-0a65-4e3e-9c92-4c54208df6cf_1385x1479.png 424w, https://substackcdn.com/image/fetch/$s_!NVuT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead9163c-0a65-4e3e-9c92-4c54208df6cf_1385x1479.png 848w, https://substackcdn.com/image/fetch/$s_!NVuT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead9163c-0a65-4e3e-9c92-4c54208df6cf_1385x1479.png 1272w, https://substackcdn.com/image/fetch/$s_!NVuT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fead9163c-0a65-4e3e-9c92-4c54208df6cf_1385x1479.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>VkFFT outputs bandwidth figures alongside test results, and again shows what a discrete GPU can do with a fast, dedicated memory pool.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jfxd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf6ab840-1df8-4f68-abeb-30dc92425aa7_1383x1477.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jfxd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf6ab840-1df8-4f68-abeb-30dc92425aa7_1383x1477.png 424w, https://substackcdn.com/image/fetch/$s_!jfxd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf6ab840-1df8-4f68-abeb-30dc92425aa7_1383x1477.png 848w, https://substackcdn.com/image/fetch/$s_!jfxd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf6ab840-1df8-4f68-abeb-30dc92425aa7_1383x1477.png 1272w, https://substackcdn.com/image/fetch/$s_!jfxd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf6ab840-1df8-4f68-abeb-30dc92425aa7_1383x1477.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jfxd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf6ab840-1df8-4f68-abeb-30dc92425aa7_1383x1477.png" width="1383" height="1477" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf6ab840-1df8-4f68-abeb-30dc92425aa7_1383x1477.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1477,&quot;width&quot;:1383,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jfxd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf6ab840-1df8-4f68-abeb-30dc92425aa7_1383x1477.png 424w, https://substackcdn.com/image/fetch/$s_!jfxd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf6ab840-1df8-4f68-abeb-30dc92425aa7_1383x1477.png 848w, https://substackcdn.com/image/fetch/$s_!jfxd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf6ab840-1df8-4f68-abeb-30dc92425aa7_1383x1477.png 1272w, https://substackcdn.com/image/fetch/$s_!jfxd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf6ab840-1df8-4f68-abeb-30dc92425aa7_1383x1477.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>VkFFT can also target double precision (FP64) and half precision (FP16). GB10 is able to take an overall lead with double precision, while Strix Halo still can&#8217;t pull ahead in any test. Intel&#8217;s B580 can use its bandwidth advantage to win in some configurations, but not enough to get an overall score lead. None of the three GPUs here have strong FP64 performance, but VkFFT is often memory bound. The primary challenge is feeding the execution units, not having more of them.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3X07!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c5d78-94e2-4081-b463-fa2e6deded8e_1388x1482.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3X07!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c5d78-94e2-4081-b463-fa2e6deded8e_1388x1482.png 424w, https://substackcdn.com/image/fetch/$s_!3X07!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c5d78-94e2-4081-b463-fa2e6deded8e_1388x1482.png 848w, https://substackcdn.com/image/fetch/$s_!3X07!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c5d78-94e2-4081-b463-fa2e6deded8e_1388x1482.png 1272w, https://substackcdn.com/image/fetch/$s_!3X07!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c5d78-94e2-4081-b463-fa2e6deded8e_1388x1482.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3X07!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c5d78-94e2-4081-b463-fa2e6deded8e_1388x1482.png" width="1388" height="1482" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e74c5d78-94e2-4081-b463-fa2e6deded8e_1388x1482.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1482,&quot;width&quot;:1388,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3X07!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c5d78-94e2-4081-b463-fa2e6deded8e_1388x1482.png 424w, https://substackcdn.com/image/fetch/$s_!3X07!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c5d78-94e2-4081-b463-fa2e6deded8e_1388x1482.png 848w, https://substackcdn.com/image/fetch/$s_!3X07!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c5d78-94e2-4081-b463-fa2e6deded8e_1388x1482.png 1272w, https://substackcdn.com/image/fetch/$s_!3X07!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c5d78-94e2-4081-b463-fa2e6deded8e_1388x1482.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Half precision runs tell a similar story, but with narrower gaps between the three GPUs. GB10, Strix Halo, and B580 average 76793.25, 62498.79, and 70980.17 respectively across all of the half precision test configurations.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!54tc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62ac2e7f-2980-4515-b1f8-809920476e8f_1383x1487.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!54tc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62ac2e7f-2980-4515-b1f8-809920476e8f_1383x1487.png 424w, https://substackcdn.com/image/fetch/$s_!54tc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62ac2e7f-2980-4515-b1f8-809920476e8f_1383x1487.png 848w, https://substackcdn.com/image/fetch/$s_!54tc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62ac2e7f-2980-4515-b1f8-809920476e8f_1383x1487.png 1272w, https://substackcdn.com/image/fetch/$s_!54tc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62ac2e7f-2980-4515-b1f8-809920476e8f_1383x1487.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!54tc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62ac2e7f-2980-4515-b1f8-809920476e8f_1383x1487.png" width="1383" height="1487" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/62ac2e7f-2980-4515-b1f8-809920476e8f_1383x1487.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1487,&quot;width&quot;:1383,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!54tc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62ac2e7f-2980-4515-b1f8-809920476e8f_1383x1487.png 424w, https://substackcdn.com/image/fetch/$s_!54tc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62ac2e7f-2980-4515-b1f8-809920476e8f_1383x1487.png 848w, https://substackcdn.com/image/fetch/$s_!54tc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62ac2e7f-2980-4515-b1f8-809920476e8f_1383x1487.png 1272w, https://substackcdn.com/image/fetch/$s_!54tc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62ac2e7f-2980-4515-b1f8-809920476e8f_1383x1487.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>In this self-penned benchmark, I perform a brute-force gravitational potential calculation over column density values. It&#8217;s a GPU accelerated version of something I did during a high school internship. Datacenter GPUs tend to do very well because I use FP64 values. GB10 and Strix Halo fall very far behind.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y1id!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F106a5a2c-42f2-403b-9585-77cf2145d1eb_1192x653.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y1id!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F106a5a2c-42f2-403b-9585-77cf2145d1eb_1192x653.png 424w, https://substackcdn.com/image/fetch/$s_!y1id!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F106a5a2c-42f2-403b-9585-77cf2145d1eb_1192x653.png 848w, https://substackcdn.com/image/fetch/$s_!y1id!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F106a5a2c-42f2-403b-9585-77cf2145d1eb_1192x653.png 1272w, https://substackcdn.com/image/fetch/$s_!y1id!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F106a5a2c-42f2-403b-9585-77cf2145d1eb_1192x653.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y1id!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F106a5a2c-42f2-403b-9585-77cf2145d1eb_1192x653.png" width="1192" height="653" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/106a5a2c-42f2-403b-9585-77cf2145d1eb_1192x653.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:653,&quot;width&quot;:1192,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y1id!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F106a5a2c-42f2-403b-9585-77cf2145d1eb_1192x653.png 424w, https://substackcdn.com/image/fetch/$s_!y1id!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F106a5a2c-42f2-403b-9585-77cf2145d1eb_1192x653.png 848w, https://substackcdn.com/image/fetch/$s_!y1id!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F106a5a2c-42f2-403b-9585-77cf2145d1eb_1192x653.png 1272w, https://substackcdn.com/image/fetch/$s_!y1id!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F106a5a2c-42f2-403b-9585-77cf2145d1eb_1192x653.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>FAHBench is a benchmark for Folding at Home, a distributed computing project that runs protein folding simulations. The benchmark comes with three workloads, which can be run with either single or double precision. Two workloads use a DHFR (dihydrofolate reductase) system in explicit or implicit mode. Explicit simulates all 23558 atoms, while implicit only uses 2489 atoms to approximate a solution. NAV, or voltage gated sodium channel, is the third workload and uses 173112 atoms.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eXnH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d6e0cae-8d4b-470c-b2bb-3eae275925ff_1495x1373.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eXnH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d6e0cae-8d4b-470c-b2bb-3eae275925ff_1495x1373.png 424w, https://substackcdn.com/image/fetch/$s_!eXnH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d6e0cae-8d4b-470c-b2bb-3eae275925ff_1495x1373.png 848w, https://substackcdn.com/image/fetch/$s_!eXnH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d6e0cae-8d4b-470c-b2bb-3eae275925ff_1495x1373.png 1272w, https://substackcdn.com/image/fetch/$s_!eXnH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d6e0cae-8d4b-470c-b2bb-3eae275925ff_1495x1373.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eXnH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d6e0cae-8d4b-470c-b2bb-3eae275925ff_1495x1373.png" width="1456" height="1337" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0d6e0cae-8d4b-470c-b2bb-3eae275925ff_1495x1373.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1337,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eXnH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d6e0cae-8d4b-470c-b2bb-3eae275925ff_1495x1373.png 424w, https://substackcdn.com/image/fetch/$s_!eXnH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d6e0cae-8d4b-470c-b2bb-3eae275925ff_1495x1373.png 848w, https://substackcdn.com/image/fetch/$s_!eXnH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d6e0cae-8d4b-470c-b2bb-3eae275925ff_1495x1373.png 1272w, https://substackcdn.com/image/fetch/$s_!eXnH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d6e0cae-8d4b-470c-b2bb-3eae275925ff_1495x1373.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>GB10 easily tops FAHBench&#8217;s workloads in single precision mode. FAHBench makes good use of local memory to keep data on-chip, reducing pressure on the global memory hierarchy. Global memory accesses that do happen have reasonable locality, even on the heavier NAV workload. Data Fabric performance monitoring on Strix Halo shows about 200 GB/s of GPU memory requests, and ~50 GB/s of traffic at the Unified Memory Controllers. That implies Strix Halo&#8217;s 32 MB system level cache caught about 150 GB/s of memory traffic. GB10&#8217;s 24 MB L2 is likely almost as effective, which would mean Nvidia also avoids DRAM bandwidth bottlenecks.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BXeA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd823658-98ae-461e-b371-2421fb13cfa5_877x413.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BXeA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd823658-98ae-461e-b371-2421fb13cfa5_877x413.png 424w, https://substackcdn.com/image/fetch/$s_!BXeA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd823658-98ae-461e-b371-2421fb13cfa5_877x413.png 848w, https://substackcdn.com/image/fetch/$s_!BXeA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd823658-98ae-461e-b371-2421fb13cfa5_877x413.png 1272w, https://substackcdn.com/image/fetch/$s_!BXeA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd823658-98ae-461e-b371-2421fb13cfa5_877x413.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BXeA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd823658-98ae-461e-b371-2421fb13cfa5_877x413.png" width="877" height="413" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fd823658-98ae-461e-b371-2421fb13cfa5_877x413.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:413,&quot;width&quot;:877,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BXeA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd823658-98ae-461e-b371-2421fb13cfa5_877x413.png 424w, https://substackcdn.com/image/fetch/$s_!BXeA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd823658-98ae-461e-b371-2421fb13cfa5_877x413.png 848w, https://substackcdn.com/image/fetch/$s_!BXeA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd823658-98ae-461e-b371-2421fb13cfa5_877x413.png 1272w, https://substackcdn.com/image/fetch/$s_!BXeA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd823658-98ae-461e-b371-2421fb13cfa5_877x413.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As a result, FAHBench favors compute throughput and plays well into GB10&#8217;s advantages. In some cases though, Strix Halo can keep up surprisingly well. In the DHFR single precision  test, Strix Halo starts with a score just above 200 ns/day. If it maintained that, it would have nearly tied GB10. However, performance quickly drops as the chip pushes past 90 C. By targeting thin and light devices, Strix Halo has to contend with tighter thermal and power constraints than GB10 does in mini-PCs. Dell&#8217;s cooling in their mini-PC is excellent, and manages to keep GB10 in the low 60 C range during FAHBench runs.</p><p>Double precision workloads on FAHBench are very compute bound on all three GPUs. Intel&#8217;s B580 has the least horrible FP64 to FP32 ratio at 1:16. Strix Halo comes next at 1:32, followed by GB10 at 1:64.</p><h1>Final Words</h1><p>Large iGPUs are fascinating, and it&#8217;s great to see Nvidia entering the scene. GB10&#8217;s GPU has plenty of strengths compared to the one in AMD&#8217;s Strix Halo. It uses Nvidia&#8217;s latest Blackwell architecture, while Strix Halo makes do with RDNA3.5 instead of AMD&#8217;s latest RDNA4. Nvidia also uses TSMC&#8217;s 3nm process, while Strix Halo uses 4nm. GB10&#8217;s GPU is sized to be a bit larger than Strix Halo&#8217;s. Compute benchmarks show GB10&#8217;s strength. It beats Strix Halo more often than not. All else being equal, GB10&#8217;s iGPU could be a dangerous competitor to Strix Halo and a welcome arrival for PC enthusiasts.</p><p>Unfortunately, gaming on GB10 is off to a rough start. iGPU setups are locked into whichever CPU they come with. GB10 uses ARM CPU cores, and quite excellent ones from an ISA-agnostic point of view. However, ISA does matter when software can&#8217;t be easily ported. Nearly all PC games target x86-64, are closed source, and lack 64-bit Arm ports. Emulation and binary translation may let some games run, but at a performance cost. For example, Retrotom on Reddit ran Cyberpunk 2077 on GB10 and<a href="https://www.reddit.com/r/nvidia/comments/1opofuj/cyberpunk_2077_runs_on_dgx_spark/"> achieved about 50 FPS</a> at 1080P with medium settings. Similar settings on Strix Halo in Cyberpunk 2077's built-in benchmark<a href="https://old.chipsandcheese.com/2025/10/31/37437/"> give just under 90 FPS</a>. Strix Halo already suffered a CPU performance penalty because of contention at the memory bus and high memory latency compared to desktop discrete GPU setups. Taking another heavy hit to CPU performance is a tough pill to swallow.</p><p>Nvidia understandably positions GB10 as a compute solution, aimed at developers who want to test their code without going to a datacenter GPU. Developers can recompile their application to target 64-bit Arm, minimizing compatibility issues. GB10&#8217;s strong compute performance, high memory capacity, and iGPU advantages play well into this purpose. In the compute role however, GB10 has a VRAM bandwidth disadvantage compared to discrete peers.</p><p>The factors above mean GB10 is an exciting product, but possibly one with a limited audience. Like Strix Halo, GB10&#8217;s unified memory and more compact form factor have to face off against iGPU compromises and an extremely high price compared to discrete GPU setups with similar compute throughput. I hope to see both Nvidia and AMD iterate on their large iGPU designs to bring price down and minimize their compromises. There&#8217;s a lot of potential for large iGPU designs, and I&#8217;m excited to see that segment develop.</p>]]></content:encoded></item><item><title><![CDATA[Arm's Cortex X925: Reaching Desktop Performance]]></title><description><![CDATA[A big, high performance core from Arm]]></description><link>https://chipsandcheese.com/p/arms-cortex-x925-reaching-desktop</link><guid isPermaLink="false">https://chipsandcheese.com/p/arms-cortex-x925-reaching-desktop</guid><dc:creator><![CDATA[Chester Lam]]></dc:creator><pubDate>Tue, 03 Mar 2026 06:51:25 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/223f5104-d87f-44cc-a80e-398a444bfd06_2010x1260.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Desktop and laptop use cases demand high single threaded performance across a large variety of workloads. Creating CPU cores to meet those demands is no easy task. AMD and Intel traditionally dominated this high performance segment using high clocked, high throughput cores with large out-of-order engines to absorb latency. Arm traditionally optimized for low power and low area, and not necessarily maximum performance. Over the years though, Arm steadily built more complex cores and looked for opportunities to expand into higher performance segments. Matching the best from Intel and AMD must have been a distant dream in 2012, when Arm launched their first 64-bit core, the Cortex A57. Today, that dream is a reality.</p><p>Cortex X925 in Nvidia&#8217;s GB10 achieves performance parity with AMD&#8217;s Zen 5 and Intel&#8217;s Lion Cove in their fastest desktop implementations. That gives Arm a core fast enough to not just play in laptop segments, but potentially in the most performance sensitive desktop applications too. Nvidia&#8217;s GB10 uses ten X925 cores, split across two clusters. One of those X925 cores reaches 4 GHz, while the others are not far behind at 3.9 GHz. Dell uses the GB10 chip in their Pro Max series, and we&#8217;re grateful to Dell for letting us test that product.</p><h1>Overview</h1><p>Arm&#8217;s Cortex X925 is a massive 10-wide core with a lot of everything. It has more reordering capacity than AMD&#8217;s Zen 5, and L2 capacity comparable to that of Intel&#8217;s recent P-Cores. Unlike Arm&#8217;s 7-series cores, X925 makes few concessions to reduce power and area. It&#8217;s a core designed through and through to maximize performance.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!antv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7650892-030e-45d9-b45f-8dff46371b19_1222x614.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!antv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7650892-030e-45d9-b45f-8dff46371b19_1222x614.png 424w, https://substackcdn.com/image/fetch/$s_!antv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7650892-030e-45d9-b45f-8dff46371b19_1222x614.png 848w, https://substackcdn.com/image/fetch/$s_!antv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7650892-030e-45d9-b45f-8dff46371b19_1222x614.png 1272w, https://substackcdn.com/image/fetch/$s_!antv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7650892-030e-45d9-b45f-8dff46371b19_1222x614.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!antv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7650892-030e-45d9-b45f-8dff46371b19_1222x614.png" width="1222" height="614" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a7650892-030e-45d9-b45f-8dff46371b19_1222x614.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:614,&quot;width&quot;:1222,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!antv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7650892-030e-45d9-b45f-8dff46371b19_1222x614.png 424w, https://substackcdn.com/image/fetch/$s_!antv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7650892-030e-45d9-b45f-8dff46371b19_1222x614.png 848w, https://substackcdn.com/image/fetch/$s_!antv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7650892-030e-45d9-b45f-8dff46371b19_1222x614.png 1272w, https://substackcdn.com/image/fetch/$s_!antv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7650892-030e-45d9-b45f-8dff46371b19_1222x614.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Rough block diagram of the Cortex X925&#8217;s microarchitecture</figcaption></figure></div><p>In Arm tradition, X925 has a number of configuration options. However, X925 omits the shoestring budget options present for A725. X925&#8217;s caches are all either parity or ECC protected, dropping A725&#8217;s option to do without error detection or correction. L1 caches on X925 are fixed at 64 KB, removing the 32 KB options on A725. X925&#8217;s most significant configuration options happen at L2, where implementers can pick between 2 MB or 3 MB of capacity. They can also choose either a 128-bit or 256-bit ECC granule to make area and reliability tradeoffs.</p><p>X925 interfaces with the rest of the system via Arm&#8217;s DSU-120, which acts as a cluster-level interconnect and hosts a L3 cache with up to 32 MB of capacity. X925 and its DSU support 40-bit physical addresses, which is adequate for consumer systems. However, it&#8217;s clearly not designed for server applications, where larger 48-bit or even 52-bit physical address spaces are common.</p><h1>Branch Prediction</h1><p>Performance and power efficiency starts with good branch prediction. Arm knows this, and X925 doesn&#8217;t disappoint. Its branch predictor can recognize extremely long repeating patterns. In a test with branches that are taken or not-taken in random patterns of increasing lengths, X925 behaves a lot like AMD&#8217;s Zen 5. AMD&#8217;s cores have featured very strong branch predictors since Zen 2, so X925&#8217;s results are impressive.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hXuY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6c7de3e-0d19-490d-849a-5f107b6a7cab_1600x822.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hXuY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6c7de3e-0d19-490d-849a-5f107b6a7cab_1600x822.png 424w, https://substackcdn.com/image/fetch/$s_!hXuY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6c7de3e-0d19-490d-849a-5f107b6a7cab_1600x822.png 848w, https://substackcdn.com/image/fetch/$s_!hXuY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6c7de3e-0d19-490d-849a-5f107b6a7cab_1600x822.png 1272w, https://substackcdn.com/image/fetch/$s_!hXuY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6c7de3e-0d19-490d-849a-5f107b6a7cab_1600x822.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hXuY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6c7de3e-0d19-490d-849a-5f107b6a7cab_1600x822.png" width="1456" height="748" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d6c7de3e-0d19-490d-849a-5f107b6a7cab_1600x822.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:748,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hXuY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6c7de3e-0d19-490d-849a-5f107b6a7cab_1600x822.png 424w, https://substackcdn.com/image/fetch/$s_!hXuY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6c7de3e-0d19-490d-849a-5f107b6a7cab_1600x822.png 848w, https://substackcdn.com/image/fetch/$s_!hXuY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6c7de3e-0d19-490d-849a-5f107b6a7cab_1600x822.png 1272w, https://substackcdn.com/image/fetch/$s_!hXuY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6c7de3e-0d19-490d-849a-5f107b6a7cab_1600x822.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZmAW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3c6182e-c5f7-402f-9717-3fa0917247e0_1600x825.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZmAW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3c6182e-c5f7-402f-9717-3fa0917247e0_1600x825.png 424w, https://substackcdn.com/image/fetch/$s_!ZmAW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3c6182e-c5f7-402f-9717-3fa0917247e0_1600x825.png 848w, https://substackcdn.com/image/fetch/$s_!ZmAW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3c6182e-c5f7-402f-9717-3fa0917247e0_1600x825.png 1272w, https://substackcdn.com/image/fetch/$s_!ZmAW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3c6182e-c5f7-402f-9717-3fa0917247e0_1600x825.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZmAW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3c6182e-c5f7-402f-9717-3fa0917247e0_1600x825.png" width="1456" height="751" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c3c6182e-c5f7-402f-9717-3fa0917247e0_1600x825.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:751,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZmAW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3c6182e-c5f7-402f-9717-3fa0917247e0_1600x825.png 424w, https://substackcdn.com/image/fetch/$s_!ZmAW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3c6182e-c5f7-402f-9717-3fa0917247e0_1600x825.png 848w, https://substackcdn.com/image/fetch/$s_!ZmAW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3c6182e-c5f7-402f-9717-3fa0917247e0_1600x825.png 1272w, https://substackcdn.com/image/fetch/$s_!ZmAW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3c6182e-c5f7-402f-9717-3fa0917247e0_1600x825.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Cortex X925&#8217;s branch target caching compares well too. Arm has a large first level BTB capable of handling two taken branches per cycle. Capacity for this first level BTB varies with branch spacing, but it seems capable of tracking up to 2048 branches. This large capacity brings X925&#8217;s branch target caching strategy closer to Zen 5&#8217;s, rather than prior Arm cores that used small micro-BTBs with 32 to 64 entries. For larger branch footprints, X925 has slower BTB levels that can track up to 16384 branches and deliver targets with 2-3 cycle latency. There may be a mid-level BTB with 4096 to 8192 entries, though it&#8217;s hard to tell.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!p9xQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0270e9a-0dac-43b9-a6af-289c93029c55_1600x806.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!p9xQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0270e9a-0dac-43b9-a6af-289c93029c55_1600x806.png 424w, https://substackcdn.com/image/fetch/$s_!p9xQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0270e9a-0dac-43b9-a6af-289c93029c55_1600x806.png 848w, https://substackcdn.com/image/fetch/$s_!p9xQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0270e9a-0dac-43b9-a6af-289c93029c55_1600x806.png 1272w, https://substackcdn.com/image/fetch/$s_!p9xQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0270e9a-0dac-43b9-a6af-289c93029c55_1600x806.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!p9xQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0270e9a-0dac-43b9-a6af-289c93029c55_1600x806.png" width="1456" height="733" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c0270e9a-0dac-43b9-a6af-289c93029c55_1600x806.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:733,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!p9xQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0270e9a-0dac-43b9-a6af-289c93029c55_1600x806.png 424w, https://substackcdn.com/image/fetch/$s_!p9xQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0270e9a-0dac-43b9-a6af-289c93029c55_1600x806.png 848w, https://substackcdn.com/image/fetch/$s_!p9xQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0270e9a-0dac-43b9-a6af-289c93029c55_1600x806.png 1272w, https://substackcdn.com/image/fetch/$s_!p9xQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0270e9a-0dac-43b9-a6af-289c93029c55_1600x806.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Compared to AMD&#8217;s Zen 5, X925 has roughly comparable capacity in its fastest BTB level depending on branch spacing. Zen 5 has more maximum branch target caching capacity, especially when it can use a single BTB entry to track two branches. Still, X925 has more branch target storage than Arm cores from a few years ago. Cortex X2 for example topped out at about 10K branch targets.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HSzf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0871847-cebe-4a8c-b87a-e77d2a6c347f_1600x848.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HSzf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0871847-cebe-4a8c-b87a-e77d2a6c347f_1600x848.png 424w, https://substackcdn.com/image/fetch/$s_!HSzf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0871847-cebe-4a8c-b87a-e77d2a6c347f_1600x848.png 848w, https://substackcdn.com/image/fetch/$s_!HSzf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0871847-cebe-4a8c-b87a-e77d2a6c347f_1600x848.png 1272w, https://substackcdn.com/image/fetch/$s_!HSzf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0871847-cebe-4a8c-b87a-e77d2a6c347f_1600x848.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HSzf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0871847-cebe-4a8c-b87a-e77d2a6c347f_1600x848.png" width="1456" height="772" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b0871847-cebe-4a8c-b87a-e77d2a6c347f_1600x848.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:772,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HSzf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0871847-cebe-4a8c-b87a-e77d2a6c347f_1600x848.png 424w, https://substackcdn.com/image/fetch/$s_!HSzf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0871847-cebe-4a8c-b87a-e77d2a6c347f_1600x848.png 848w, https://substackcdn.com/image/fetch/$s_!HSzf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0871847-cebe-4a8c-b87a-e77d2a6c347f_1600x848.png 1272w, https://substackcdn.com/image/fetch/$s_!HSzf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0871847-cebe-4a8c-b87a-e77d2a6c347f_1600x848.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A 29 entry return stack helps predict returns from function calls, or branch-with-link in Arm instruction terms. Like Intel&#8217;s Sunny Cove and later cores, the return stack doesn&#8217;t work if return sites aren&#8217;t spaced far enough apart. I spaced the test &#8220;function&#8221; by 128 bytes to get clear results.</p><p>In SPEC CPU2017, Cortex X925 achieves branch prediction accuracy roughly on par with AMD&#8217;s Zen 5 across most tests, and may even be slightly ahead. 505.mcf and 541.leela consistently challenge branch predictors, and X925 pulls ahead in both. Intel&#8217;s Lion Cove is a bit behind both Zen 5 and X925.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QAil!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbad4a9fc-3974-4d8c-bc78-8f16f446a0aa_1011x828.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QAil!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbad4a9fc-3974-4d8c-bc78-8f16f446a0aa_1011x828.png 424w, https://substackcdn.com/image/fetch/$s_!QAil!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbad4a9fc-3974-4d8c-bc78-8f16f446a0aa_1011x828.png 848w, https://substackcdn.com/image/fetch/$s_!QAil!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbad4a9fc-3974-4d8c-bc78-8f16f446a0aa_1011x828.png 1272w, https://substackcdn.com/image/fetch/$s_!QAil!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbad4a9fc-3974-4d8c-bc78-8f16f446a0aa_1011x828.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QAil!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbad4a9fc-3974-4d8c-bc78-8f16f446a0aa_1011x828.png" width="1011" height="828" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bad4a9fc-3974-4d8c-bc78-8f16f446a0aa_1011x828.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:828,&quot;width&quot;:1011,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QAil!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbad4a9fc-3974-4d8c-bc78-8f16f446a0aa_1011x828.png 424w, https://substackcdn.com/image/fetch/$s_!QAil!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbad4a9fc-3974-4d8c-bc78-8f16f446a0aa_1011x828.png 848w, https://substackcdn.com/image/fetch/$s_!QAil!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbad4a9fc-3974-4d8c-bc78-8f16f446a0aa_1011x828.png 1272w, https://substackcdn.com/image/fetch/$s_!QAil!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbad4a9fc-3974-4d8c-bc78-8f16f446a0aa_1011x828.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QMjv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ece51e-a300-4e59-b86c-88c31b12a05c_1068x885.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QMjv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ece51e-a300-4e59-b86c-88c31b12a05c_1068x885.png 424w, https://substackcdn.com/image/fetch/$s_!QMjv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ece51e-a300-4e59-b86c-88c31b12a05c_1068x885.png 848w, https://substackcdn.com/image/fetch/$s_!QMjv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ece51e-a300-4e59-b86c-88c31b12a05c_1068x885.png 1272w, https://substackcdn.com/image/fetch/$s_!QMjv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ece51e-a300-4e59-b86c-88c31b12a05c_1068x885.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QMjv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ece51e-a300-4e59-b86c-88c31b12a05c_1068x885.png" width="1068" height="885" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a7ece51e-a300-4e59-b86c-88c31b12a05c_1068x885.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:885,&quot;width&quot;:1068,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QMjv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ece51e-a300-4e59-b86c-88c31b12a05c_1068x885.png 424w, https://substackcdn.com/image/fetch/$s_!QMjv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ece51e-a300-4e59-b86c-88c31b12a05c_1068x885.png 848w, https://substackcdn.com/image/fetch/$s_!QMjv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ece51e-a300-4e59-b86c-88c31b12a05c_1068x885.png 1272w, https://substackcdn.com/image/fetch/$s_!QMjv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ece51e-a300-4e59-b86c-88c31b12a05c_1068x885.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>SPEC&#8217;s floating point workloads are gentler on the branch predictor, but X925 still shows its strength. It&#8217;s again on par or slightly better than Zen 5.</p><h1>Instruction Fetch and Decode</h1><p>Cortex X925 ditches the MOP cache from prior Arm generations, much like mid-core companion (A725). While Cortex X925 doesn&#8217;t have the same tight power and area restrictions as A725, many of the same justifications apply. Arm already tackles decode costs through a variety of measures like predecode and running at lower clock speeds. A MOP cache would be excessive.</p><p>On the predecode side, X925&#8217;s <a href="https://developer.arm.com/documentation/102807/0002/Direct-access-to-internal-memory/L1-cache-encodings/L1-instruction-data-RAM-returned-data">TRM </a>suggests the L1I stores data at 76-bit granularity. Arm instructions are 32-bits, so 76 bits would store two instructions and 12 bits of overhead. Unlike A725, Arm doesn&#8217;t indicate that any subset of bits correspond to an aarch64 opcode. They may have neglected to document it, or X925&#8217;s L1I may store instructions in an intermediate format that doesn&#8217;t preserve the original opcodes.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-Msl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9edaf1d0-0b76-45d3-90f9-2ebaf5e5dd88_1600x719.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-Msl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9edaf1d0-0b76-45d3-90f9-2ebaf5e5dd88_1600x719.png 424w, https://substackcdn.com/image/fetch/$s_!-Msl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9edaf1d0-0b76-45d3-90f9-2ebaf5e5dd88_1600x719.png 848w, https://substackcdn.com/image/fetch/$s_!-Msl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9edaf1d0-0b76-45d3-90f9-2ebaf5e5dd88_1600x719.png 1272w, https://substackcdn.com/image/fetch/$s_!-Msl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9edaf1d0-0b76-45d3-90f9-2ebaf5e5dd88_1600x719.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-Msl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9edaf1d0-0b76-45d3-90f9-2ebaf5e5dd88_1600x719.png" width="1456" height="654" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9edaf1d0-0b76-45d3-90f9-2ebaf5e5dd88_1600x719.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:654,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-Msl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9edaf1d0-0b76-45d3-90f9-2ebaf5e5dd88_1600x719.png 424w, https://substackcdn.com/image/fetch/$s_!-Msl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9edaf1d0-0b76-45d3-90f9-2ebaf5e5dd88_1600x719.png 848w, https://substackcdn.com/image/fetch/$s_!-Msl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9edaf1d0-0b76-45d3-90f9-2ebaf5e5dd88_1600x719.png 1272w, https://substackcdn.com/image/fetch/$s_!-Msl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9edaf1d0-0b76-45d3-90f9-2ebaf5e5dd88_1600x719.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Zen 5 can achieve 8 instructions per cycle with 8B NOPs. There seems to be a strange limitation with 4B NOPs, but I&#8217;m using the same NOP size for all cores for consistency</figcaption></figure></div><p>X925&#8217;s frontend can sustain 10 instructions per cycle, but strangely has lower throughput when using 4 KB pages. Using 2 MB pages lets it achieve 10 instructions per cycle as long as the test fits within the 64 KB instruction cache. Cortex X925 can fuse NOP pairs into a single MOP, but that fusion doesn&#8217;t bring throughput above 10 instructions per cycle. Details aside, X925 has high per-cycle frontend throughput compared to its x86-64 peer, but slightly lower actual throughput when considering Zen 5 and Lion Cove&#8217;s much higher clock speed. With larger code footprints, Cortex X925 continues to perform well until test sizes exceed L2 capacity. Compared to X925, AMD&#8217;s Zen 5 relies on its op cache to deliver high throughput for a single thread.</p><h1>Rename and Allocate</h1><p>MOPs from the frontend go through register renaming and have various bookkeeping resources allocated for them, letting the backend carry out out-of-order execution while ensuring results are consistent with in-order execution. While allocating resources, the core can carry out various optimizations to expose additional parallelism. X925 can do move elimination like prior Arm cores, and has special handling for moving an immediate value of zero into a register. Like on A725, the move elimination mechanism tends to fail if there are enough register-to-register MOVs close by. Neither optimization can be carried out at full renamer width, though that&#8217;s typical as cores get very wide.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZDua!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87732d04-46db-48bb-a7ee-175c0201ec6d_705x106.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZDua!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87732d04-46db-48bb-a7ee-175c0201ec6d_705x106.png 424w, https://substackcdn.com/image/fetch/$s_!ZDua!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87732d04-46db-48bb-a7ee-175c0201ec6d_705x106.png 848w, https://substackcdn.com/image/fetch/$s_!ZDua!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87732d04-46db-48bb-a7ee-175c0201ec6d_705x106.png 1272w, https://substackcdn.com/image/fetch/$s_!ZDua!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87732d04-46db-48bb-a7ee-175c0201ec6d_705x106.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZDua!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87732d04-46db-48bb-a7ee-175c0201ec6d_705x106.png" width="705" height="106" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/87732d04-46db-48bb-a7ee-175c0201ec6d_705x106.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:106,&quot;width&quot;:705,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZDua!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87732d04-46db-48bb-a7ee-175c0201ec6d_705x106.png 424w, https://substackcdn.com/image/fetch/$s_!ZDua!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87732d04-46db-48bb-a7ee-175c0201ec6d_705x106.png 848w, https://substackcdn.com/image/fetch/$s_!ZDua!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87732d04-46db-48bb-a7ee-175c0201ec6d_705x106.png 1272w, https://substackcdn.com/image/fetch/$s_!ZDua!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87732d04-46db-48bb-a7ee-175c0201ec6d_705x106.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>Unlike A725, X925 does not have special handling for PTRUE, which sets a SVE predicate register to enable all lanes. A725 could eliminate PTRUE like a zeroing idiom, and process it without allocating a physical register. While this is a minor detail, it does show divergence between Arm&#8217;s mid-core and big-core lines.</p><h1>Out-of-Order Execution</h1><p>A CPU&#8217;s out-of-order backend executes operations as their inputs become ready, letting the core keep its execution units fed while waiting for long latency instructions to complete. Different sources give conflicting information about Cortex X925&#8217;s reordering window. <a href="https://www.androidauthority.com/arm-cortex-x925-g925-explained-3445480/">Android Authority</a> claims 750 MOPs. <a href="https://fuse.wikichip.org/news/7761/arm-launches-next-gen-flagship-cortex-x925/">Wikichip </a>believes it&#8217;s 768 instructions, based off an Arm slide that states Cortex X925 doubled reordering capacity over Cortex X4. Testing shows X925 can keep 948 NOPs in flight, which doesn&#8217;t correspond with either figure unless NOP fusion only worked some of the time.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!J9vt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ab69b2d-1480-4ae2-92e8-befb3dadf2b2_1600x730.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!J9vt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ab69b2d-1480-4ae2-92e8-befb3dadf2b2_1600x730.png 424w, https://substackcdn.com/image/fetch/$s_!J9vt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ab69b2d-1480-4ae2-92e8-befb3dadf2b2_1600x730.png 848w, https://substackcdn.com/image/fetch/$s_!J9vt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ab69b2d-1480-4ae2-92e8-befb3dadf2b2_1600x730.png 1272w, https://substackcdn.com/image/fetch/$s_!J9vt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ab69b2d-1480-4ae2-92e8-befb3dadf2b2_1600x730.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!J9vt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ab69b2d-1480-4ae2-92e8-befb3dadf2b2_1600x730.png" width="1456" height="664" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7ab69b2d-1480-4ae2-92e8-befb3dadf2b2_1600x730.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:664,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!J9vt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ab69b2d-1480-4ae2-92e8-befb3dadf2b2_1600x730.png 424w, https://substackcdn.com/image/fetch/$s_!J9vt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ab69b2d-1480-4ae2-92e8-befb3dadf2b2_1600x730.png 848w, https://substackcdn.com/image/fetch/$s_!J9vt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ab69b2d-1480-4ae2-92e8-befb3dadf2b2_1600x730.png 1272w, https://substackcdn.com/image/fetch/$s_!J9vt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ab69b2d-1480-4ae2-92e8-befb3dadf2b2_1600x730.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>Because results with NOPs were inconclusive, I tried testing with combinations of various instructions designed to dodge other resource limits. Mixing instructions that write to the integer and floating point registers showed X925 could have a maximum of 448 renamed registers allocated across its register files. Recognized zeroing idioms like MOV r,0 do not allocate an integer register, but also run up against the 448 instruction limit. I tried mixing in predicate register writes, but those also share the 448 instruction limit. Adding in stores showed the core could have slightly more than 525 instructions in flight. Adding in not-taken branches did not increase reordering capacity further. Putting an exact number on X925&#8217;s reorder buffer capacity is therefore difficult, but it&#8217;s safe to say there&#8217;s a practical limitation of around 525 instructions in flight. That puts it in the same neighborhood as Intel&#8217;s Lion Cove (576) and ahead of AMD&#8217;s Zen 5 (448).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OvIP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8053fadb-5325-4bfe-ab4d-cb16783c9509_889x318.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OvIP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8053fadb-5325-4bfe-ab4d-cb16783c9509_889x318.png 424w, https://substackcdn.com/image/fetch/$s_!OvIP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8053fadb-5325-4bfe-ab4d-cb16783c9509_889x318.png 848w, https://substackcdn.com/image/fetch/$s_!OvIP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8053fadb-5325-4bfe-ab4d-cb16783c9509_889x318.png 1272w, https://substackcdn.com/image/fetch/$s_!OvIP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8053fadb-5325-4bfe-ab4d-cb16783c9509_889x318.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OvIP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8053fadb-5325-4bfe-ab4d-cb16783c9509_889x318.png" width="889" height="318" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8053fadb-5325-4bfe-ab4d-cb16783c9509_889x318.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:318,&quot;width&quot;:889,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OvIP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8053fadb-5325-4bfe-ab4d-cb16783c9509_889x318.png 424w, https://substackcdn.com/image/fetch/$s_!OvIP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8053fadb-5325-4bfe-ab4d-cb16783c9509_889x318.png 848w, https://substackcdn.com/image/fetch/$s_!OvIP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8053fadb-5325-4bfe-ab4d-cb16783c9509_889x318.png 1272w, https://substackcdn.com/image/fetch/$s_!OvIP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8053fadb-5325-4bfe-ab4d-cb16783c9509_889x318.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>X925&#8217;s register files, memory ordering queues, and other resources have comparable capacity to those in Zen 5 and Lion Cove. The only weakness is 128-bit vector execution, with correspondingly wide register file entries. AMD and Intel&#8217;s big cores have wider vector registers, and more of them available for renaming.</p><h2>Execution Units and Schedulers</h2><p>Arm laid out Cortex X925&#8217;s integer side to deliver high throughput while controlling port count for both the integer register file and scheduling queues. Eight ALU ports and three branch units are distributed across four schedulers in a layout that maximizes symmetry for common ALU operations. All four schedulers have two ALU ports and 28 entries. Similarly, each scheduler has one multiply-capable ALU pipe. Branches and special integer operations see a split, with the first three schedulers getting a branch pipe and the fourth scheduler getting support for pointer authentication and SVE predicate operations.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iAW_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e374392-0d85-484f-ad4e-8ac55ab7541a_1003x580.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iAW_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e374392-0d85-484f-ad4e-8ac55ab7541a_1003x580.png 424w, https://substackcdn.com/image/fetch/$s_!iAW_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e374392-0d85-484f-ad4e-8ac55ab7541a_1003x580.png 848w, https://substackcdn.com/image/fetch/$s_!iAW_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e374392-0d85-484f-ad4e-8ac55ab7541a_1003x580.png 1272w, https://substackcdn.com/image/fetch/$s_!iAW_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e374392-0d85-484f-ad4e-8ac55ab7541a_1003x580.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iAW_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e374392-0d85-484f-ad4e-8ac55ab7541a_1003x580.png" width="1003" height="580" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e374392-0d85-484f-ad4e-8ac55ab7541a_1003x580.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:580,&quot;width&quot;:1003,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iAW_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e374392-0d85-484f-ad4e-8ac55ab7541a_1003x580.png 424w, https://substackcdn.com/image/fetch/$s_!iAW_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e374392-0d85-484f-ad4e-8ac55ab7541a_1003x580.png 848w, https://substackcdn.com/image/fetch/$s_!iAW_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e374392-0d85-484f-ad4e-8ac55ab7541a_1003x580.png 1272w, https://substackcdn.com/image/fetch/$s_!iAW_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e374392-0d85-484f-ad4e-8ac55ab7541a_1003x580.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>The aarch64 instruction set has a madd instruction that performs integer multiply-adds. Cortex A725 and older Arm cores had dedicated integer multi-cycle pipes that could handle madd along with other complex integer instructions. Cortex X925 instead breaks madd into two micro-ops, and handles it with any of its four multiply-capable integer pipes. Likely, Arm wanted to increase throughput for that instruction without the cost of implementing three register file read ports for each multiply-capable pipe. Curiously, Arm&#8217;s optimization guide refers to the fourth scheduler&#8217;s pipes as &#8220;single/multi-cycle&#8221; pipes. &#8220;Multi-cycle&#8221; is now a misnomer though, because the core&#8217;s &#8220;single-cycle&#8221; integer pipes can handle multiplies, which have two cycle latency. On Cortex X925, &#8220;multi-cycle&#8221; pipes distinguish themselves by handling special operations and being able to access FP/vector related registers.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!adNz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8f281f0-d9a3-48e8-8dbb-02b5a7f20a0d_735x156.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!adNz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8f281f0-d9a3-48e8-8dbb-02b5a7f20a0d_735x156.png 424w, https://substackcdn.com/image/fetch/$s_!adNz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8f281f0-d9a3-48e8-8dbb-02b5a7f20a0d_735x156.png 848w, https://substackcdn.com/image/fetch/$s_!adNz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8f281f0-d9a3-48e8-8dbb-02b5a7f20a0d_735x156.png 1272w, https://substackcdn.com/image/fetch/$s_!adNz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8f281f0-d9a3-48e8-8dbb-02b5a7f20a0d_735x156.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!adNz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8f281f0-d9a3-48e8-8dbb-02b5a7f20a0d_735x156.png" width="735" height="156" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d8f281f0-d9a3-48e8-8dbb-02b5a7f20a0d_735x156.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:156,&quot;width&quot;:735,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!adNz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8f281f0-d9a3-48e8-8dbb-02b5a7f20a0d_735x156.png 424w, https://substackcdn.com/image/fetch/$s_!adNz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8f281f0-d9a3-48e8-8dbb-02b5a7f20a0d_735x156.png 848w, https://substackcdn.com/image/fetch/$s_!adNz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8f281f0-d9a3-48e8-8dbb-02b5a7f20a0d_735x156.png 1272w, https://substackcdn.com/image/fetch/$s_!adNz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8f281f0-d9a3-48e8-8dbb-02b5a7f20a0d_735x156.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">The two multi-cycle pipes on the right can take three inputs each from predicate registers (two inputs, and a predicate as a mask), so treat the purple input arrow as three. Drawing more arrows is hard.</figcaption></figure></div><p>Thanks to symmetry across the integer schedulers, X925&#8217;s renamer likely uses a simple round-robin allocation scheme for operations that can go to multiple schedulers. If I test scheduler capacity by interleaving dependent and independent integer adds, X925 can only keep half as many dependent adds in flight. Following dependent adds by independent ones only slightly reduces measured scheduling capacity. That suggests the renamer assigns a scheduler for each pending operation, and stalls if the targeted scheduling queue is full without scanning other eligible schedulers for free entries.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pmVK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe035a9a5-4af4-416c-a5fc-ad6b7f5ba537_1385x671.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pmVK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe035a9a5-4af4-416c-a5fc-ad6b7f5ba537_1385x671.png 424w, https://substackcdn.com/image/fetch/$s_!pmVK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe035a9a5-4af4-416c-a5fc-ad6b7f5ba537_1385x671.png 848w, https://substackcdn.com/image/fetch/$s_!pmVK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe035a9a5-4af4-416c-a5fc-ad6b7f5ba537_1385x671.png 1272w, https://substackcdn.com/image/fetch/$s_!pmVK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe035a9a5-4af4-416c-a5fc-ad6b7f5ba537_1385x671.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pmVK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe035a9a5-4af4-416c-a5fc-ad6b7f5ba537_1385x671.png" width="1385" height="671" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e035a9a5-4af4-416c-a5fc-ad6b7f5ba537_1385x671.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:671,&quot;width&quot;:1385,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pmVK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe035a9a5-4af4-416c-a5fc-ad6b7f5ba537_1385x671.png 424w, https://substackcdn.com/image/fetch/$s_!pmVK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe035a9a5-4af4-416c-a5fc-ad6b7f5ba537_1385x671.png 848w, https://substackcdn.com/image/fetch/$s_!pmVK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe035a9a5-4af4-416c-a5fc-ad6b7f5ba537_1385x671.png 1272w, https://substackcdn.com/image/fetch/$s_!pmVK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe035a9a5-4af4-416c-a5fc-ad6b7f5ba537_1385x671.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>Cortex X925&#8217;s FPU has six pipes, all of which can handle vector floating point adds, multiplies, and multiply-adds. All six pipes also support vector integer adds and multiplies. Less common instructions like addv are still serviced by four pipes. X925&#8217;s FP schedulers are impressively large with approximately 53 entries each. For perspective, each of X925&#8217;s three FP schedulers has nearly as much capacity as Bulldozer&#8217;s 60 entry unified FP scheduler. Bulldozer used that scheduler to service two threads, while X925 uses its three FP schedulers to service a single thread.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y3Tn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5709bef-d514-4966-bfd0-2b0955d6dcd3_886x407.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y3Tn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5709bef-d514-4966-bfd0-2b0955d6dcd3_886x407.png 424w, https://substackcdn.com/image/fetch/$s_!y3Tn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5709bef-d514-4966-bfd0-2b0955d6dcd3_886x407.png 848w, https://substackcdn.com/image/fetch/$s_!y3Tn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5709bef-d514-4966-bfd0-2b0955d6dcd3_886x407.png 1272w, https://substackcdn.com/image/fetch/$s_!y3Tn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5709bef-d514-4966-bfd0-2b0955d6dcd3_886x407.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y3Tn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5709bef-d514-4966-bfd0-2b0955d6dcd3_886x407.png" width="886" height="407" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c5709bef-d514-4966-bfd0-2b0955d6dcd3_886x407.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:407,&quot;width&quot;:886,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y3Tn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5709bef-d514-4966-bfd0-2b0955d6dcd3_886x407.png 424w, https://substackcdn.com/image/fetch/$s_!y3Tn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5709bef-d514-4966-bfd0-2b0955d6dcd3_886x407.png 848w, https://substackcdn.com/image/fetch/$s_!y3Tn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5709bef-d514-4966-bfd0-2b0955d6dcd3_886x407.png 1272w, https://substackcdn.com/image/fetch/$s_!y3Tn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5709bef-d514-4966-bfd0-2b0955d6dcd3_886x407.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>High scheduler capacity and high pipe count should give X925 good performance in vectorized applications despite its 128-bit vector width.</p><h2>Load/Store</h2><p>Memory accesses are among the most complicated and performance critical operations on a modern CPU. For each memory access, the load/store unit has to translate program-visible virtual addresses into physical addresses. It also has to determine whether loads should get data from an older store, or from the cache hierarchy. Cortex X925 has four address generation units that calculate virtual addresses. Two of those can handle stores.</p><p>Address translations are cached in a standard two-level TLB setup. The L1 DTLB has 96 entries and is fully associative. A 2048 entry 8-way L2 TLB handles larger data footprints, and adds 6 cycles of latency. Zen 5 for comparison has the same L1 DTLB capacity and associativity, but a larger 4096 entry L2 DTLB that adds 7 cycles of latency. Another difference is that Zen 5 has a separate L2 ITLB for instruction-side translations, while Cortex X925 uses a unified L2 TLB for both instructions and data. AMD&#8217;s approach could further increase TLB reach, because data and instructions often reside on different pages.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_3yK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e7aa016-1128-4fbb-acc9-849e15d4ca87_1600x714.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_3yK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e7aa016-1128-4fbb-acc9-849e15d4ca87_1600x714.png 424w, https://substackcdn.com/image/fetch/$s_!_3yK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e7aa016-1128-4fbb-acc9-849e15d4ca87_1600x714.png 848w, https://substackcdn.com/image/fetch/$s_!_3yK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e7aa016-1128-4fbb-acc9-849e15d4ca87_1600x714.png 1272w, https://substackcdn.com/image/fetch/$s_!_3yK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e7aa016-1128-4fbb-acc9-849e15d4ca87_1600x714.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_3yK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e7aa016-1128-4fbb-acc9-849e15d4ca87_1600x714.png" width="1456" height="650" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4e7aa016-1128-4fbb-acc9-849e15d4ca87_1600x714.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:650,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_3yK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e7aa016-1128-4fbb-acc9-849e15d4ca87_1600x714.png 424w, https://substackcdn.com/image/fetch/$s_!_3yK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e7aa016-1128-4fbb-acc9-849e15d4ca87_1600x714.png 848w, https://substackcdn.com/image/fetch/$s_!_3yK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e7aa016-1128-4fbb-acc9-849e15d4ca87_1600x714.png 1272w, https://substackcdn.com/image/fetch/$s_!_3yK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e7aa016-1128-4fbb-acc9-849e15d4ca87_1600x714.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Store forwarding latency for integer accesses, using the methodology described at <a href="https://blog.stuffedcow.net/2014/01/x86-memory-disambiguation/">https://blog.stuffedcow.net/2014/01/x86-memory-disambiguation/</a> (ported over to aarch64)</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4OFf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40ac015f-3e07-4503-a324-7c9e3b11f1dc_1600x711.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4OFf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40ac015f-3e07-4503-a324-7c9e3b11f1dc_1600x711.png 424w, https://substackcdn.com/image/fetch/$s_!4OFf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40ac015f-3e07-4503-a324-7c9e3b11f1dc_1600x711.png 848w, https://substackcdn.com/image/fetch/$s_!4OFf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40ac015f-3e07-4503-a324-7c9e3b11f1dc_1600x711.png 1272w, https://substackcdn.com/image/fetch/$s_!4OFf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40ac015f-3e07-4503-a324-7c9e3b11f1dc_1600x711.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4OFf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40ac015f-3e07-4503-a324-7c9e3b11f1dc_1600x711.png" width="1456" height="647" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/40ac015f-3e07-4503-a324-7c9e3b11f1dc_1600x711.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:647,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4OFf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40ac015f-3e07-4503-a324-7c9e3b11f1dc_1600x711.png 424w, https://substackcdn.com/image/fetch/$s_!4OFf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40ac015f-3e07-4503-a324-7c9e3b11f1dc_1600x711.png 848w, https://substackcdn.com/image/fetch/$s_!4OFf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40ac015f-3e07-4503-a324-7c9e3b11f1dc_1600x711.png 1272w, https://substackcdn.com/image/fetch/$s_!4OFf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40ac015f-3e07-4503-a324-7c9e3b11f1dc_1600x711.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">And adapting it to use vector registers with 128-bit stores and 64-bit loads</figcaption></figure></div><p>Store forwarding on the integer side works for all loads contained within a prior store. It&#8217;s an improvement over prior arm cores like the Cortex X2, which could only forward either half of a 64-bit store to a 32-bit load. Forwarding on the FP/vector side still works like older Arm cores, and only works for specific load alignments with respect to the store address. Unlike recent Intel and AMD cores, Cortex X925 can&#8217;t do zero latency forwarding when store and load addresses match exactly. To summarize store forwarding behavior:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8HYb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9484922-604a-45a9-b817-b4aca98f9bca_885x270.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8HYb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9484922-604a-45a9-b817-b4aca98f9bca_885x270.png 424w, https://substackcdn.com/image/fetch/$s_!8HYb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9484922-604a-45a9-b817-b4aca98f9bca_885x270.png 848w, https://substackcdn.com/image/fetch/$s_!8HYb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9484922-604a-45a9-b817-b4aca98f9bca_885x270.png 1272w, https://substackcdn.com/image/fetch/$s_!8HYb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9484922-604a-45a9-b817-b4aca98f9bca_885x270.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8HYb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9484922-604a-45a9-b817-b4aca98f9bca_885x270.png" width="885" height="270" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d9484922-604a-45a9-b817-b4aca98f9bca_885x270.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:270,&quot;width&quot;:885,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8HYb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9484922-604a-45a9-b817-b4aca98f9bca_885x270.png 424w, https://substackcdn.com/image/fetch/$s_!8HYb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9484922-604a-45a9-b817-b4aca98f9bca_885x270.png 848w, https://substackcdn.com/image/fetch/$s_!8HYb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9484922-604a-45a9-b817-b4aca98f9bca_885x270.png 1272w, https://substackcdn.com/image/fetch/$s_!8HYb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9484922-604a-45a9-b817-b4aca98f9bca_885x270.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Memory dependencies can&#8217;t be definitively determined until address translation finishes. Some cores do an early check before address translation completes, using bits of the address that represent an offset into a page. Cortex X925 might be doing that because it takes a barely measurable penalty when loads and stores appear to overlap in the low 12 bits.</p><h1>Core-Private Caches</h1><p>Cortex X925 has a 64 KB L1 data cache with 4 cycle latency like A725 companions in GB10, but takes advantage of its larger power and area budget to make that capacity go further. It uses a more sophisticated re-reference interval prediction (RRIP) replacement policy rather than the pseudo-LRU policy used on A725. Bandwidth is higher too. Arm&#8217;s technical reference manual says the L1D has &#8220;4x128-bit read paths and 4x128-bit write paths&#8221;. Sustaining more than two stores per cycle is impossible because the core only has two store-capable AGUs. Loads can use all four AGUs, and can achieve 64B/cycle from the L1 data cache. That&#8217;s competitive against many AVX2-capable x86-64 CPUs from a few generations ago. However, more recent Intel and AMD cores can use their wider vector width and faster clocks to achieve much higher L1D bandwidth, even if they also have four AGUs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xnas!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62dbd982-e8c3-47bf-af73-e0a5b8df7739_1600x731.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xnas!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62dbd982-e8c3-47bf-af73-e0a5b8df7739_1600x731.png 424w, https://substackcdn.com/image/fetch/$s_!xnas!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62dbd982-e8c3-47bf-af73-e0a5b8df7739_1600x731.png 848w, https://substackcdn.com/image/fetch/$s_!xnas!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62dbd982-e8c3-47bf-af73-e0a5b8df7739_1600x731.png 1272w, https://substackcdn.com/image/fetch/$s_!xnas!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62dbd982-e8c3-47bf-af73-e0a5b8df7739_1600x731.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xnas!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62dbd982-e8c3-47bf-af73-e0a5b8df7739_1600x731.png" width="1456" height="665" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/62dbd982-e8c3-47bf-af73-e0a5b8df7739_1600x731.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:665,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xnas!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62dbd982-e8c3-47bf-af73-e0a5b8df7739_1600x731.png 424w, https://substackcdn.com/image/fetch/$s_!xnas!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62dbd982-e8c3-47bf-af73-e0a5b8df7739_1600x731.png 848w, https://substackcdn.com/image/fetch/$s_!xnas!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62dbd982-e8c3-47bf-af73-e0a5b8df7739_1600x731.png 1272w, https://substackcdn.com/image/fetch/$s_!xnas!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62dbd982-e8c3-47bf-af73-e0a5b8df7739_1600x731.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Arm offers 2 MB 8-way and 3 MB 12-way L2 cache options. Mediatek and Nvidia chose the 2 MB option, and testing shows it has 12 cycles of latency. THis low cycle count latency lets Arm remain competitive against Intel and AMD&#8217;s L2 caches, despite running at lower clock speeds. L2 bandwidth comes in at 32 bytes per cycle for reads, and increases to approximately 45 bytes per cycle with a read-modify-write pattern.</p><p>Like AMD, Arm makes the L2 strictly inclusive of the L1 data cache, which lets the L2 act as a snoop filter. If an incoming snoop misses in the L2, the core can be sure it won&#8217;t hit in the L1D either.</p><h1>Performance: SPEC CPU2017</h1><p>Cortex X925 turns in an excellent performance in SPEC CPU2017&#8217;s integer suite. Its score in the integer suite is within margin of error compared to Intel and AMD&#8217;s highest performance cores, in their highest performance desktop configurations. AMD&#8217;s Zen 5 still slips ahead in SPEC&#8217;s floating point suite, but not by a large margin. Running Zen 5 with faster DDR5-6000 instead of DDR5-5600 memory can slightly increase its integer score, but only to 11.9 - not enough to pull away from X925. Creating a high performance core involves making the right tradeoff between frequency and performance per clock, while keeping everything else in balance. It&#8217;s safe to say Arm has found a combination of frequency and performance per clock that lets them compete with AMD and Intel&#8217;s best.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bHDx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6817eaf9-e0d2-45d2-964a-0eb5c7059f7e_1532x618.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bHDx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6817eaf9-e0d2-45d2-964a-0eb5c7059f7e_1532x618.png 424w, https://substackcdn.com/image/fetch/$s_!bHDx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6817eaf9-e0d2-45d2-964a-0eb5c7059f7e_1532x618.png 848w, https://substackcdn.com/image/fetch/$s_!bHDx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6817eaf9-e0d2-45d2-964a-0eb5c7059f7e_1532x618.png 1272w, https://substackcdn.com/image/fetch/$s_!bHDx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6817eaf9-e0d2-45d2-964a-0eb5c7059f7e_1532x618.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bHDx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6817eaf9-e0d2-45d2-964a-0eb5c7059f7e_1532x618.png" width="1456" height="587" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6817eaf9-e0d2-45d2-964a-0eb5c7059f7e_1532x618.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:587,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bHDx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6817eaf9-e0d2-45d2-964a-0eb5c7059f7e_1532x618.png 424w, https://substackcdn.com/image/fetch/$s_!bHDx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6817eaf9-e0d2-45d2-964a-0eb5c7059f7e_1532x618.png 848w, https://substackcdn.com/image/fetch/$s_!bHDx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6817eaf9-e0d2-45d2-964a-0eb5c7059f7e_1532x618.png 1272w, https://substackcdn.com/image/fetch/$s_!bHDx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6817eaf9-e0d2-45d2-964a-0eb5c7059f7e_1532x618.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>Diving deeper into individual workloads shows a complex picture. X925 trades blows with the higher clocked Intel and AMD cores in core-bound workloads. 548.exchange2 and 500.perlbench both show the advantages of clock speed scaling, with Intel and AMD&#8217;s higher clocking 8-wide cores easily outpacing Arm&#8217;s 4 GHz 10-wide one. But 525.x264 turns things around. Cortex X925 is able to finish that workload with fewer instructions than its x86-64 peers, while maintaining a large IPC advantage. X925 continues to do well in workloads that challenge the branch predictor, like 541.leela and 505.mcf. Finally, memory bound tests like 520.omnetpp are heavily influenced by factors outside the core.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hcSn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3a4d-2226-4712-a987-8b05dc6fda56_1508x852.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hcSn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3a4d-2226-4712-a987-8b05dc6fda56_1508x852.png 424w, https://substackcdn.com/image/fetch/$s_!hcSn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3a4d-2226-4712-a987-8b05dc6fda56_1508x852.png 848w, https://substackcdn.com/image/fetch/$s_!hcSn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3a4d-2226-4712-a987-8b05dc6fda56_1508x852.png 1272w, https://substackcdn.com/image/fetch/$s_!hcSn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3a4d-2226-4712-a987-8b05dc6fda56_1508x852.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hcSn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3a4d-2226-4712-a987-8b05dc6fda56_1508x852.png" width="1456" height="823" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2aef3a4d-2226-4712-a987-8b05dc6fda56_1508x852.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:823,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hcSn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3a4d-2226-4712-a987-8b05dc6fda56_1508x852.png 424w, https://substackcdn.com/image/fetch/$s_!hcSn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3a4d-2226-4712-a987-8b05dc6fda56_1508x852.png 848w, https://substackcdn.com/image/fetch/$s_!hcSn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3a4d-2226-4712-a987-8b05dc6fda56_1508x852.png 1272w, https://substackcdn.com/image/fetch/$s_!hcSn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3a4d-2226-4712-a987-8b05dc6fda56_1508x852.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Performance monitoring counter data shows how Cortex X925 uses high IPC to make up for a clock speed deficit. Whether it&#8217;s enough IPC to match Lion Cove and Zen 5 varies depending on the individual test, but overall Arm&#8217;s chosen IPC and clock speed targets are just as viable as Intel&#8217;s and AMD&#8217;s.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!grPY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3e1f5b2-26b8-49cf-9703-4e853e158d9a_1014x824.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!grPY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3e1f5b2-26b8-49cf-9703-4e853e158d9a_1014x824.png 424w, https://substackcdn.com/image/fetch/$s_!grPY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3e1f5b2-26b8-49cf-9703-4e853e158d9a_1014x824.png 848w, https://substackcdn.com/image/fetch/$s_!grPY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3e1f5b2-26b8-49cf-9703-4e853e158d9a_1014x824.png 1272w, https://substackcdn.com/image/fetch/$s_!grPY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3e1f5b2-26b8-49cf-9703-4e853e158d9a_1014x824.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!grPY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3e1f5b2-26b8-49cf-9703-4e853e158d9a_1014x824.png" width="1014" height="824" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e3e1f5b2-26b8-49cf-9703-4e853e158d9a_1014x824.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:824,&quot;width&quot;:1014,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!grPY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3e1f5b2-26b8-49cf-9703-4e853e158d9a_1014x824.png 424w, https://substackcdn.com/image/fetch/$s_!grPY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3e1f5b2-26b8-49cf-9703-4e853e158d9a_1014x824.png 848w, https://substackcdn.com/image/fetch/$s_!grPY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3e1f5b2-26b8-49cf-9703-4e853e158d9a_1014x824.png 1272w, https://substackcdn.com/image/fetch/$s_!grPY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3e1f5b2-26b8-49cf-9703-4e853e158d9a_1014x824.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>SPEC CPU2017&#8217;s floating point workloads throw a wrench into the works. Cortex X925 falls behind Zen 5 on enough tests to leave AMD&#8217;s latest and greatest core with a clear victory. In Arm&#8217;s favor, they are able to keep pace with Intel&#8217;s Lion Cove.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wVHt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90941f94-b47a-4d43-acce-d393cf78117a_1548x1294.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wVHt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90941f94-b47a-4d43-acce-d393cf78117a_1548x1294.png 424w, https://substackcdn.com/image/fetch/$s_!wVHt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90941f94-b47a-4d43-acce-d393cf78117a_1548x1294.png 848w, https://substackcdn.com/image/fetch/$s_!wVHt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90941f94-b47a-4d43-acce-d393cf78117a_1548x1294.png 1272w, https://substackcdn.com/image/fetch/$s_!wVHt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90941f94-b47a-4d43-acce-d393cf78117a_1548x1294.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wVHt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90941f94-b47a-4d43-acce-d393cf78117a_1548x1294.png" width="1456" height="1217" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90941f94-b47a-4d43-acce-d393cf78117a_1548x1294.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1217,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wVHt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90941f94-b47a-4d43-acce-d393cf78117a_1548x1294.png 424w, https://substackcdn.com/image/fetch/$s_!wVHt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90941f94-b47a-4d43-acce-d393cf78117a_1548x1294.png 848w, https://substackcdn.com/image/fetch/$s_!wVHt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90941f94-b47a-4d43-acce-d393cf78117a_1548x1294.png 1272w, https://substackcdn.com/image/fetch/$s_!wVHt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90941f94-b47a-4d43-acce-d393cf78117a_1548x1294.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>PMU data indicates X925 is able to maintain higher IPC relative to the competition. Unfortunately for X925, several tests require many more instructions to complete with the aarch64 instruction set compared to x86-64. Arm needs a large enough IPC advantage to overcome both a clock speed deficit and a less efficient representation of the work. Cortex X925 does achieve high IPC, but it&#8217;s not high enough.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UPRN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8369f3e-7af2-4f88-899f-448edebf3796_1016x908.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UPRN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8369f3e-7af2-4f88-899f-448edebf3796_1016x908.png 424w, https://substackcdn.com/image/fetch/$s_!UPRN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8369f3e-7af2-4f88-899f-448edebf3796_1016x908.png 848w, https://substackcdn.com/image/fetch/$s_!UPRN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8369f3e-7af2-4f88-899f-448edebf3796_1016x908.png 1272w, https://substackcdn.com/image/fetch/$s_!UPRN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8369f3e-7af2-4f88-899f-448edebf3796_1016x908.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UPRN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8369f3e-7af2-4f88-899f-448edebf3796_1016x908.png" width="1016" height="908" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8369f3e-7af2-4f88-899f-448edebf3796_1016x908.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:908,&quot;width&quot;:1016,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UPRN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8369f3e-7af2-4f88-899f-448edebf3796_1016x908.png 424w, https://substackcdn.com/image/fetch/$s_!UPRN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8369f3e-7af2-4f88-899f-448edebf3796_1016x908.png 848w, https://substackcdn.com/image/fetch/$s_!UPRN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8369f3e-7af2-4f88-899f-448edebf3796_1016x908.png 1272w, https://substackcdn.com/image/fetch/$s_!UPRN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8369f3e-7af2-4f88-899f-448edebf3796_1016x908.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>Plotting instruction counts shows just how severe the situation can get for Cortex X925. 507.cactuBSSN, 521.wrf, 549.fotonik3d, and 554.roms all require more instructions on X925, and by no small margin. 554.roms is the worst offender, and makes X925 execute more than twice as many instructions compared to Zen 5. Average IPC in these four tests is nowhere near core width for any of these tested cores, but crunching through extra instructions isn&#8217;t the only issue. Higher instruction counts place more pressure on core out-of-order resources, impacting its ability to hide latency.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3sEO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe698e1d2-eeb4-4c32-a92d-121a78a43ec1_1251x1022.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3sEO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe698e1d2-eeb4-4c32-a92d-121a78a43ec1_1251x1022.png 424w, https://substackcdn.com/image/fetch/$s_!3sEO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe698e1d2-eeb4-4c32-a92d-121a78a43ec1_1251x1022.png 848w, https://substackcdn.com/image/fetch/$s_!3sEO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe698e1d2-eeb4-4c32-a92d-121a78a43ec1_1251x1022.png 1272w, https://substackcdn.com/image/fetch/$s_!3sEO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe698e1d2-eeb4-4c32-a92d-121a78a43ec1_1251x1022.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3sEO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe698e1d2-eeb4-4c32-a92d-121a78a43ec1_1251x1022.png" width="1251" height="1022" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e698e1d2-eeb4-4c32-a92d-121a78a43ec1_1251x1022.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1022,&quot;width&quot;:1251,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3sEO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe698e1d2-eeb4-4c32-a92d-121a78a43ec1_1251x1022.png 424w, https://substackcdn.com/image/fetch/$s_!3sEO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe698e1d2-eeb4-4c32-a92d-121a78a43ec1_1251x1022.png 848w, https://substackcdn.com/image/fetch/$s_!3sEO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe698e1d2-eeb4-4c32-a92d-121a78a43ec1_1251x1022.png 1272w, https://substackcdn.com/image/fetch/$s_!3sEO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe698e1d2-eeb4-4c32-a92d-121a78a43ec1_1251x1022.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><h1>Final Words</h1><p>Arm now has a core with enough performance to take on not only laptop, but also desktop use cases. They&#8217;ve also shown it&#8217;s possible to deliver that performance at a modest 4 GHz clock speed. Arm achieved that by executing well on the fundamentals throughout the core pipeline. X925&#8217;s branch predictor is fast and state-of-the-art. Its out-of-order execution engine is truly gargantuan. Penalties are few, and tradeoffs appear well considered. There aren&#8217;t a lot of companies out there capable of building a core with this level of performance, so Arm has plenty to be proud of.</p><p>That said, getting a high performance core is only one piece of the puzzle. Gaming workloads are very important in the consumer space, and benefit more from a strong memory subsystem than high core throughput. A DSU variant with L3 capacity options greater than 32 MB could help in that area. X86-64&#8217;s strong software ecosystem is another challenge to tackle. And finally, Arm still relies on its partners to carry out its vision. I look forward to seeing Arm take on all of these challenges, while also iterating on their core line to keep pace as AMD and Intel improve their cores. Hopefully, extra competition will make better, more affordable CPUs for all of us.</p>]]></content:encoded></item><item><title><![CDATA[Arm’s Cortex A725 ft. Dell’s Pro Max with GB10]]></title><description><![CDATA[Arm&#8217;s 7-series cores started out as the company&#8217;s highest performance offerings.]]></description><link>https://chipsandcheese.com/p/arms-cortex-a725-ft-dells-pro-max</link><guid isPermaLink="false">https://chipsandcheese.com/p/arms-cortex-a725-ft-dells-pro-max</guid><dc:creator><![CDATA[Chester Lam]]></dc:creator><pubDate>Tue, 27 Jan 2026 18:38:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Nzio!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3019f6cd-994c-47c8-b2a9-52741b40158e_748x530.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Arm&#8217;s 7-series cores started out as the company&#8217;s highest performance offerings. After Arm introduced the performance oriented X-series, 7-series cores have increasingly moved into a density focused role. Like Intel&#8217;s E-Cores, Arm&#8217;s 7-series cores today derive their strength from numbers. A hybrid core setup can ideally use density-optimized cores to achieve high multithreaded performance at lower power and area costs than a uniform big-core design. The Cortex A725 is the latest in Arm&#8217;s 7-series, and arrives when Arm definitely wants a strong density optimized core. Arm would like SoC makers to license their cores rather than make their own. Arm probably hopes their big+little combination can compare well with Qualcomm&#8217;s custom cores. Beyond that, Arm&#8217;s efforts to expand into the x86-64 dominated laptop segment would benefit from a good density optimized core too.</p><p>Here, I&#8217;ll be looking into the Cortex A725 in Nvidia&#8217;s GB10. GB10 has ten A725 cores and ten X925 cores split across two clusters, with five of each core type in each. The A725 cores run at 2.8 GHz, while the high performance X925 cores reach 3.9 to 4 GHz. As covered before, one of GB10&#8217;s clusters has 8 MB of L3, while the other has 16 MB. GB10 will provide a look at A725&#8217;s core architecture, though implementation choices will obviously influence performance. For comparison, I&#8217;ll use older cores from Arm&#8217;s 7-series line as well as Intel&#8217;s recent Skymont E-Core.</p><h1>Acknowledgements</h1><p>A massive thanks to Dell for sending over two of their <a href="https://www.dell.com/en-us/shop/desktop-computers/dell-pro-max-with-gb10/spd/dell-pro-max-fcm1253-micro">Pro Max with GB10</a> for testing. In our testing the <a href="https://www.dell.com/en-us/shop/desktop-computers/dell-pro-max-with-gb10/spd/dell-pro-max-fcm1253-micro">Dell Pro Maxs</a> were quite quiet even when under a Linpack load which speaks to Dell&#8217;s thermal design keeping the GB10 SoC within reasonable levels.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Nzio!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3019f6cd-994c-47c8-b2a9-52741b40158e_748x530.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Nzio!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3019f6cd-994c-47c8-b2a9-52741b40158e_748x530.png 424w, https://substackcdn.com/image/fetch/$s_!Nzio!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3019f6cd-994c-47c8-b2a9-52741b40158e_748x530.png 848w, https://substackcdn.com/image/fetch/$s_!Nzio!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3019f6cd-994c-47c8-b2a9-52741b40158e_748x530.png 1272w, https://substackcdn.com/image/fetch/$s_!Nzio!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3019f6cd-994c-47c8-b2a9-52741b40158e_748x530.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Nzio!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3019f6cd-994c-47c8-b2a9-52741b40158e_748x530.png" width="748" height="530" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3019f6cd-994c-47c8-b2a9-52741b40158e_748x530.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:530,&quot;width&quot;:748,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Nzio!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3019f6cd-994c-47c8-b2a9-52741b40158e_748x530.png 424w, https://substackcdn.com/image/fetch/$s_!Nzio!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3019f6cd-994c-47c8-b2a9-52741b40158e_748x530.png 848w, https://substackcdn.com/image/fetch/$s_!Nzio!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3019f6cd-994c-47c8-b2a9-52741b40158e_748x530.png 1272w, https://substackcdn.com/image/fetch/$s_!Nzio!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3019f6cd-994c-47c8-b2a9-52741b40158e_748x530.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The two <a href="https://www.dell.com/en-us/shop/desktop-computers/dell-pro-max-with-gb10/spd/dell-pro-max-fcm1253-micro">Dell Pro Max with GB10s</a> that Dell sent over stacked one on top of each other.</figcaption></figure></div><h1>Overview</h1><p>Cortex A725 is a 5-wide out-of-order core with reordering capacity roughly on par with Intel&#8217;s Skylake or AMD&#8217;s Zen 2. Its execution engine is therefore large enough to dip deep into the benefits of out-of-order execution, but stops well short of high performance core territory. The core connects to the rest of the system via Arm&#8217;s DynamIQ Shared Unit 120 (DSU-120) via 256-bit read and write paths. Arm&#8217;s DSU-120 acts as a cluster level interconnect and contains the shared L3 cache.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ru2v!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb91a0e4-2aa0-4633-88c8-9a757e0702c9_886x612.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ru2v!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb91a0e4-2aa0-4633-88c8-9a757e0702c9_886x612.png 424w, https://substackcdn.com/image/fetch/$s_!ru2v!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb91a0e4-2aa0-4633-88c8-9a757e0702c9_886x612.png 848w, https://substackcdn.com/image/fetch/$s_!ru2v!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb91a0e4-2aa0-4633-88c8-9a757e0702c9_886x612.png 1272w, https://substackcdn.com/image/fetch/$s_!ru2v!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb91a0e4-2aa0-4633-88c8-9a757e0702c9_886x612.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ru2v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb91a0e4-2aa0-4633-88c8-9a757e0702c9_886x612.png" width="886" height="612" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db91a0e4-2aa0-4633-88c8-9a757e0702c9_886x612.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:612,&quot;width&quot;:886,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ru2v!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb91a0e4-2aa0-4633-88c8-9a757e0702c9_886x612.png 424w, https://substackcdn.com/image/fetch/$s_!ru2v!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb91a0e4-2aa0-4633-88c8-9a757e0702c9_886x612.png 848w, https://substackcdn.com/image/fetch/$s_!ru2v!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb91a0e4-2aa0-4633-88c8-9a757e0702c9_886x612.png 1272w, https://substackcdn.com/image/fetch/$s_!ru2v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb91a0e4-2aa0-4633-88c8-9a757e0702c9_886x612.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In Arm tradition, A725 offers a number of configuration options to let implementers fine tune area, power, and performance. These options include:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wT3H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629af156-1fbd-41be-9f72-de7982652d96_1558x415.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wT3H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629af156-1fbd-41be-9f72-de7982652d96_1558x415.png 424w, https://substackcdn.com/image/fetch/$s_!wT3H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629af156-1fbd-41be-9f72-de7982652d96_1558x415.png 848w, https://substackcdn.com/image/fetch/$s_!wT3H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629af156-1fbd-41be-9f72-de7982652d96_1558x415.png 1272w, https://substackcdn.com/image/fetch/$s_!wT3H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629af156-1fbd-41be-9f72-de7982652d96_1558x415.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wT3H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629af156-1fbd-41be-9f72-de7982652d96_1558x415.png" width="1456" height="388" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/629af156-1fbd-41be-9f72-de7982652d96_1558x415.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:388,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wT3H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629af156-1fbd-41be-9f72-de7982652d96_1558x415.png 424w, https://substackcdn.com/image/fetch/$s_!wT3H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629af156-1fbd-41be-9f72-de7982652d96_1558x415.png 848w, https://substackcdn.com/image/fetch/$s_!wT3H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629af156-1fbd-41be-9f72-de7982652d96_1558x415.png 1272w, https://substackcdn.com/image/fetch/$s_!wT3H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629af156-1fbd-41be-9f72-de7982652d96_1558x415.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Branch Predictor</h1><p>Cortex A710&#8217;s branch predictor already showed capabilities on par with older Intel high performance cores, and A725 inherits that strength. In a test where branch direction correlates with increasingly long random patterns, A725 trades blows with A715 rather than being an overall improvement. With a single branch, latencies start going up once the random pattern exceeds 2048 for a single branch, versus 3072 on A710. With 512 branches in play, both A710 and A725 see higher latencies once the pattern length exceeds 32. However, A725 shows a gradual degradation compared to the abrupt one seen on A710.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QGFB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3810838b-d6c5-4a4a-99fc-0ee32156d898_1476x670.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QGFB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3810838b-d6c5-4a4a-99fc-0ee32156d898_1476x670.png 424w, https://substackcdn.com/image/fetch/$s_!QGFB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3810838b-d6c5-4a4a-99fc-0ee32156d898_1476x670.png 848w, https://substackcdn.com/image/fetch/$s_!QGFB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3810838b-d6c5-4a4a-99fc-0ee32156d898_1476x670.png 1272w, https://substackcdn.com/image/fetch/$s_!QGFB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3810838b-d6c5-4a4a-99fc-0ee32156d898_1476x670.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QGFB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3810838b-d6c5-4a4a-99fc-0ee32156d898_1476x670.png" width="1456" height="661" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3810838b-d6c5-4a4a-99fc-0ee32156d898_1476x670.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:661,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QGFB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3810838b-d6c5-4a4a-99fc-0ee32156d898_1476x670.png 424w, https://substackcdn.com/image/fetch/$s_!QGFB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3810838b-d6c5-4a4a-99fc-0ee32156d898_1476x670.png 848w, https://substackcdn.com/image/fetch/$s_!QGFB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3810838b-d6c5-4a4a-99fc-0ee32156d898_1476x670.png 1272w, https://substackcdn.com/image/fetch/$s_!QGFB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3810838b-d6c5-4a4a-99fc-0ee32156d898_1476x670.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2bpL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2bpL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png 424w, https://substackcdn.com/image/fetch/$s_!2bpL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png 848w, https://substackcdn.com/image/fetch/$s_!2bpL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png 1272w, https://substackcdn.com/image/fetch/$s_!2bpL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2bpL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png" width="1160" height="690" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:690,&quot;width&quot;:1160,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;https://substackcdn.com/image/fetch/$s_!2bpL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="https://substackcdn.com/image/fetch/$s_!2bpL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png" title="https://substackcdn.com/image/fetch/$s_!2bpL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png" srcset="https://substackcdn.com/image/fetch/$s_!2bpL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png 424w, https://substackcdn.com/image/fetch/$s_!2bpL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png 848w, https://substackcdn.com/image/fetch/$s_!2bpL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png 1272w, https://substackcdn.com/image/fetch/$s_!2bpL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c82e8c5-ef26-490f-83d9-cf0ec6ab4d93_1160x690.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Arm&#8217;s 7-series cores historically used branch target buffers (BTBs) with very low cycle count latencies, which helped make up for their low clock speed. A725&#8217;s BTB setup is fine on its own, but regresses compared to A710. Each BTB level is smaller compared to A710, and the larger BTB levels are slower too.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!t2Jx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0690a4ac-dd23-4252-8b20-b875843df5d2_1447x705.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!t2Jx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0690a4ac-dd23-4252-8b20-b875843df5d2_1447x705.png 424w, https://substackcdn.com/image/fetch/$s_!t2Jx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0690a4ac-dd23-4252-8b20-b875843df5d2_1447x705.png 848w, https://substackcdn.com/image/fetch/$s_!t2Jx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0690a4ac-dd23-4252-8b20-b875843df5d2_1447x705.png 1272w, https://substackcdn.com/image/fetch/$s_!t2Jx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0690a4ac-dd23-4252-8b20-b875843df5d2_1447x705.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!t2Jx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0690a4ac-dd23-4252-8b20-b875843df5d2_1447x705.png" width="1447" height="705" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0690a4ac-dd23-4252-8b20-b875843df5d2_1447x705.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:705,&quot;width&quot;:1447,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!t2Jx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0690a4ac-dd23-4252-8b20-b875843df5d2_1447x705.png 424w, https://substackcdn.com/image/fetch/$s_!t2Jx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0690a4ac-dd23-4252-8b20-b875843df5d2_1447x705.png 848w, https://substackcdn.com/image/fetch/$s_!t2Jx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0690a4ac-dd23-4252-8b20-b875843df5d2_1447x705.png 1272w, https://substackcdn.com/image/fetch/$s_!t2Jx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0690a4ac-dd23-4252-8b20-b875843df5d2_1447x705.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A725&#8217;s taken branch performance trades blows with Skymont. Tiny loops favor A725, because Skymont can&#8217;t do two taken branches per cycle. Moderate branch counts give Skymont a win, because Intel&#8217;s 1024 entry L1 BTB has single cycle latency. Getting a target out of A725&#8217;s 512 entry L1 BTB adds a pipeline bubble. Then, both cores have similar cycle count latencies to a 8192 entry L2 BTB. However, that favors Skymont because of its higher clock speed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!urvO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53003b81-680b-423d-95d6-49c3e0b1f660_1393x699.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!urvO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53003b81-680b-423d-95d6-49c3e0b1f660_1393x699.png 424w, https://substackcdn.com/image/fetch/$s_!urvO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53003b81-680b-423d-95d6-49c3e0b1f660_1393x699.png 848w, https://substackcdn.com/image/fetch/$s_!urvO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53003b81-680b-423d-95d6-49c3e0b1f660_1393x699.png 1272w, https://substackcdn.com/image/fetch/$s_!urvO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53003b81-680b-423d-95d6-49c3e0b1f660_1393x699.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!urvO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53003b81-680b-423d-95d6-49c3e0b1f660_1393x699.png" width="1393" height="699" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/53003b81-680b-423d-95d6-49c3e0b1f660_1393x699.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:699,&quot;width&quot;:1393,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!urvO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53003b81-680b-423d-95d6-49c3e0b1f660_1393x699.png 424w, https://substackcdn.com/image/fetch/$s_!urvO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53003b81-680b-423d-95d6-49c3e0b1f660_1393x699.png 848w, https://substackcdn.com/image/fetch/$s_!urvO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53003b81-680b-423d-95d6-49c3e0b1f660_1393x699.png 1272w, https://substackcdn.com/image/fetch/$s_!urvO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53003b81-680b-423d-95d6-49c3e0b1f660_1393x699.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Returns, or branch-with-link instructions in Arm terms, are predicted on A725 using a 16 entry return stack. A call+return pair has a latency of two cycles, so each branch likely executes with single cycle latency.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dv61!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c6aebc-323d-4bb9-b6d3-2c1c17edf7ec_1488x821.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dv61!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c6aebc-323d-4bb9-b6d3-2c1c17edf7ec_1488x821.png 424w, https://substackcdn.com/image/fetch/$s_!dv61!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c6aebc-323d-4bb9-b6d3-2c1c17edf7ec_1488x821.png 848w, https://substackcdn.com/image/fetch/$s_!dv61!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c6aebc-323d-4bb9-b6d3-2c1c17edf7ec_1488x821.png 1272w, https://substackcdn.com/image/fetch/$s_!dv61!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c6aebc-323d-4bb9-b6d3-2c1c17edf7ec_1488x821.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dv61!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c6aebc-323d-4bb9-b6d3-2c1c17edf7ec_1488x821.png" width="1456" height="803" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/73c6aebc-323d-4bb9-b6d3-2c1c17edf7ec_1488x821.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:803,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dv61!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c6aebc-323d-4bb9-b6d3-2c1c17edf7ec_1488x821.png 424w, https://substackcdn.com/image/fetch/$s_!dv61!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c6aebc-323d-4bb9-b6d3-2c1c17edf7ec_1488x821.png 848w, https://substackcdn.com/image/fetch/$s_!dv61!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c6aebc-323d-4bb9-b6d3-2c1c17edf7ec_1488x821.png 1272w, https://substackcdn.com/image/fetch/$s_!dv61!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c6aebc-323d-4bb9-b6d3-2c1c17edf7ec_1488x821.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Arm cores tested with march=armv8 to make comparisons easier</figcaption></figure></div><p>In SPEC CPU2017, A725 holds the line in workloads that are easy on the branch predictor while sometimes delivering improvements in difficult tests. 541.leela is the most impressive case. In that workload. A725 manages a 6.61% decrease in mispredicts per instruction compared to Neoverse N2. Neoverse N2 is a close relative of Cortex A710, and was tested here in Azure&#8217;s Cobalt 100. A725&#8217;s gains are less clear in SPEC CPU2017&#8217;s floating point workloads, where it trades blows with Neoverse N2. Still, it&#8217;s an overall win for A725 because SPEC&#8217;s floating point workloads are less branchy and put more emphasis on core throughput.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SEbx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e03f3bf-ee70-40ee-bab8-021e9a6c9294_1486x1260.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SEbx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e03f3bf-ee70-40ee-bab8-021e9a6c9294_1486x1260.png 424w, https://substackcdn.com/image/fetch/$s_!SEbx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e03f3bf-ee70-40ee-bab8-021e9a6c9294_1486x1260.png 848w, https://substackcdn.com/image/fetch/$s_!SEbx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e03f3bf-ee70-40ee-bab8-021e9a6c9294_1486x1260.png 1272w, https://substackcdn.com/image/fetch/$s_!SEbx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e03f3bf-ee70-40ee-bab8-021e9a6c9294_1486x1260.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SEbx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e03f3bf-ee70-40ee-bab8-021e9a6c9294_1486x1260.png" width="1456" height="1235" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e03f3bf-ee70-40ee-bab8-021e9a6c9294_1486x1260.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1235,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SEbx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e03f3bf-ee70-40ee-bab8-021e9a6c9294_1486x1260.png 424w, https://substackcdn.com/image/fetch/$s_!SEbx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e03f3bf-ee70-40ee-bab8-021e9a6c9294_1486x1260.png 848w, https://substackcdn.com/image/fetch/$s_!SEbx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e03f3bf-ee70-40ee-bab8-021e9a6c9294_1486x1260.png 1272w, https://substackcdn.com/image/fetch/$s_!SEbx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e03f3bf-ee70-40ee-bab8-021e9a6c9294_1486x1260.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Comparing with Skymont is more difficult because there&#8217;s no way to match the instruction stream due to ISA differences. It&#8217;s safe to say the two cores land in the same ballpark, and both have excellent branch predictors.</p><h1>Instruction Fetch and Decode</h1><p>After the branch predictor has decided where to go, the frontend has to fetch instructions and decode them into the core&#8217;s internal format. Arm&#8217;s cores decode instructions into macro-operations (MOPs), which can be split further into micro-operations (uOPs) in later pipeline stages. Older Arm cores could cache MOPs in a MOP cache, much like AMD and Intel&#8217;s high performance CPUs. A725 drops the MOP cache and relies exclusively on a conventional fetch and decode path for instruction delivery. The fetch and decode path can deliver 5 MOPs per cycle, which can correspond to 6 instructions if the frontend can fuse a pair of instructions into a single MOP. With a stream of NOPs, A725&#8217;s frontend delivers five instructions per cycle until test sizes spill out of the 32 entry instruction TLB. Frontend bandwidth then drops to under 2 IPC, and stays there until the test spills out of L3.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y3EY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131667eb-8426-42b2-b2c9-d36c5c3889c6_1600x740.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y3EY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131667eb-8426-42b2-b2c9-d36c5c3889c6_1600x740.png 424w, https://substackcdn.com/image/fetch/$s_!Y3EY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131667eb-8426-42b2-b2c9-d36c5c3889c6_1600x740.png 848w, https://substackcdn.com/image/fetch/$s_!Y3EY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131667eb-8426-42b2-b2c9-d36c5c3889c6_1600x740.png 1272w, https://substackcdn.com/image/fetch/$s_!Y3EY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131667eb-8426-42b2-b2c9-d36c5c3889c6_1600x740.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y3EY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131667eb-8426-42b2-b2c9-d36c5c3889c6_1600x740.png" width="1456" height="673" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/131667eb-8426-42b2-b2c9-d36c5c3889c6_1600x740.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:673,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y3EY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131667eb-8426-42b2-b2c9-d36c5c3889c6_1600x740.png 424w, https://substackcdn.com/image/fetch/$s_!Y3EY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131667eb-8426-42b2-b2c9-d36c5c3889c6_1600x740.png 848w, https://substackcdn.com/image/fetch/$s_!Y3EY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131667eb-8426-42b2-b2c9-d36c5c3889c6_1600x740.png 1272w, https://substackcdn.com/image/fetch/$s_!Y3EY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131667eb-8426-42b2-b2c9-d36c5c3889c6_1600x740.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For comparison, A710 could use its MOP cache to hit 10 IPC if each MOP corresponds to a fused instruction pair. That&#8217;s easy to see with a stream of NOPs, because A710 can fuse pairs of NOPs into a single MOP. A725&#8217;s decoders can no longer fuse pairs of adjacent NOPs, though they still support other fusion cases like CMP+branch.</p><p>Arm&#8217;s decision to ditch the MOP cache deserves discussion. I think it was long overdue, because Arm cores already took plenty of measures to mitigate decode costs. Chief among these is a predecode scheme. Instructions are partially decoded when they&#8217;re filled into the instruction cache. L1I caches on cores up to the Cortex A78 and Cortex X1 stored instructions in a<a href="https://developer.arm.com/documentation/ka001493/latest/"> 36 or 40-bit intermediate format</a>. It&#8217;s a compromise between area-hungry micro-op caching and fully decoding instructions with every fetch. Combining predecode and micro-op caching is overkill, especially on low clocked cores where timing should be easier. Seeing A710 hit ridiculous throughput in an instruction bandwidth microbenchmark was fun, but it&#8217;s a corner case. Arm likely determined that the core rarely encountered more than one instruction fusion candidate per cycle, and I suspect they&#8217;re right.</p><p><a href="https://developer.arm.com/documentation/107652/0002/AArch64-registers/AArch64-Generic-System-Control-registers-summary/IMP-ISIDE-DATA1-EL3--RAMINDEX-Instruction-Data-register-1?lang=en">A725&#8217;s Technical Reference Manual</a> suggests Arm uses a different predecode scheme now, with 5 bits of &#8220;sideband&#8221; predecode data stored alongside each 32-bit aarch64 instruction. Sideband bit patterns indicate a valid opcode. Older Arm CPUs used the intermediate format to deal with the large and non-contiguous undefined instruction space. A725&#8217;s sideband bits likely cover the same purpose, along with aiding the decoders for valid instructions.</p><h1>Rename and Allocate</h1><p>MOPs from the frontend go to the rename/allocate stage, which can handle five MOPs per cycle. Besides doing classical register renaming to remove false dependencies, modern CPUs can pull a variety of tricks to expose additional parallelism. A725 can recognize moving an immediate value of 0 to a register as a zeroing idiom, and doesn&#8217;t need to allocate a physical register to hold that instruction&#8217;s result. Move elimination is supported too, though A725 can&#8217;t eliminate chained MOVs across all renamer slots like AMD and Intel&#8217;s recent cores. In a test with independent MOVs, A725 allocates a physical register for every instruction. Arm&#8217;s move elimination possibly fails if the renamer encounters too many MOVs in the same cycle. That&#8217;s what happens on Intel&#8217;s Haswell, but I couldn&#8217;t verify it on A725 because Arm doesn&#8217;t have performance events to account for move elimination efficacy.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Sng1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21dfa210-676b-41f3-a9b6-0f67155ce05a_1562x299.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Sng1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21dfa210-676b-41f3-a9b6-0f67155ce05a_1562x299.png 424w, https://substackcdn.com/image/fetch/$s_!Sng1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21dfa210-676b-41f3-a9b6-0f67155ce05a_1562x299.png 848w, https://substackcdn.com/image/fetch/$s_!Sng1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21dfa210-676b-41f3-a9b6-0f67155ce05a_1562x299.png 1272w, https://substackcdn.com/image/fetch/$s_!Sng1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21dfa210-676b-41f3-a9b6-0f67155ce05a_1562x299.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Sng1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21dfa210-676b-41f3-a9b6-0f67155ce05a_1562x299.png" width="1456" height="279" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21dfa210-676b-41f3-a9b6-0f67155ce05a_1562x299.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:279,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Sng1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21dfa210-676b-41f3-a9b6-0f67155ce05a_1562x299.png 424w, https://substackcdn.com/image/fetch/$s_!Sng1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21dfa210-676b-41f3-a9b6-0f67155ce05a_1562x299.png 848w, https://substackcdn.com/image/fetch/$s_!Sng1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21dfa210-676b-41f3-a9b6-0f67155ce05a_1562x299.png 1272w, https://substackcdn.com/image/fetch/$s_!Sng1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21dfa210-676b-41f3-a9b6-0f67155ce05a_1562x299.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Memory renaming is another form of move elimination, where the CPU carries out zero-cycle store forwarding if both the load and store use the same address register. Intel first did this on Ice Lake, and AMD first did so on Zen 2. Skymont has memory renaming too. I tested for but didn&#8217;t observe memory renaming on A725.</p><h1>Out-of-Order Execution Engine</h1><p>Arm historically took a conservative approach to growing reordering capacity on their 7-series cores. A larger reordering window helps the core run further ahead of a stalled instruction to extract instruction level parallelism. Increasing the reordering window can improve performance, but runs into diminishing returns. A725 gets a significant jump in reorder buffer capacity, from 160 to 224 entries. Reordering capacity is limited by whichever structure fills first, so sizing up other structures is necessary to make the most of the ROB size increase.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HBgI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d54b21b-3258-4d19-bb62-cce13b28f936_1600x533.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HBgI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d54b21b-3258-4d19-bb62-cce13b28f936_1600x533.png 424w, https://substackcdn.com/image/fetch/$s_!HBgI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d54b21b-3258-4d19-bb62-cce13b28f936_1600x533.png 848w, https://substackcdn.com/image/fetch/$s_!HBgI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d54b21b-3258-4d19-bb62-cce13b28f936_1600x533.png 1272w, https://substackcdn.com/image/fetch/$s_!HBgI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d54b21b-3258-4d19-bb62-cce13b28f936_1600x533.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HBgI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d54b21b-3258-4d19-bb62-cce13b28f936_1600x533.png" width="1456" height="485" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5d54b21b-3258-4d19-bb62-cce13b28f936_1600x533.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:485,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HBgI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d54b21b-3258-4d19-bb62-cce13b28f936_1600x533.png 424w, https://substackcdn.com/image/fetch/$s_!HBgI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d54b21b-3258-4d19-bb62-cce13b28f936_1600x533.png 848w, https://substackcdn.com/image/fetch/$s_!HBgI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d54b21b-3258-4d19-bb62-cce13b28f936_1600x533.png 1272w, https://substackcdn.com/image/fetch/$s_!HBgI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d54b21b-3258-4d19-bb62-cce13b28f936_1600x533.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A725 gets a larger integer register file and deeper memory ordering queues, but not all structures get the same treatment. The FP/vector register file has more entries, but has fewer renames available for 128-bit vector results. FP/vector register file entries are likely 64-bit now, compared to 128-bit from before. SVE predicate registers take a slight decrease too. Vector execution tends to be area hungry and only benefits a subset of applications, so Arm seems to be de-prioritizing vector execution to favor density.</p><h1>Instruction Scheduling and Execution</h1><p>A725&#8217;s execution pipe layout looks a lot like A710&#8217;s. Four pipes service integer operations, and are each fed by a 20 entry scheduling queue. Two separate ports handle branches. I didn&#8217;t measure an increase in scheduling capacity when mixing branches and integer adds, so the branch ports likely share scheduling queues with the ALU ports. On A710, one of the pipes could only handle multi-cycle operations like madd. That changes with A725, and all four pipes can handle simple, single cycle operations. Integer multiplication remains a strong point in Arm&#8217;s 7-series architecture. A725 can complete two integer multiplies per cycle, and achieves two cycle latency for integer multiplies.</p><p>Two pipes handle floating point and vector operations. Arm has used a pair of mostly symmetrical FP/vector pipes dating back to the Cortex A57. A725 is a very different core, but a dual pipe FPU still looks like a decent balance for a density optimized core. Each FP/vector pipe is fed by a 16 entry scheduling queue, and both share a non-scheduling queue with approximately 23 entries. A single pipe can use up to 18 entries in the non-scheduling queue after its scheduler fills.</p><p>Compared to A710, A725 shrinks the FP scheduling queues and compensates with a larger non-scheduling queue. A non-scheduling queue is simpler than a scheduling one because it doesn&#8217;t have to check whether operations are ready to execute. It&#8217;s simply an in-order queue that sends MOPs to the scheduler when entries free up. A725 can&#8217;t examine as many FP/vector MOPs for execution ready-ness as A710 can, but retains similar ability to look past FP/vector operations in the instruction stream to find other independent operations. Re-balancing entries between the schedulers and non-scheduling queue is likely a power and area optimization.</p><h1>Load/Store</h1><p>A725 has a triple AGU pipe setup for memory address calculation, and seems to feed each pipe with a 16 entry scheduling queue. All three AGUs can handle loads, and two of them can handle stores. Indexed addressing carries no latency penalty.</p><p>Addresses from the AGU pipes head to the load/store unit, which ensures proper ordering for memory operations and carries out address translation to support virtual memory. Like prior Arm cores, A725 has a fast path for forwarding either half of a 64-bit store to a subsequent 32-bit load. This fast forwarding has 5 cycle latency per dependent store+load pair. Any other alignment gives 11 cycle latency, likely with the CPU blocking the load until the store can retire.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eWQH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea83fdd-1db3-4eac-937c-af1b1d816d97_1600x722.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eWQH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea83fdd-1db3-4eac-937c-af1b1d816d97_1600x722.png 424w, https://substackcdn.com/image/fetch/$s_!eWQH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea83fdd-1db3-4eac-937c-af1b1d816d97_1600x722.png 848w, https://substackcdn.com/image/fetch/$s_!eWQH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea83fdd-1db3-4eac-937c-af1b1d816d97_1600x722.png 1272w, https://substackcdn.com/image/fetch/$s_!eWQH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea83fdd-1db3-4eac-937c-af1b1d816d97_1600x722.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eWQH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea83fdd-1db3-4eac-937c-af1b1d816d97_1600x722.png" width="1456" height="657" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aea83fdd-1db3-4eac-937c-af1b1d816d97_1600x722.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:657,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eWQH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea83fdd-1db3-4eac-937c-af1b1d816d97_1600x722.png 424w, https://substackcdn.com/image/fetch/$s_!eWQH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea83fdd-1db3-4eac-937c-af1b1d816d97_1600x722.png 848w, https://substackcdn.com/image/fetch/$s_!eWQH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea83fdd-1db3-4eac-937c-af1b1d816d97_1600x722.png 1272w, https://substackcdn.com/image/fetch/$s_!eWQH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faea83fdd-1db3-4eac-937c-af1b1d816d97_1600x722.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A725 is more sensitive to store alignment than A710, and seems to handle stores in 32 byte blocks. Store throughput decreases when crossing a 32B boundary, and takes another hit if the store crosses a 64B boundary as well. A710 takes the same penalty as A725 does at a 64B boundary, and no penalty at a 32B boundary.</p><h2>Address Translation</h2><p>Arm grew the L1 DTLB from 32 entries in A710 to 48 entries in A725, and keeps it fully associative. DTLB coverage thus increases from 128 to 192 KB. Curiously, Arm did the reverse on the instruction side and shrank A710&#8217;s 48 entry instruction TLB to 32 entries on A725. Applications typically have larger data footprints than code footprints, so shuffling TLB entries to the data side could help performance. Intel&#8217;s Skymont takes the opposite approach, with a 48 entry DTLB and a 128 entry ITLB.</p><p>A725&#8217;s L2 TLB has 1536 entries and is 6-way set associative, or can have 1024 entries and be 4-way set associative in a reduced area configuration. The standard option is an upgrade over the 1024 entry 4-way structure in A710, while the area reduced one is no worse. Hitting the L2 TLB adds 5 cycles of latency over a L1 DTLB hit.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5STw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad91c97-99a0-4b76-9699-381f07ae136b_1600x715.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5STw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad91c97-99a0-4b76-9699-381f07ae136b_1600x715.png 424w, https://substackcdn.com/image/fetch/$s_!5STw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad91c97-99a0-4b76-9699-381f07ae136b_1600x715.png 848w, https://substackcdn.com/image/fetch/$s_!5STw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad91c97-99a0-4b76-9699-381f07ae136b_1600x715.png 1272w, https://substackcdn.com/image/fetch/$s_!5STw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad91c97-99a0-4b76-9699-381f07ae136b_1600x715.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5STw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad91c97-99a0-4b76-9699-381f07ae136b_1600x715.png" width="1456" height="651" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cad91c97-99a0-4b76-9699-381f07ae136b_1600x715.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:651,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5STw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad91c97-99a0-4b76-9699-381f07ae136b_1600x715.png 424w, https://substackcdn.com/image/fetch/$s_!5STw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad91c97-99a0-4b76-9699-381f07ae136b_1600x715.png 848w, https://substackcdn.com/image/fetch/$s_!5STw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad91c97-99a0-4b76-9699-381f07ae136b_1600x715.png 1272w, https://substackcdn.com/image/fetch/$s_!5STw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad91c97-99a0-4b76-9699-381f07ae136b_1600x715.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Linearly traversing an array while monitoring performance events (L2D_TLB_REFILL)</figcaption></figure></div><p>Intel&#8217;s Skymont has a much larger 4096 entry L2 TLB, but A725 can make up some of the difference by tracking multiple 4 KB pages with a single TLB entry. Arm&#8217;s Technical Reference Manual describes a &#8220;MMU coalescing&#8221; setting that can be toggled via a control register. The register lists &#8220;8x32&#8221; and &#8220;4x2048&#8221; modes, but Arm doesn&#8217;t go into further detail about what they mean. Arm is most likely doing something similar to AMD&#8217;s &#8220;page smashing&#8221;. Cores in AMD&#8217;s Zen line can &#8220;smash&#8221; together pages that are consecutive in virtual and physical address spaces and have the same attributes. &#8220;Smashed&#8221; pages act like larger pages and opportunistically extend TLB coverage, while letting the CPU retain the flexibility and compatibility benefits of a smaller page size. &#8220;8x32&#8221; likely means A725 can combine eight consecutive 4 KB pages to let a TLB entry cover 32 KB. For comparison, AMD&#8217;s Zen 5 can combine four consecutive 4 KB pages to cover 16 KB.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vOM8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F793676bc-df54-413a-bf9a-f4b3f1b4647f_1600x537.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vOM8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F793676bc-df54-413a-bf9a-f4b3f1b4647f_1600x537.png 424w, https://substackcdn.com/image/fetch/$s_!vOM8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F793676bc-df54-413a-bf9a-f4b3f1b4647f_1600x537.png 848w, https://substackcdn.com/image/fetch/$s_!vOM8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F793676bc-df54-413a-bf9a-f4b3f1b4647f_1600x537.png 1272w, https://substackcdn.com/image/fetch/$s_!vOM8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F793676bc-df54-413a-bf9a-f4b3f1b4647f_1600x537.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vOM8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F793676bc-df54-413a-bf9a-f4b3f1b4647f_1600x537.png" width="1456" height="489" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/793676bc-df54-413a-bf9a-f4b3f1b4647f_1600x537.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:489,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vOM8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F793676bc-df54-413a-bf9a-f4b3f1b4647f_1600x537.png 424w, https://substackcdn.com/image/fetch/$s_!vOM8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F793676bc-df54-413a-bf9a-f4b3f1b4647f_1600x537.png 848w, https://substackcdn.com/image/fetch/$s_!vOM8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F793676bc-df54-413a-bf9a-f4b3f1b4647f_1600x537.png 1272w, https://substackcdn.com/image/fetch/$s_!vOM8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F793676bc-df54-413a-bf9a-f4b3f1b4647f_1600x537.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">From Arm&#8217;s A725 Technical Reference Manual</figcaption></figure></div><h1>Core-Private Caching</h1><p>A725 has a 64 KB 4-way set associative data cache. It&#8217;s divided into 16 banks to service multiple accesses, and achieves 4 cycle load-to-use latency. The L1D uses a pseudo-LRU replacement scheme, and like the L1D on many other CPUs, is virtually indexed and physically tagged (VIPT). Bandwidth-wise, the L1D can service three loads per cycle, all of which can be 128-bit vector loads. Stores go through at two per cycle. Curiously, Arm&#8217;s Technical Reference Manual says the L1D has four 64-bit write paths. However, sustaining four store instructions per cycle would be impossible with only two store AGUs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T2eJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795c1db8-3e0a-41ef-a887-697a5dac848f_1139x538.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T2eJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795c1db8-3e0a-41ef-a887-697a5dac848f_1139x538.png 424w, https://substackcdn.com/image/fetch/$s_!T2eJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795c1db8-3e0a-41ef-a887-697a5dac848f_1139x538.png 848w, https://substackcdn.com/image/fetch/$s_!T2eJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795c1db8-3e0a-41ef-a887-697a5dac848f_1139x538.png 1272w, https://substackcdn.com/image/fetch/$s_!T2eJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795c1db8-3e0a-41ef-a887-697a5dac848f_1139x538.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T2eJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795c1db8-3e0a-41ef-a887-697a5dac848f_1139x538.png" width="1139" height="538" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/795c1db8-3e0a-41ef-a887-697a5dac848f_1139x538.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:538,&quot;width&quot;:1139,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T2eJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795c1db8-3e0a-41ef-a887-697a5dac848f_1139x538.png 424w, https://substackcdn.com/image/fetch/$s_!T2eJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795c1db8-3e0a-41ef-a887-697a5dac848f_1139x538.png 848w, https://substackcdn.com/image/fetch/$s_!T2eJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795c1db8-3e0a-41ef-a887-697a5dac848f_1139x538.png 1272w, https://substackcdn.com/image/fetch/$s_!T2eJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F795c1db8-3e0a-41ef-a887-697a5dac848f_1139x538.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The L2 cache is 8-way set associative, and has two banks. Nvidia selected the 512 KB option, which is the largest size that supports 9 cycle latency. Going to 1 MB would bump latency to 10 cycles, which is still quite good. Because L2 misses incur high latency costs, Arm uses a more sophisticated &#8220;dynamic biased&#8221; cache replacement policy. L2 bandwidth can reach 32 bytes per cycle, but curiously not with read accesses. With either a read-modify-write or write-only access pattern, A725 can sustain 32 bytes per cycle out to L3 sizes.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FZdC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a0da2be-80b7-4461-b643-28690c312ed9_636x499.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FZdC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a0da2be-80b7-4461-b643-28690c312ed9_636x499.png 424w, https://substackcdn.com/image/fetch/$s_!FZdC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a0da2be-80b7-4461-b643-28690c312ed9_636x499.png 848w, https://substackcdn.com/image/fetch/$s_!FZdC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a0da2be-80b7-4461-b643-28690c312ed9_636x499.png 1272w, https://substackcdn.com/image/fetch/$s_!FZdC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a0da2be-80b7-4461-b643-28690c312ed9_636x499.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FZdC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a0da2be-80b7-4461-b643-28690c312ed9_636x499.png" width="636" height="499" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a0da2be-80b7-4461-b643-28690c312ed9_636x499.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:499,&quot;width&quot;:636,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FZdC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a0da2be-80b7-4461-b643-28690c312ed9_636x499.png 424w, https://substackcdn.com/image/fetch/$s_!FZdC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a0da2be-80b7-4461-b643-28690c312ed9_636x499.png 848w, https://substackcdn.com/image/fetch/$s_!FZdC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a0da2be-80b7-4461-b643-28690c312ed9_636x499.png 1272w, https://substackcdn.com/image/fetch/$s_!FZdC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a0da2be-80b7-4461-b643-28690c312ed9_636x499.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Arm&#8217;s optimization guide details A725&#8217;s cache performance characteristics. Latencies beyond L2 rely on assumptions that don&#8217;t match GB10&#8217;s design, but the figures do emphasize the expensive nature of core to core transfers. Reaching out to another core incurs a significant penalty at the DSU compared to getting data from L3, and takes several times longer than hitting in core-private caches. Another interesting detail is that hitting in a peer core&#8217;s L2 gives better latency than hitting in its L1. A725&#8217;s L2 is strictly inclusive of L1D contents, which means L2 tags could be used as a snoop filter for the L1D. Arm&#8217;s figures suggest the core first checks L2 when it receives a snoop, and only probes L1D after a L2 hit.</p><h1>Performance: SPEC CPU2017</h1><p>A725&#8217;s SPEC CPU2017 performance brackets Intel&#8217;s Crestmont as tested in their Meteor Lake mobile design. Neoverse N2 scores slightly higher. Core performance is heavily dependent on how it&#8217;s implemented, and low clock speeds in GB10&#8217;s implementation mean its A725 cores struggle to climb past older density optimized designs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zp_t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5898bebc-eaec-4ff0-bb92-e357079f0439_1200x614.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zp_t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5898bebc-eaec-4ff0-bb92-e357079f0439_1200x614.png 424w, https://substackcdn.com/image/fetch/$s_!zp_t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5898bebc-eaec-4ff0-bb92-e357079f0439_1200x614.png 848w, https://substackcdn.com/image/fetch/$s_!zp_t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5898bebc-eaec-4ff0-bb92-e357079f0439_1200x614.png 1272w, https://substackcdn.com/image/fetch/$s_!zp_t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5898bebc-eaec-4ff0-bb92-e357079f0439_1200x614.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zp_t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5898bebc-eaec-4ff0-bb92-e357079f0439_1200x614.png" width="1200" height="614" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5898bebc-eaec-4ff0-bb92-e357079f0439_1200x614.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:614,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zp_t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5898bebc-eaec-4ff0-bb92-e357079f0439_1200x614.png 424w, https://substackcdn.com/image/fetch/$s_!zp_t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5898bebc-eaec-4ff0-bb92-e357079f0439_1200x614.png 848w, https://substackcdn.com/image/fetch/$s_!zp_t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5898bebc-eaec-4ff0-bb92-e357079f0439_1200x614.png 1272w, https://substackcdn.com/image/fetch/$s_!zp_t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5898bebc-eaec-4ff0-bb92-e357079f0439_1200x614.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!C2Qe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81de3799-b582-4b9b-99ae-ef883838bc1f_1437x1109.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C2Qe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81de3799-b582-4b9b-99ae-ef883838bc1f_1437x1109.png 424w, https://substackcdn.com/image/fetch/$s_!C2Qe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81de3799-b582-4b9b-99ae-ef883838bc1f_1437x1109.png 848w, https://substackcdn.com/image/fetch/$s_!C2Qe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81de3799-b582-4b9b-99ae-ef883838bc1f_1437x1109.png 1272w, https://substackcdn.com/image/fetch/$s_!C2Qe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81de3799-b582-4b9b-99ae-ef883838bc1f_1437x1109.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C2Qe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81de3799-b582-4b9b-99ae-ef883838bc1f_1437x1109.png" width="1437" height="1109" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/81de3799-b582-4b9b-99ae-ef883838bc1f_1437x1109.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1109,&quot;width&quot;:1437,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!C2Qe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81de3799-b582-4b9b-99ae-ef883838bc1f_1437x1109.png 424w, https://substackcdn.com/image/fetch/$s_!C2Qe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81de3799-b582-4b9b-99ae-ef883838bc1f_1437x1109.png 848w, https://substackcdn.com/image/fetch/$s_!C2Qe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81de3799-b582-4b9b-99ae-ef883838bc1f_1437x1109.png 1272w, https://substackcdn.com/image/fetch/$s_!C2Qe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81de3799-b582-4b9b-99ae-ef883838bc1f_1437x1109.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!te5i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabc27401-bd68-4f30-8604-52f07bf8ba35_1428x1434.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!te5i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabc27401-bd68-4f30-8604-52f07bf8ba35_1428x1434.png 424w, https://substackcdn.com/image/fetch/$s_!te5i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabc27401-bd68-4f30-8604-52f07bf8ba35_1428x1434.png 848w, https://substackcdn.com/image/fetch/$s_!te5i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabc27401-bd68-4f30-8604-52f07bf8ba35_1428x1434.png 1272w, https://substackcdn.com/image/fetch/$s_!te5i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabc27401-bd68-4f30-8604-52f07bf8ba35_1428x1434.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!te5i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabc27401-bd68-4f30-8604-52f07bf8ba35_1428x1434.png" width="1428" height="1434" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/abc27401-bd68-4f30-8604-52f07bf8ba35_1428x1434.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1434,&quot;width&quot;:1428,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!te5i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabc27401-bd68-4f30-8604-52f07bf8ba35_1428x1434.png 424w, https://substackcdn.com/image/fetch/$s_!te5i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabc27401-bd68-4f30-8604-52f07bf8ba35_1428x1434.png 848w, https://substackcdn.com/image/fetch/$s_!te5i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabc27401-bd68-4f30-8604-52f07bf8ba35_1428x1434.png 1272w, https://substackcdn.com/image/fetch/$s_!te5i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabc27401-bd68-4f30-8604-52f07bf8ba35_1428x1434.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Clock speed tends to heavily influence core-bound bound workloads. 548.exchange2 is one example. Excellent core-private cache hitrates mean IPC (instructions per cycle) doesn&#8217;t change with clock speed. A725 manages a 10.9% IPC increase over Neoverse N2, showing that Arm has improved their core architecture. However, Neoverse N2&#8217;s 3.4 GHz clock speed translates to a 17% performance advantage over A725. Crestmont presents a more difficult comparison because of different executed instruction counts. 548.exchange2 executed 1.93 trillion instructions on the x86-64 cores, but required 2.12 trillion on aarch64 cores. A725 still achieves higher clock normalized performance, but Cresmont&#8217;s 3.8 GHz clock speed puts it 14.5% ahead. Other high IPC, core bound workloads show a similar trend. Crestmont chokes on something in 538.imagick though.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BJVC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fe08a46-43f7-4bc0-b40a-b8e7133e9438_1248x684.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BJVC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fe08a46-43f7-4bc0-b40a-b8e7133e9438_1248x684.png 424w, https://substackcdn.com/image/fetch/$s_!BJVC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fe08a46-43f7-4bc0-b40a-b8e7133e9438_1248x684.png 848w, https://substackcdn.com/image/fetch/$s_!BJVC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fe08a46-43f7-4bc0-b40a-b8e7133e9438_1248x684.png 1272w, https://substackcdn.com/image/fetch/$s_!BJVC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fe08a46-43f7-4bc0-b40a-b8e7133e9438_1248x684.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BJVC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fe08a46-43f7-4bc0-b40a-b8e7133e9438_1248x684.png" width="1248" height="684" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7fe08a46-43f7-4bc0-b40a-b8e7133e9438_1248x684.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:684,&quot;width&quot;:1248,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BJVC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fe08a46-43f7-4bc0-b40a-b8e7133e9438_1248x684.png 424w, https://substackcdn.com/image/fetch/$s_!BJVC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fe08a46-43f7-4bc0-b40a-b8e7133e9438_1248x684.png 848w, https://substackcdn.com/image/fetch/$s_!BJVC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fe08a46-43f7-4bc0-b40a-b8e7133e9438_1248x684.png 1272w, https://substackcdn.com/image/fetch/$s_!BJVC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fe08a46-43f7-4bc0-b40a-b8e7133e9438_1248x684.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>On the other hand, clock speed means little in workloads that suffer a lot of last level cache misses. DRAM doesn&#8217;t keep pace with clock speed increases, meaning that IPC gets lower as clock speed gets higher. A725 remains very competitive in those workloads despite its lower clock speed. In 520.omnetpp, it matches the Neoverse N2 and takes a 20% lead over Crestmont. A725 even gets close to Skymont&#8217;s performance in 549.fotonik3d.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EnxT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781a79c-725b-499c-b946-2e4ca92226af_1249x758.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EnxT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781a79c-725b-499c-b946-2e4ca92226af_1249x758.png 424w, https://substackcdn.com/image/fetch/$s_!EnxT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781a79c-725b-499c-b946-2e4ca92226af_1249x758.png 848w, https://substackcdn.com/image/fetch/$s_!EnxT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781a79c-725b-499c-b946-2e4ca92226af_1249x758.png 1272w, https://substackcdn.com/image/fetch/$s_!EnxT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781a79c-725b-499c-b946-2e4ca92226af_1249x758.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EnxT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781a79c-725b-499c-b946-2e4ca92226af_1249x758.png" width="1249" height="758" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d781a79c-725b-499c-b946-2e4ca92226af_1249x758.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:758,&quot;width&quot;:1249,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EnxT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781a79c-725b-499c-b946-2e4ca92226af_1249x758.png 424w, https://substackcdn.com/image/fetch/$s_!EnxT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781a79c-725b-499c-b946-2e4ca92226af_1249x758.png 848w, https://substackcdn.com/image/fetch/$s_!EnxT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781a79c-725b-499c-b946-2e4ca92226af_1249x758.png 1272w, https://substackcdn.com/image/fetch/$s_!EnxT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd781a79c-725b-499c-b946-2e4ca92226af_1249x758.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Final Words</h1><p>A725 only gains entries in the most important out-of-order structures compared to A710. Elsewhere, Arm preferred to rebalance resources, or even cut back on them. It&#8217;s impressive to see Arm trim the fat on a core that didn&#8217;t have a lot of fat to begin with, and it&#8217;s hard to argue with the results. A725&#8217;s architecture looks like an overall upgrade over A710 and Neoverse N2. I suspect A725 would come out top if both ran at the same clock speeds and were supported by identical memory subsystems. In GB10 though, A725&#8217;s cores don&#8217;t clock high enough to beat older designs.</p><p>Arm&#8217;s core design and Nvidia&#8217;s implementation choices contrast with AMD and Intel&#8217;s density optimized strategies. Intel&#8217;s Skymont is almost a P-Core, and only makes concessions in the most power and area hungry features like vector execution. AMD takes their high performance architecture and targets lower clock speeds, taking whatever density gains they can get in the process. Now that Arm&#8217;s cores are moving into higher performance devices like Nvidia&#8217;s GB10, it&#8217;ll be interesting to see how their density-optimized strategy plays out against Intel and AMD&#8217;s.</p><p>Again, a massive thank you to <a href="https://www.dell.com/">Dell</a> for sending over the two <a href="https://www.dell.com/en-us/blog/dell-pro-max-with-gb10-purpose-built-for-ai-developers/">Pro Max with GB10s</a> for testing. </p><p>If you like the content then consider heading over to the <a href="https://www.patreon.com/ChipsandCheese">Patreon</a> or <a href="https://www.paypal.com/donate/?hosted_button_id=4EMPH66SBGVSQ">PayPal</a> if you want to toss a few bucks to Chips and Cheese, also consider joining the <a href="https://discord.gg/TwVnRhxgY2">Discord</a>.</p>]]></content:encoded></item><item><title><![CDATA[CES 2026: Taking the Lids off AMD's Venice and MI400 SoCs]]></title><description><![CDATA[Here at CES 2026, AMD showed off their upcoming Venice series of server CPUs and their upcoming MI400 series of datacenter accelerators.]]></description><link>https://chipsandcheese.com/p/ces-2026-taking-the-lids-off-amds</link><guid isPermaLink="false">https://chipsandcheese.com/p/ces-2026-taking-the-lids-off-amds</guid><dc:creator><![CDATA[George Cozma]]></dc:creator><pubDate>Tue, 06 Jan 2026 21:32:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!qd_2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e23adc-da36-42af-b245-11459f567dcc_1674x960.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello you fine Internet folks,</p><p>Here at CES 2026, AMD showed off their upcoming Venice series of server CPUs and their upcoming MI400 series of datacenter accelerators. AMD has talked about the specifications of both Venice and the MI400 series at their Advancing AI event back in June of 2025, but this is the first time AMD has shown off the silicon for both of product lines.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qd_2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e23adc-da36-42af-b245-11459f567dcc_1674x960.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qd_2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e23adc-da36-42af-b245-11459f567dcc_1674x960.png 424w, https://substackcdn.com/image/fetch/$s_!qd_2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e23adc-da36-42af-b245-11459f567dcc_1674x960.png 848w, https://substackcdn.com/image/fetch/$s_!qd_2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e23adc-da36-42af-b245-11459f567dcc_1674x960.png 1272w, https://substackcdn.com/image/fetch/$s_!qd_2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e23adc-da36-42af-b245-11459f567dcc_1674x960.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qd_2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e23adc-da36-42af-b245-11459f567dcc_1674x960.png" width="1456" height="835" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4e23adc-da36-42af-b245-11459f567dcc_1674x960.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:835,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:766677,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/183673318?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e23adc-da36-42af-b245-11459f567dcc_1674x960.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qd_2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e23adc-da36-42af-b245-11459f567dcc_1674x960.png 424w, https://substackcdn.com/image/fetch/$s_!qd_2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e23adc-da36-42af-b245-11459f567dcc_1674x960.png 848w, https://substackcdn.com/image/fetch/$s_!qd_2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e23adc-da36-42af-b245-11459f567dcc_1674x960.png 1272w, https://substackcdn.com/image/fetch/$s_!qd_2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e23adc-da36-42af-b245-11459f567dcc_1674x960.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Starting with Venice, the first thing to notice is the packaging of the CCDs to the IO dies is different. Instead of using the organic substrate of the package to run the wires between the CCDs and the IO dies that AMD has used since EPYC Rome, Venice appears to be using a more advanced form of packaging similar to Strix Halo or MI250X. Another change is that Venice appears to have two IO dies instead of the single IO die that the prior EPYC CPUs had.</p><p>Venice has 8 CCDs each of which have 32 cores for a total of up to 256 cores per Venice package. Doing some measuring of each of the dies, you get that each CCD is approximately 165mm<sup>2 </sup>of N2 silicon. If AMD has stuck to 4MB of L3 per core than each of these CCDs have 32 Zen 6 cores and 128MB of L3 cache along with the die to die interface for the CCD &lt;-&gt; IO die communications. At approximately 165mm<sup>2</sup> per CCD, that would make a Zen 6 core plus the 4MB of L3 per core about 5mm<sup>2</sup> each which is similar to Zen 5&#8217;s approximately 5.34mm<sup>2</sup> on N3 when counting both the Zen 5 core and 4MB of L3 cache.</p><p>Moving to the IO dies, they each appear to be approximately 353mm<sup>2</sup> for a total of just over 700mm<sup>2</sup> of silicon dedicated for the IO dies. This is a massive increase from the approximately 400mm<sup>2</sup> that the prior EPYC CPUs dedicated for their IO dies. The two IO dies appear to be using an advanced packaging of some kind similar to the CCDs. Next to the IO dies appear to be 8 little dies, 4 on each side of the package, which are likely to either be structural silicon or deep trench capacitor dies meant to improve power delivery to the CCDs and IO dies.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ivUB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45dd5538-8896-4235-a618-546e3acdfa11_704x690.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ivUB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45dd5538-8896-4235-a618-546e3acdfa11_704x690.png 424w, https://substackcdn.com/image/fetch/$s_!ivUB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45dd5538-8896-4235-a618-546e3acdfa11_704x690.png 848w, https://substackcdn.com/image/fetch/$s_!ivUB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45dd5538-8896-4235-a618-546e3acdfa11_704x690.png 1272w, https://substackcdn.com/image/fetch/$s_!ivUB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45dd5538-8896-4235-a618-546e3acdfa11_704x690.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ivUB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45dd5538-8896-4235-a618-546e3acdfa11_704x690.png" width="704" height="690" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/45dd5538-8896-4235-a618-546e3acdfa11_704x690.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:690,&quot;width&quot;:704,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:382099,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/183673318?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45dd5538-8896-4235-a618-546e3acdfa11_704x690.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ivUB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45dd5538-8896-4235-a618-546e3acdfa11_704x690.png 424w, https://substackcdn.com/image/fetch/$s_!ivUB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45dd5538-8896-4235-a618-546e3acdfa11_704x690.png 848w, https://substackcdn.com/image/fetch/$s_!ivUB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45dd5538-8896-4235-a618-546e3acdfa11_704x690.png 1272w, https://substackcdn.com/image/fetch/$s_!ivUB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45dd5538-8896-4235-a618-546e3acdfa11_704x690.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>Shifting off of Venice and on to the MI400 accelerator, this is a massive package with 12 HBM4 dies and &#8220;twelve 2 nanometer and 3 nanometer compute and IO dies&#8221;. It appears as if there are two base dies just like MI350. But unlike MI350, there appears to also be two extra dies on the top and bottom of the base dies. These two extra dies are likely for off-package IO such as PCIe, UALink, etc.</p><p>Calculating the die sizes of the base dies and the IO dies, the die size of the base die is approximately 747mm<sup>2</sup> for each of the two base dies with the off-package IO dies each being approximately 220mm<sup>2</sup>.  As for the compute dies, while the packaging precludes any visual demarcation of the different compute dies, it is likely that there are 8 compute dies with 4 compute dies on each base die. So while we can&#8217;t figure out the exact die size of the compute dies, the maximum size is approximately 180mm<sup>2</sup>. The compute chiplet is likely in the 140mm<sup>2</sup> to 160mm<sup>2</sup> region but that is a best guess that will have to wait to be confirmed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w4k7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a409225-b74c-403c-a268-4d42ef2d8c5a_3984x5975.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w4k7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a409225-b74c-403c-a268-4d42ef2d8c5a_3984x5975.jpeg 424w, https://substackcdn.com/image/fetch/$s_!w4k7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a409225-b74c-403c-a268-4d42ef2d8c5a_3984x5975.jpeg 848w, https://substackcdn.com/image/fetch/$s_!w4k7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a409225-b74c-403c-a268-4d42ef2d8c5a_3984x5975.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!w4k7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a409225-b74c-403c-a268-4d42ef2d8c5a_3984x5975.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w4k7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a409225-b74c-403c-a268-4d42ef2d8c5a_3984x5975.jpeg" width="1456" height="2184" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7a409225-b74c-403c-a268-4d42ef2d8c5a_3984x5975.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2184,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4569856,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/183673318?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a409225-b74c-403c-a268-4d42ef2d8c5a_3984x5975.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w4k7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a409225-b74c-403c-a268-4d42ef2d8c5a_3984x5975.jpeg 424w, https://substackcdn.com/image/fetch/$s_!w4k7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a409225-b74c-403c-a268-4d42ef2d8c5a_3984x5975.jpeg 848w, https://substackcdn.com/image/fetch/$s_!w4k7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a409225-b74c-403c-a268-4d42ef2d8c5a_3984x5975.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!w4k7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a409225-b74c-403c-a268-4d42ef2d8c5a_3984x5975.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The MI455X and Venice are the two SoCs that are going to be powering AMD&#8217;s Helios AI Rack but they aren&#8217;t the only new Zen 6 and MI400 series products that AMD announced at CES. AMD announced that there would be a third member of the MI400 family called the MI440X joining the MI430X and MI455X. The MI440X is designed to fit into the 8-way UBB boxes as a direct replacement for the MI300/350 series. </p><p>AMD also announced Venice-X which is likely is going to be a V-Cache version of Venice. This is interesting because not only did AMD skip Turin-X but if there is a 256 core version of Venice-X, then this would be the first time that a high core count CCD will have the ability to support a V-Cache die. If AMD sticks to the same ratio of base die cache to V-Cache die cache, then each 32 core CCD would have up to 384MB of L3 cache which equates to 3 Gigabytes of L3 cache across the chip.</p><p>Both Venice and the MI400 series are due to launch later this year and I can&#8217;t wait to learn more about the underlying architectures of both SoCs.</p><p>If you like the content then consider heading over to the <a href="https://www.patreon.com/ChipsandCheese">Patreon</a> or <a href="https://www.paypal.com/donate/?hosted_button_id=4EMPH66SBGVSQ">PayPal</a> if you want to toss a few bucks to Chips and Cheese, also consider joining the <a href="https://discord.gg/TwVnRhxgY2">Discord</a>.</p>]]></content:encoded></item><item><title><![CDATA[Diving into Qualcomm's Upcoming Adreno X2 GPU with Eric Demers]]></title><description><![CDATA[Hello you fine Internet folks,]]></description><link>https://chipsandcheese.com/p/diving-into-qualcomms-upcoming-adreno</link><guid isPermaLink="false">https://chipsandcheese.com/p/diving-into-qualcomms-upcoming-adreno</guid><dc:creator><![CDATA[George Cozma]]></dc:creator><pubDate>Sun, 04 Jan 2026 21:31:26 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/aI0ahNRHVig" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello you fine Internet folks, </p><p>Today we are talking about Qualcomm's upcoming X2 GPU architecture with Eric Demers, Qualcomm's GPU Team lead. We talk about the changes from the prior X1 generation of GPUs along with why some of the changes were made.</p><p>Hope y'all enjoy!</p><div id="youtube2-aI0ahNRHVig" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;aI0ahNRHVig&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/aI0ahNRHVig?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>The transcript below has been edited for readability and conciseness.  <br><br>George: Hello you fine Internet Folks. We&#8217;re here in San Diego at Qualcomm headquarters at their first architecture day and I have with me Eric Demers. What do you do?</p><p>Eric: I am an employee here at Qualcomm, I lead our GPU team, that&#8217;s most of the hardware and some of the software. It&#8217;s all in one and it&#8217;s a great team we&#8217;ve put together and I have been here for fourteen years.</p><p>George: Quick 60 second boildown of what is new in the X2 GPU versus the X1 GPU.</p><p>Eric: Well, it&#8217;s the next generation for us, meaning that we looked at everything we were building on the X1 and we said &#8220;we can do one better&#8221;, and fundamentally improved the performance. And I shared with you that it is a noticeable improvement in performance but not necessarily the same amount of increase in power. So it is a very power-efficient core, so that you effectively are getting much more performance at a slightly higher power cost. As opposed to doubling the performance and doubling the power cost, so that&#8217;s one big improvement, second we wanted to be a full-featured DirectX 12.2 Ultimate part. So all the features of DirectX 12.2 which most GPUs on Windows already support, so we&#8217;re part of that gang.</p><p>George: And in terms of API, what will you support for the X2 GPU?</p><p>Eric: Obviously we&#8217;ll have DirectX 12.2 and all the DirectX versions behind that, so we&#8217;ll be fully compatible there. But we also plan to introduce native Vulkan 1.4 support. There&#8217;s a version of that which Windows supplies, but we&#8217;ll be supplying a native version that is the same codebase as we use for our other products. We&#8217;ll also be introducing native OpenCL 3.0 support, also as used by our other products. And then in the first quarter of 2026 we&#8217;d like to introduce SYCL support, and SYCL is a higher-end compute-focused API and shading language for a GPU. It&#8217;s an open standard, other companies support it, and it helps us attack some of the GPGPU use-cases that exist on Windows for Snapdragon.</p><p>George: Awesome, going into the architecture a little bit. With the X2 GPU, you now have what&#8217;s called HPM or High-Performance Memory? What exactly is that for and how is it set up in hardware?</p><p>Eric: So our GPUs have always had memory attached to them. We were a tiling architecture to start with on mobile which minimises memory bandwidth requirements. And how it does it is by keeping as much on chip as possible before you send it to DRAM. Well, we&#8217;ve taken that to the next level and said &#8220;let&#8217;s make it really big&#8221; and instead of tiling the screen just render it normally but render it to an on-chip SRAM that is big enough to actually hold that whole surface or most of that surface. So particularly for the X2 Extreme Edition we can do a QHD+ resolution or 1600p resolution, 2K resolutions, all natively on die and all that rendering, so the color ROPs, all the Z-buffer, all of that is done at full speed on die and doesn&#8217;t use any DRAM bandwidth. Frees that up for reading on the GPU, CPU, or NPU. It gives you a performance boost, but saving all that bandwidth gives you a power boost as well. So it&#8217;s a performance per Watt improvement for us. It isn&#8217;t just for rendering and is a general purpose large memory attached to your GPU, you can use it for compute. You could render to it and then do post-processing with it still in the HPM using your shaders. There&#8217;s all types of flexible use-cases that we&#8217;ve come up with or that we will come up with over time.</p><p>George: Awesome, going into the HPM, there&#8217;s 21 megabytes of HPM on an X2-90 GPU. How is that split up? Is it 5.25 MB per slice and that slice can only access that 5.25 MB or is it shared across the whole die?</p><p>Eric: So physically it is implemented per slice. So we have 5.25 MB per slice. But no, there&#8217;s a full crossbar because if you think about it as a frame buffer there&#8217;s abilities to go fetch out of anywhere. You have random access to the whole surface from any of the slices. We have a full crossbar at full bandwidth that allows the HPM to be used by any of the slices, even though it is physically implemented inside the slice.</p><p>George: And how is the HPM set up? Is it a cache, or is it a scratchpad?</p><p>Eric: So the answer to that is yes, it is under software control. You can allocate parts of it to be a cache. Up to 3 MB of color and Z-cache. Roughly 1/3rd to 2/3rds and that&#8217;s typically if you&#8217;re rendering to DRAM then you&#8217;ll use it as a cache. The cache there is for the granularity fetches so you&#8217;ll prefetch data and bring them on chip and decompress the data and pull it into the HPM and use it locally as much as you can then cache it back out when you evict that cache line. The rest of the HPM will then be reserved for software use. It could be used to store nothing, it could be used to store render targets, you could have multiple render targets, it could store just Z-buffers, just color buffers. It could store textures from a previous render that you&#8217;re going to use as part of the rendering. It&#8217;s really a general scratchpad in that sense that you can use as you see fit, from both an application and a driver standpoint.</p><p>George: What was the consideration to allow only three megabytes of the SRAM to be used as cache and leave the other 2.25 MB as a sort of scratchpad SRAM? Why not allow all of it to be cache?</p><p>Eric: Yeah, that&#8217;s actually a question that we asked ourselves internally. First of all, making everything cache with the granularity we have has a cost. Your tag memory, which is the memory that tells you what&#8217;s going on in your cache grows with the size of your cache. So there&#8217;s a cost for that so it&#8217;s not like it&#8217;s free and we could just throw it in. There&#8217;s actually a substantial cost because it also affects your timing because you have to do a tag lookup and the bigger your tag is the more complex the design has to be. So there&#8217;s a desired limit at which beyond it&#8217;s much more work and more cost to put in. But we looked at it and what we found is that if you hold the whole frame buffer in HPM, you get 100% efficiency. But as soon as you drop below holding all of it, your efficiency flattens out and your cache hit rate doesn&#8217;t get much worse unless the cache gets really small and it doesn&#8217;t get much better until you get to the full side. So there&#8217;s a plateau and what we found is that right at the bottom of that plateau is roughly the size that we planned. So it gets really good cache hit-rates and if we had made it twice as big it wouldn&#8217;t have been much better but it would have been costlier so we did the right trade-off on this particular design. It may be something we revisit in the future, particularly if the use-cases for HPM change but for now it&#8217;s the right balance for us.</p><p>George: And sort of digging even deeper into the slice architecture. Previously, a micro-SP which is part of the slice. There&#8217;s two micro-SPs per SP and then there&#8217;s two SPs per slice.</p><p>Eric: Yeah.</p><p>George: The micro-SP would issue wave128 instructions and now you have moved to wave64 but you still have 128 ALUs in a micro-SP. How are you dealing with only issuing 64 instructions per wave to 128 ALUs?</p><p>Eric: I am not sure this is very different from others, but we dual-issue. So we will dual-issue two waves at a time to keep all those ALUs busy. So in a way it&#8217;s just meaning that we&#8217;re able to deal with wave64 but we&#8217;re also more efficient because they are wave64. If you do a branch, the smaller the wave the less granularity loss you&#8217;ll have, the less of the potential branches you have to take. Wave64 is generally a more efficient use of resources than one large wave and in fact two waves is more efficient than one wave and so for us keeping more waves in flight is simply an efficiency improvement even if it doesn&#8217;t affect your peak throughput for example. But it comes with overhead, you have to have more context, more information, more meta information on the side to have two waves in flight. One of the consequences is that the GPRs our general purpose registers where the data for the waves is stored. We had to grow it roughly 30% from 96k to 128k. We grew that in part to have more waves, both to deal with the dual-issue but just having more waves is generally more efficient.</p><p>George: So how often are you seeing your GPU being able to dual-issue? How often are you seeing two waves go down at the same time?</p><p>Eric: All the time, it almost always operates that way. I guess there could be cases where there are bubbles in the shader execution where they are both waiting and in which case neither will run. You could have one that gets issued and is ready earlier, but generally we have so many waves in flight that there&#8217;s always space for two to run.</p><p>George: Okay.</p><p>Eric: It would really be the case if you had a lot of GPRs used by one wave and that we do not have enough, and normally we have enough for two of those but you might be throttling more often in those cases because you just don&#8217;t have enough waves to cover all the latency of the memory fetches. But that&#8217;s a fairly corner case of very very complex shaders that is not typical.</p><p>George: So the dual-issue mechanism doesn&#8217;t have very many restrictions on it?</p><p>Eric: No, no restrictions.</p><p>George: Cool, thank you so much for this interview. And my final question is, what is your favourite type of cheese?</p><p>Eric: That&#8217;s a good question. I love Brie, I will always love Brie &#8230; and Mozzarella curds. So, I grew up in Canada and poutine was a big thing and I freaking loved Mozzarella curds and it&#8217;s hard to find them. The only place I found them is Whole Foods sometimes, but Amazon, surprisingly, I have to go order them so that we can make home-made poutine.</p><p>George: What&#8217;s funny is Ian, who&#8217;s behind the camera. He and I, he&#8217;s Potato, I&#8217;m Cheese.</p><p>Eric: There you go.</p><p>George: We run a show called the Tech Poutine, and the guest is the Gravy.</p><p>Eric: Ah, there you go, and the gravy is available. Either Swiss chalet, or I forget the other one, but they&#8217;re all good. And Five Guys in San Diego makes the fries that are closest to the typical poutine fries. And I actually have Canadian friends that have gone to Costco and gotten the cheese there and then gone to Five Guys, gotten the fries there and made the sauce just to eat that.</p><p>George: Thank you so much for this interview.</p><p>Eric: You&#8217;re welcome.</p><p>George: If you like content like this, hit like and subscribe. It does help with the channel. If you would like a transcript of this, that will be on <a href="http://chipsancheese.com">chipsancheese.com</a> as well as links to the Patreon and PayPal down below in the description. Thank you so much Eric.</p><p>Eric: You&#8217;re welcome, thank you.</p><p>George: And have a good one, folks.</p>]]></content:encoded></item><item><title><![CDATA[Inside Nvidia GB10’s Memory Subsystem, from the CPU Side]]></title><description><![CDATA[GB10 is a collaboration between Nvidia and Mediatek that brings Nvidia&#8217;s Blackwell architecture into an integrated GPU.]]></description><link>https://chipsandcheese.com/p/inside-nvidia-gb10s-memory-subsystem</link><guid isPermaLink="false">https://chipsandcheese.com/p/inside-nvidia-gb10s-memory-subsystem</guid><dc:creator><![CDATA[Chester Lam]]></dc:creator><pubDate>Wed, 31 Dec 2025 10:30:35 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/77aee0bd-072d-4fc3-ac3f-bade2834fefc_512x339.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>GB10 is a collaboration between Nvidia and Mediatek that brings Nvidia&#8217;s Blackwell architecture into an integrated GPU. GB10&#8217;s GPU has 48 Blackwell SMs, matching the RTX 5070 in core count. The CPU side has 10 Cortex X925 and 10 Cortex A725 cores and is therefore quite powerful. Feeding all of that compute power requires a beefy memory subsystem, and can lead to difficult tradeoffs. Analyzing GB10&#8217;s memory subsystem from the CPU side will be the focus of this article. To keep article length manageable, I&#8217;ll further focus on Nvidia and Mediatek&#8217;s memory subsystems and design decisions. Core architecture and GB10&#8217;s GPU will be an exercise for another time.</p><h1>Acknowledgments</h1><p>We&#8217;d like to thank Zach at ZeroOne Technology for allowing us SSH access to his DGX Spark unit for CPU testing.</p><h1>SOC Layout</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OGNa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbc6d88c-c362-4ea6-8a86-f677313342a1_1600x1016.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OGNa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbc6d88c-c362-4ea6-8a86-f677313342a1_1600x1016.png 424w, https://substackcdn.com/image/fetch/$s_!OGNa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbc6d88c-c362-4ea6-8a86-f677313342a1_1600x1016.png 848w, https://substackcdn.com/image/fetch/$s_!OGNa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbc6d88c-c362-4ea6-8a86-f677313342a1_1600x1016.png 1272w, https://substackcdn.com/image/fetch/$s_!OGNa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbc6d88c-c362-4ea6-8a86-f677313342a1_1600x1016.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OGNa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbc6d88c-c362-4ea6-8a86-f677313342a1_1600x1016.png" width="1456" height="925" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fbc6d88c-c362-4ea6-8a86-f677313342a1_1600x1016.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:925,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OGNa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbc6d88c-c362-4ea6-8a86-f677313342a1_1600x1016.png 424w, https://substackcdn.com/image/fetch/$s_!OGNa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbc6d88c-c362-4ea6-8a86-f677313342a1_1600x1016.png 848w, https://substackcdn.com/image/fetch/$s_!OGNa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbc6d88c-c362-4ea6-8a86-f677313342a1_1600x1016.png 1272w, https://substackcdn.com/image/fetch/$s_!OGNa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbc6d88c-c362-4ea6-8a86-f677313342a1_1600x1016.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>CPU cores on GB10 are split into two clusters. Each cluster has five A725 cores and five X925 cores. Core numbering starts with the A725 cores within each cluster, and the two clusters come after each other. All of the A725 cores run at 2.8 GHz. X925 cores clock up to 3.9 GHz on the first cluster, and up to 4 GHz on the second.</p><h1>Cache and Memory Access</h1><p>Arm&#8217;s A725 and X925 have configurable cache capacities. GB10 opts for 64 KB L1 instruction and data caches on both cores. All A725 cores get 512 KB L2 caches, and all X925 cores get 2 MB of L2. A725&#8217;s L2 is 8-way set associative and offers latency at just 9 cycles. In actual time, that comes out to 3.2 nanoseconds and is good considering the low 2.8 GHz clock speed. However, L3 latency is poor at over 21 ns, or &gt;60 cycles.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1SpH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc968adfe-bccb-4a3f-991c-91e9f07dae3e_1600x689.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1SpH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc968adfe-bccb-4a3f-991c-91e9f07dae3e_1600x689.png 424w, https://substackcdn.com/image/fetch/$s_!1SpH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc968adfe-bccb-4a3f-991c-91e9f07dae3e_1600x689.png 848w, https://substackcdn.com/image/fetch/$s_!1SpH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc968adfe-bccb-4a3f-991c-91e9f07dae3e_1600x689.png 1272w, https://substackcdn.com/image/fetch/$s_!1SpH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc968adfe-bccb-4a3f-991c-91e9f07dae3e_1600x689.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1SpH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc968adfe-bccb-4a3f-991c-91e9f07dae3e_1600x689.png" width="1456" height="627" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c968adfe-bccb-4a3f-991c-91e9f07dae3e_1600x689.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:627,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1SpH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc968adfe-bccb-4a3f-991c-91e9f07dae3e_1600x689.png 424w, https://substackcdn.com/image/fetch/$s_!1SpH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc968adfe-bccb-4a3f-991c-91e9f07dae3e_1600x689.png 848w, https://substackcdn.com/image/fetch/$s_!1SpH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc968adfe-bccb-4a3f-991c-91e9f07dae3e_1600x689.png 1272w, https://substackcdn.com/image/fetch/$s_!1SpH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc968adfe-bccb-4a3f-991c-91e9f07dae3e_1600x689.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Testing cores across both clusters indicates that the first CPU cluster has 8 MB of L3, while the second has 16 MB. I&#8217;ll refer to these as Cluster 0 and Cluster 1 respectively. Both clusters have the same L3 latency from an A725 core, despite the capacity difference. 512 KB isn&#8217;t a lot of L2 capacity when L3 latency is this high. Likely, selecting the 512 KB L2 option reduces core area and lets GB10 implement more cores. Doing so makes sense considering that A725 cores aren&#8217;t meant to individually deliver high single threaded performance. That task is best left to the X925 cores.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ekGi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce17a01b-c3ed-42cd-9337-abaa9545d161_1600x689.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ekGi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce17a01b-c3ed-42cd-9337-abaa9545d161_1600x689.png 424w, https://substackcdn.com/image/fetch/$s_!ekGi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce17a01b-c3ed-42cd-9337-abaa9545d161_1600x689.png 848w, https://substackcdn.com/image/fetch/$s_!ekGi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce17a01b-c3ed-42cd-9337-abaa9545d161_1600x689.png 1272w, https://substackcdn.com/image/fetch/$s_!ekGi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce17a01b-c3ed-42cd-9337-abaa9545d161_1600x689.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ekGi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce17a01b-c3ed-42cd-9337-abaa9545d161_1600x689.png" width="1456" height="627" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ce17a01b-c3ed-42cd-9337-abaa9545d161_1600x689.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:627,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ekGi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce17a01b-c3ed-42cd-9337-abaa9545d161_1600x689.png 424w, https://substackcdn.com/image/fetch/$s_!ekGi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce17a01b-c3ed-42cd-9337-abaa9545d161_1600x689.png 848w, https://substackcdn.com/image/fetch/$s_!ekGi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce17a01b-c3ed-42cd-9337-abaa9545d161_1600x689.png 1272w, https://substackcdn.com/image/fetch/$s_!ekGi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce17a01b-c3ed-42cd-9337-abaa9545d161_1600x689.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>GB10&#8217;s X925 cores have 2 MB, 8-way set associative L2 caches with 12 cycle latency. L3 latency is surprisingly much better at ~56 cycles or ~14 ns, even though the A725 and X925 cores share the same L3. While it&#8217;s not a spectacular L3 latency result, it&#8217;s at least similar to Intel&#8217;s Arrow Lake L3 in nanosecond terms. Combined with the larger L2, that gives GB10&#8217;s X925 cores a cache setup that&#8217;s better balanced to deliver high performance.</p><p>A 16 MB system level cache (SLC) sits after L3. It&#8217;s hard to see from latency plots due to its small capacity relative to the CPU L3 caches. Latency data from cluster 0 suggests SLC latency is around 42 or 47 ns, depending on whether it&#8217;s accessed from a X925 or A725 core respectively. System level caches aren&#8217;t tightly coupled to any compute block, which typically means lower performance in exchange for being able to service many blocks across the chip. Nvidia states that the system level cache &#8220;enables power-efficient data sharing between engines&#8221; in addition to serving as the CPU&#8217;s L4 cache. Allowing data exchange between the CPU and GPU without a round trip to DRAM may well be the SLC&#8217;s most important function.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ds8p!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2febfce0-d7a7-409c-9755-8d1d94df1bcb_647x174.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ds8p!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2febfce0-d7a7-409c-9755-8d1d94df1bcb_647x174.png 424w, https://substackcdn.com/image/fetch/$s_!ds8p!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2febfce0-d7a7-409c-9755-8d1d94df1bcb_647x174.png 848w, https://substackcdn.com/image/fetch/$s_!ds8p!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2febfce0-d7a7-409c-9755-8d1d94df1bcb_647x174.png 1272w, https://substackcdn.com/image/fetch/$s_!ds8p!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2febfce0-d7a7-409c-9755-8d1d94df1bcb_647x174.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ds8p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2febfce0-d7a7-409c-9755-8d1d94df1bcb_647x174.png" width="647" height="174" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2febfce0-d7a7-409c-9755-8d1d94df1bcb_647x174.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:174,&quot;width&quot;:647,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ds8p!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2febfce0-d7a7-409c-9755-8d1d94df1bcb_647x174.png 424w, https://substackcdn.com/image/fetch/$s_!ds8p!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2febfce0-d7a7-409c-9755-8d1d94df1bcb_647x174.png 848w, https://substackcdn.com/image/fetch/$s_!ds8p!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2febfce0-d7a7-409c-9755-8d1d94df1bcb_647x174.png 1272w, https://substackcdn.com/image/fetch/$s_!ds8p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2febfce0-d7a7-409c-9755-8d1d94df1bcb_647x174.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>AMD&#8217;s Zen 5 in Strix Halo has smaller but faster core-private caches. GB10&#8217;s X925 and A725 cores have good cycle count latencies, but Zen 5 can clock so much higher that its caches end up being faster, if just barely so at L2. AMD&#8217;s L3 design continues to impress, delivering lower latency even though it has twice as much capacity.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!80cv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff599c79b-5efc-41ad-a95b-2a8eae942892_1600x689.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!80cv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff599c79b-5efc-41ad-a95b-2a8eae942892_1600x689.png 424w, https://substackcdn.com/image/fetch/$s_!80cv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff599c79b-5efc-41ad-a95b-2a8eae942892_1600x689.png 848w, https://substackcdn.com/image/fetch/$s_!80cv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff599c79b-5efc-41ad-a95b-2a8eae942892_1600x689.png 1272w, https://substackcdn.com/image/fetch/$s_!80cv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff599c79b-5efc-41ad-a95b-2a8eae942892_1600x689.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!80cv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff599c79b-5efc-41ad-a95b-2a8eae942892_1600x689.png" width="1456" height="627" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f599c79b-5efc-41ad-a95b-2a8eae942892_1600x689.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:627,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!80cv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff599c79b-5efc-41ad-a95b-2a8eae942892_1600x689.png 424w, https://substackcdn.com/image/fetch/$s_!80cv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff599c79b-5efc-41ad-a95b-2a8eae942892_1600x689.png 848w, https://substackcdn.com/image/fetch/$s_!80cv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff599c79b-5efc-41ad-a95b-2a8eae942892_1600x689.png 1272w, https://substackcdn.com/image/fetch/$s_!80cv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff599c79b-5efc-41ad-a95b-2a8eae942892_1600x689.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>DRAM latency is a bright spot for GB10. 113 ns might feel slow coming from a typical DDR5-equipped desktop, but it&#8217;s excellent for LPDDR5X. Strix Halo for comparison has over 140 ns of DRAM latency, as does Intel&#8217;s Meteor Lake. Faster LPDDR5X may play a role in GB10&#8217;s latency figures. Hot Chips slides say GB10 can run its memory bus at up to 9400 MT/s, and dmidecode reports 8533 MT/s. Placing CPU cores on the same die as the memory controllers may also contribute to lower latency.</p><h2>Bandwidth</h2><p>Core-private bandwidth figures are straightforward. A725 cores can read from L1 at 48 bytes per cycle, and appear to have a 32B/cycle datapath to L2. A single A725 core can read from L3 at ~55 GB/s. X925 has more impressive bandwidth. It can read 64B/cycle from L1D, likely has a 64B/cycle path to L2, and can sustain nearly 90 GB/s of read bandwidth from L3. Single core DRAM bandwidth is also higher from a X925 core, at 38 GB/s compared to 26 GB/s from an A725 core. </p><p>A single AMD Zen 5 or Zen 4 core can pull over 50 GB/s from DRAM, or well north of 100 GB/s from L3. It&#8217;s an interesting difference that suggests AMD lets a single core queue up more memory requests, but low threaded workloads rarely demand that much bandwidth and I suspect it doesn&#8217;t make a big difference.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!D4fc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d46f5a-ac08-4652-a152-34fb86aee471_1600x753.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!D4fc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d46f5a-ac08-4652-a152-34fb86aee471_1600x753.png 424w, https://substackcdn.com/image/fetch/$s_!D4fc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d46f5a-ac08-4652-a152-34fb86aee471_1600x753.png 848w, https://substackcdn.com/image/fetch/$s_!D4fc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d46f5a-ac08-4652-a152-34fb86aee471_1600x753.png 1272w, https://substackcdn.com/image/fetch/$s_!D4fc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d46f5a-ac08-4652-a152-34fb86aee471_1600x753.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!D4fc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d46f5a-ac08-4652-a152-34fb86aee471_1600x753.png" width="1456" height="685" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/81d46f5a-ac08-4652-a152-34fb86aee471_1600x753.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:685,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!D4fc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d46f5a-ac08-4652-a152-34fb86aee471_1600x753.png 424w, https://substackcdn.com/image/fetch/$s_!D4fc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d46f5a-ac08-4652-a152-34fb86aee471_1600x753.png 848w, https://substackcdn.com/image/fetch/$s_!D4fc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d46f5a-ac08-4652-a152-34fb86aee471_1600x753.png 1272w, https://substackcdn.com/image/fetch/$s_!D4fc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81d46f5a-ac08-4652-a152-34fb86aee471_1600x753.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Shared components in the memory hierarchy face more pressure in multithreaded workloads, because having more cores active tends to multiply bandwidth demands. Normally I test multithreaded bandwidth by having each thread traverse a separate array. That prevents access combining, because no two threads will request the same address. It also shows the sum of cache capacities, because each core can keep a different part of the test data footprint in its private caches. GB10 has 15 MB of L2 across cores in each cluster, but only 8 or 16 MB of L3. Any size that fits within L3 will have a substantial part contained within L2.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VQqK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f919f7e-06c6-446b-9610-d1058874adac_1600x606.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VQqK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f919f7e-06c6-446b-9610-d1058874adac_1600x606.png 424w, https://substackcdn.com/image/fetch/$s_!VQqK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f919f7e-06c6-446b-9610-d1058874adac_1600x606.png 848w, https://substackcdn.com/image/fetch/$s_!VQqK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f919f7e-06c6-446b-9610-d1058874adac_1600x606.png 1272w, https://substackcdn.com/image/fetch/$s_!VQqK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f919f7e-06c6-446b-9610-d1058874adac_1600x606.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VQqK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f919f7e-06c6-446b-9610-d1058874adac_1600x606.png" width="1456" height="551" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7f919f7e-06c6-446b-9610-d1058874adac_1600x606.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:551,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VQqK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f919f7e-06c6-446b-9610-d1058874adac_1600x606.png 424w, https://substackcdn.com/image/fetch/$s_!VQqK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f919f7e-06c6-446b-9610-d1058874adac_1600x606.png 848w, https://substackcdn.com/image/fetch/$s_!VQqK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f919f7e-06c6-446b-9610-d1058874adac_1600x606.png 1272w, https://substackcdn.com/image/fetch/$s_!VQqK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f919f7e-06c6-446b-9610-d1058874adac_1600x606.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mBWw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86091e62-c57b-4a1f-9d92-6ad6452216e1_1600x736.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mBWw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86091e62-c57b-4a1f-9d92-6ad6452216e1_1600x736.png 424w, https://substackcdn.com/image/fetch/$s_!mBWw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86091e62-c57b-4a1f-9d92-6ad6452216e1_1600x736.png 848w, https://substackcdn.com/image/fetch/$s_!mBWw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86091e62-c57b-4a1f-9d92-6ad6452216e1_1600x736.png 1272w, https://substackcdn.com/image/fetch/$s_!mBWw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86091e62-c57b-4a1f-9d92-6ad6452216e1_1600x736.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mBWw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86091e62-c57b-4a1f-9d92-6ad6452216e1_1600x736.png" width="1456" height="670" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86091e62-c57b-4a1f-9d92-6ad6452216e1_1600x736.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:670,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mBWw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86091e62-c57b-4a1f-9d92-6ad6452216e1_1600x736.png 424w, https://substackcdn.com/image/fetch/$s_!mBWw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86091e62-c57b-4a1f-9d92-6ad6452216e1_1600x736.png 848w, https://substackcdn.com/image/fetch/$s_!mBWw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86091e62-c57b-4a1f-9d92-6ad6452216e1_1600x736.png 1272w, https://substackcdn.com/image/fetch/$s_!mBWw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86091e62-c57b-4a1f-9d92-6ad6452216e1_1600x736.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Pointing all threads to the same array carries the risk of overestimating bandwidth if accesses get combined, but that seems to happen only after a shared cache. I initially validated bandwidth testing methodologies on Zen 2 and Skylake. There, shared array results generally aligned with L3 performance counter data. Using shared array results on GB10 provides another data point to evaluate L3 performance. Taken together with results using thread-private arrays, they suggest GB10 has much lower L3 bandwidth than AMD&#8217;s Strix Halo. However, it&#8217;s still respectable at north of 200 GB/s and likely to be adequate.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!f-OD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb75a6ac-5daa-4875-ae86-4f73fe4a1319_853x271.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!f-OD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb75a6ac-5daa-4875-ae86-4f73fe4a1319_853x271.png 424w, https://substackcdn.com/image/fetch/$s_!f-OD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb75a6ac-5daa-4875-ae86-4f73fe4a1319_853x271.png 848w, https://substackcdn.com/image/fetch/$s_!f-OD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb75a6ac-5daa-4875-ae86-4f73fe4a1319_853x271.png 1272w, https://substackcdn.com/image/fetch/$s_!f-OD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb75a6ac-5daa-4875-ae86-4f73fe4a1319_853x271.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!f-OD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb75a6ac-5daa-4875-ae86-4f73fe4a1319_853x271.png" width="853" height="271" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb75a6ac-5daa-4875-ae86-4f73fe4a1319_853x271.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:271,&quot;width&quot;:853,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!f-OD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb75a6ac-5daa-4875-ae86-4f73fe4a1319_853x271.png 424w, https://substackcdn.com/image/fetch/$s_!f-OD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb75a6ac-5daa-4875-ae86-4f73fe4a1319_853x271.png 848w, https://substackcdn.com/image/fetch/$s_!f-OD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb75a6ac-5daa-4875-ae86-4f73fe4a1319_853x271.png 1272w, https://substackcdn.com/image/fetch/$s_!f-OD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb75a6ac-5daa-4875-ae86-4f73fe4a1319_853x271.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>GB10&#8217;s two CPU clusters have asymmetrical external bandwidth, in addition to L3 capacity. Cluster 0 feels a bit like a Strix Halo CCX (Core Complex). Cluster 1 gives off AMD GMI-Wide vibes, with over 100 GB/s of read bandwidth. Switching to a 1:1 ratio of reads and writes dramatically increases measured bandwidth, suggesting the clusters have independent read and write paths with similar width. GB10&#8217;s CPU clusters are built using <a href="https://developer.arm.com/documentation/102547/0201/The-DynamIQ-Shared-Unit-120/DynamIQ-Shared-Unit-120-configuration-options?lang=en">Arm&#8217;s DynamIQ Shared Unit 120 </a>(DSU-120), which can be configured with up to four 256-bit CHI interfaces, so perhaps two clusters have different interface counts. </p><p>Much like Strix Halo, GB10&#8217;s CPU side enjoys more bandwidth than a typical client setup, but can&#8217;t fully utilize the 256-bit LPDDR5X bus. CPU workloads tend to be more latency sensitive and less bandwidth hungry. Memory subsystems in both large iGPU chips reflect that, and emphasize caching to improve CPU performance.</p><h3>A Heterogeneous Cluster Configuration?</h3><p>Observations above point to Cluster 1 being performance optimized, while Cluster 0 focuses on density. Cache is one of the biggest area consumers in a modern chip so cutting L3 capacity to 8 MB is almost certainly an area concession. Cluster 0 may also have a narrower external interface as running fewer wires outside the cluster could save area as well. But Nvidia and Mediatek stop short of fully specializing each cluster.</p><p>Both Cluster 0 and Cluster 1 have the same core configuration of five X925 cores and five A725 cores. The X925 cores focus on the highest performance whereas the A725 cores focus on density. As a result the A725 cores feel out of place on a high performance cluster, especially with 512 KB of L2 in front of a L3 with over 20 ns of latency.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!D1eb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc9cdcd-dd0b-4126-b6d7-089818ea8c34_1600x849.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!D1eb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc9cdcd-dd0b-4126-b6d7-089818ea8c34_1600x849.png 424w, https://substackcdn.com/image/fetch/$s_!D1eb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc9cdcd-dd0b-4126-b6d7-089818ea8c34_1600x849.png 848w, https://substackcdn.com/image/fetch/$s_!D1eb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc9cdcd-dd0b-4126-b6d7-089818ea8c34_1600x849.png 1272w, https://substackcdn.com/image/fetch/$s_!D1eb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc9cdcd-dd0b-4126-b6d7-089818ea8c34_1600x849.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!D1eb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc9cdcd-dd0b-4126-b6d7-089818ea8c34_1600x849.png" width="1456" height="773" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5fc9cdcd-dd0b-4126-b6d7-089818ea8c34_1600x849.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:773,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!D1eb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc9cdcd-dd0b-4126-b6d7-089818ea8c34_1600x849.png 424w, https://substackcdn.com/image/fetch/$s_!D1eb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc9cdcd-dd0b-4126-b6d7-089818ea8c34_1600x849.png 848w, https://substackcdn.com/image/fetch/$s_!D1eb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc9cdcd-dd0b-4126-b6d7-089818ea8c34_1600x849.png 1272w, https://substackcdn.com/image/fetch/$s_!D1eb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc9cdcd-dd0b-4126-b6d7-089818ea8c34_1600x849.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Both lscpu and microbenchmark results agree that only the second cluster has a 16 MB L3 cache</figcaption></figure></div><p>I wonder if going all-in on cluster specialization would be a better idea by concentrating the ten A725 cores into Cluster 0 for density and the ten X925 cores into Cluster 1 for high performance. Going from two heterogeneous clusters to two homogeneous clusters would simplify the OS scheduler as well. For example, the OS scheduler would also have an easier time containing workloads to a single cluster, letting the hardware clock or power down the second cluster. </p><h2>Latency under Bandwidth Load</h2><p>Latency and bandwidth can go hand in hand. High bandwidth demands cause requests to back up in various queues throughout the memory subsystem, pushing up average request latency. Ideally, a memory subsystem can provide high bandwidth while preventing bandwidth hungry threads from starving out latency sensitive ones. Here, I&#8217;m testing latency from a X925 core with various combinations of other cores generating bandwidth load from within the same cluster.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nBbg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8402694d-6c22-47c7-84aa-0fa5e5fdf59e_1181x576.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nBbg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8402694d-6c22-47c7-84aa-0fa5e5fdf59e_1181x576.png 424w, https://substackcdn.com/image/fetch/$s_!nBbg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8402694d-6c22-47c7-84aa-0fa5e5fdf59e_1181x576.png 848w, https://substackcdn.com/image/fetch/$s_!nBbg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8402694d-6c22-47c7-84aa-0fa5e5fdf59e_1181x576.png 1272w, https://substackcdn.com/image/fetch/$s_!nBbg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8402694d-6c22-47c7-84aa-0fa5e5fdf59e_1181x576.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nBbg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8402694d-6c22-47c7-84aa-0fa5e5fdf59e_1181x576.png" width="1181" height="576" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8402694d-6c22-47c7-84aa-0fa5e5fdf59e_1181x576.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:576,&quot;width&quot;:1181,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nBbg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8402694d-6c22-47c7-84aa-0fa5e5fdf59e_1181x576.png 424w, https://substackcdn.com/image/fetch/$s_!nBbg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8402694d-6c22-47c7-84aa-0fa5e5fdf59e_1181x576.png 848w, https://substackcdn.com/image/fetch/$s_!nBbg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8402694d-6c22-47c7-84aa-0fa5e5fdf59e_1181x576.png 1272w, https://substackcdn.com/image/fetch/$s_!nBbg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8402694d-6c22-47c7-84aa-0fa5e5fdf59e_1181x576.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Both clusters hit their maximum bandwidth figures with all A725 generating bandwidth load. Adding bandwidth demands from X925 cores decreases aggregate bandwidth while pushing latency up. Reversing the core load order suggests the X925 cores specifically cause contention in the memory subsystem. Latency reaches a maximum with four X925 cores asking for as much bandwidth they can get. It&#8217;s almost like the X925 cores don&#8217;t know when to slow down to avoid monopolizing memory subsystem resources. When the A725 cores come into play, GB10 seems to realize what&#8217;s going on and starts to balance bandwidth demands better. Bandwidth improves, and surprisingly, latency does too.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0t3e!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F665ef4f2-841b-4cd7-9ce7-86cc4dbd767c_1181x577.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0t3e!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F665ef4f2-841b-4cd7-9ce7-86cc4dbd767c_1181x577.png 424w, https://substackcdn.com/image/fetch/$s_!0t3e!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F665ef4f2-841b-4cd7-9ce7-86cc4dbd767c_1181x577.png 848w, https://substackcdn.com/image/fetch/$s_!0t3e!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F665ef4f2-841b-4cd7-9ce7-86cc4dbd767c_1181x577.png 1272w, https://substackcdn.com/image/fetch/$s_!0t3e!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F665ef4f2-841b-4cd7-9ce7-86cc4dbd767c_1181x577.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0t3e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F665ef4f2-841b-4cd7-9ce7-86cc4dbd767c_1181x577.png" width="1181" height="577" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/665ef4f2-841b-4cd7-9ce7-86cc4dbd767c_1181x577.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:577,&quot;width&quot;:1181,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0t3e!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F665ef4f2-841b-4cd7-9ce7-86cc4dbd767c_1181x577.png 424w, https://substackcdn.com/image/fetch/$s_!0t3e!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F665ef4f2-841b-4cd7-9ce7-86cc4dbd767c_1181x577.png 848w, https://substackcdn.com/image/fetch/$s_!0t3e!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F665ef4f2-841b-4cd7-9ce7-86cc4dbd767c_1181x577.png 1272w, https://substackcdn.com/image/fetch/$s_!0t3e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F665ef4f2-841b-4cd7-9ce7-86cc4dbd767c_1181x577.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Cluster 1 is worse at controlling latency despite having more bandwidth on tap. It&#8217;s unexpected after testing AMD&#8217;s GMI-Wide setup, where higher off-cluster bandwidth translated to better latency control under high bandwidth load.</p><p>Loading cores across both clusters shows GB10 maintaining lower latency than Strix Halo over the achieved bandwidth range. GB10&#8217;s combination of lower baseline latency and high external bandwidth from Cluster 1 put it well ahead.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gOfH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc206f5d7-d764-4be6-9182-e46e013faba5_1238x735.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gOfH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc206f5d7-d764-4be6-9182-e46e013faba5_1238x735.png 424w, https://substackcdn.com/image/fetch/$s_!gOfH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc206f5d7-d764-4be6-9182-e46e013faba5_1238x735.png 848w, https://substackcdn.com/image/fetch/$s_!gOfH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc206f5d7-d764-4be6-9182-e46e013faba5_1238x735.png 1272w, https://substackcdn.com/image/fetch/$s_!gOfH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc206f5d7-d764-4be6-9182-e46e013faba5_1238x735.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gOfH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc206f5d7-d764-4be6-9182-e46e013faba5_1238x735.png" width="1238" height="735" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c206f5d7-d764-4be6-9182-e46e013faba5_1238x735.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:735,&quot;width&quot;:1238,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gOfH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc206f5d7-d764-4be6-9182-e46e013faba5_1238x735.png 424w, https://substackcdn.com/image/fetch/$s_!gOfH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc206f5d7-d764-4be6-9182-e46e013faba5_1238x735.png 848w, https://substackcdn.com/image/fetch/$s_!gOfH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc206f5d7-d764-4be6-9182-e46e013faba5_1238x735.png 1272w, https://substackcdn.com/image/fetch/$s_!gOfH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc206f5d7-d764-4be6-9182-e46e013faba5_1238x735.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Throwing GB10&#8217;s iGPU into the mix presents an additional challenge. Increasing bandwidth demands from the iGPU drive up CPU-side latency, much like on Strix Halo. GB10 enjoys better baseline DRAM latency than Strix Halo, and maintains better latency at modest GPU bandwidth demands.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7f7W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c313bb-00b3-45a8-a6a1-4242d3ca5fb2_1497x728.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7f7W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c313bb-00b3-45a8-a6a1-4242d3ca5fb2_1497x728.png 424w, https://substackcdn.com/image/fetch/$s_!7f7W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c313bb-00b3-45a8-a6a1-4242d3ca5fb2_1497x728.png 848w, https://substackcdn.com/image/fetch/$s_!7f7W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c313bb-00b3-45a8-a6a1-4242d3ca5fb2_1497x728.png 1272w, https://substackcdn.com/image/fetch/$s_!7f7W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c313bb-00b3-45a8-a6a1-4242d3ca5fb2_1497x728.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7f7W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c313bb-00b3-45a8-a6a1-4242d3ca5fb2_1497x728.png" width="1456" height="708" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17c313bb-00b3-45a8-a6a1-4242d3ca5fb2_1497x728.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:708,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7f7W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c313bb-00b3-45a8-a6a1-4242d3ca5fb2_1497x728.png 424w, https://substackcdn.com/image/fetch/$s_!7f7W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c313bb-00b3-45a8-a6a1-4242d3ca5fb2_1497x728.png 848w, https://substackcdn.com/image/fetch/$s_!7f7W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c313bb-00b3-45a8-a6a1-4242d3ca5fb2_1497x728.png 1272w, https://substackcdn.com/image/fetch/$s_!7f7W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17c313bb-00b3-45a8-a6a1-4242d3ca5fb2_1497x728.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>However, GB10 does let high GPU bandwidth demands squeeze out the CPU. Latency from the CPU side goes beyond 351 ns with the GPU pulling 231 GB/s.</p><p>I only generate bandwidth load from the GPU in the test above, and run a single CPU latency test thread. Mixing high CPU and GPU bandwidth demands makes the situation more complicated.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XOSw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda74f114-d3ff-4804-aa9f-e5d4d0bf4f5f_1413x733.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XOSw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda74f114-d3ff-4804-aa9f-e5d4d0bf4f5f_1413x733.png 424w, https://substackcdn.com/image/fetch/$s_!XOSw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda74f114-d3ff-4804-aa9f-e5d4d0bf4f5f_1413x733.png 848w, https://substackcdn.com/image/fetch/$s_!XOSw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda74f114-d3ff-4804-aa9f-e5d4d0bf4f5f_1413x733.png 1272w, https://substackcdn.com/image/fetch/$s_!XOSw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda74f114-d3ff-4804-aa9f-e5d4d0bf4f5f_1413x733.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XOSw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda74f114-d3ff-4804-aa9f-e5d4d0bf4f5f_1413x733.png" width="1413" height="733" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da74f114-d3ff-4804-aa9f-e5d4d0bf4f5f_1413x733.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:733,&quot;width&quot;:1413,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XOSw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda74f114-d3ff-4804-aa9f-e5d4d0bf4f5f_1413x733.png 424w, https://substackcdn.com/image/fetch/$s_!XOSw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda74f114-d3ff-4804-aa9f-e5d4d0bf4f5f_1413x733.png 848w, https://substackcdn.com/image/fetch/$s_!XOSw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda74f114-d3ff-4804-aa9f-e5d4d0bf4f5f_1413x733.png 1272w, https://substackcdn.com/image/fetch/$s_!XOSw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda74f114-d3ff-4804-aa9f-e5d4d0bf4f5f_1413x733.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>With two X925 cores on cluster 1 pulling as much bandwidth as they can get their hands on, and the GPU doing the same, latency from the highest performance X925 core goes up to nearly 400 ns. Splitting out achieved bandwidth from the CPU and GPU further shows the GPU squeezing out the CPU bandwidth test threads.</p><h1>Core to Core Latency</h1><p>Memory accesses typically traverse the cache hierarchy in a vertical fashion, where cache misses at one level go through to a lower level. However, the memory subsystem may have to carry out transfers between caches at the same level in order to maintain cache coherency. Doing so can be rather complex. The memory subsystem has to determine which peer cache, if any, might have a more up-to-date copy of a line. Arm&#8217;s DSU-120 has a Snoop Control Unit, which uses snoop filters to orchestrate peer-to-peer cache transfers within a core complex. Nvidia/Mediatek&#8217;s High Performance Coherent Fabric is responsible for maintaining coherency across clusters.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!A5OR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c6035b-23f2-401d-8ee1-582b1c7424da_1555x749.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!A5OR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c6035b-23f2-401d-8ee1-582b1c7424da_1555x749.png 424w, https://substackcdn.com/image/fetch/$s_!A5OR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c6035b-23f2-401d-8ee1-582b1c7424da_1555x749.png 848w, https://substackcdn.com/image/fetch/$s_!A5OR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c6035b-23f2-401d-8ee1-582b1c7424da_1555x749.png 1272w, https://substackcdn.com/image/fetch/$s_!A5OR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c6035b-23f2-401d-8ee1-582b1c7424da_1555x749.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!A5OR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c6035b-23f2-401d-8ee1-582b1c7424da_1555x749.png" width="1456" height="701" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/99c6035b-23f2-401d-8ee1-582b1c7424da_1555x749.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:701,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!A5OR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c6035b-23f2-401d-8ee1-582b1c7424da_1555x749.png 424w, https://substackcdn.com/image/fetch/$s_!A5OR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c6035b-23f2-401d-8ee1-582b1c7424da_1555x749.png 848w, https://substackcdn.com/image/fetch/$s_!A5OR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c6035b-23f2-401d-8ee1-582b1c7424da_1555x749.png 1272w, https://substackcdn.com/image/fetch/$s_!A5OR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99c6035b-23f2-401d-8ee1-582b1c7424da_1555x749.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>GB10&#8217;s cluster boundaries are clearly visible with a coloring scheme applied across all result points. Results within each cluster are also far from uniform. Setting separate color schemes for intra-cluster and cross-cluster points highlights this. X925 cores in general give better intra-cluster latency results. Best case latencies involve transfers between X925 cores in the same cluster. Worst case latencies show up between A725 cores on different clusters, and can reach 240 ns.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LvCb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7bd7a5f-65d0-4953-96e7-8c996ca18141_1555x749.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LvCb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7bd7a5f-65d0-4953-96e7-8c996ca18141_1555x749.png 424w, https://substackcdn.com/image/fetch/$s_!LvCb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7bd7a5f-65d0-4953-96e7-8c996ca18141_1555x749.png 848w, https://substackcdn.com/image/fetch/$s_!LvCb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7bd7a5f-65d0-4953-96e7-8c996ca18141_1555x749.png 1272w, https://substackcdn.com/image/fetch/$s_!LvCb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7bd7a5f-65d0-4953-96e7-8c996ca18141_1555x749.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LvCb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7bd7a5f-65d0-4953-96e7-8c996ca18141_1555x749.png" width="1456" height="701" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7bd7a5f-65d0-4953-96e7-8c996ca18141_1555x749.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:701,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LvCb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7bd7a5f-65d0-4953-96e7-8c996ca18141_1555x749.png 424w, https://substackcdn.com/image/fetch/$s_!LvCb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7bd7a5f-65d0-4953-96e7-8c996ca18141_1555x749.png 848w, https://substackcdn.com/image/fetch/$s_!LvCb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7bd7a5f-65d0-4953-96e7-8c996ca18141_1555x749.png 1272w, https://substackcdn.com/image/fetch/$s_!LvCb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7bd7a5f-65d0-4953-96e7-8c996ca18141_1555x749.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Compared to Strix Halo, GB10&#8217;s core-to-core latency figures are high overall. Strix Halo manages to keep cross-cluster latencies at around 100 ns. It&#8217;s worse than what AMD achieves on desktop parts, but is substantially better than the 200 ns seen on GB10. Within clusters, AMD keeps everything below 50 ns while GB10 only manages 50-60 ns in the best case.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K8JY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F873abc2c-03b9-47e3-a223-6b838a8146b3_1600x758.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!K8JY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F873abc2c-03b9-47e3-a223-6b838a8146b3_1600x758.png 424w, https://substackcdn.com/image/fetch/$s_!K8JY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F873abc2c-03b9-47e3-a223-6b838a8146b3_1600x758.png 848w, https://substackcdn.com/image/fetch/$s_!K8JY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F873abc2c-03b9-47e3-a223-6b838a8146b3_1600x758.png 1272w, https://substackcdn.com/image/fetch/$s_!K8JY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F873abc2c-03b9-47e3-a223-6b838a8146b3_1600x758.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K8JY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F873abc2c-03b9-47e3-a223-6b838a8146b3_1600x758.png" width="1456" height="690" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/873abc2c-03b9-47e3-a223-6b838a8146b3_1600x758.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:690,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K8JY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F873abc2c-03b9-47e3-a223-6b838a8146b3_1600x758.png 424w, https://substackcdn.com/image/fetch/$s_!K8JY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F873abc2c-03b9-47e3-a223-6b838a8146b3_1600x758.png 848w, https://substackcdn.com/image/fetch/$s_!K8JY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F873abc2c-03b9-47e3-a223-6b838a8146b3_1600x758.png 1272w, https://substackcdn.com/image/fetch/$s_!K8JY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F873abc2c-03b9-47e3-a223-6b838a8146b3_1600x758.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Final Words</h1><p>GB10&#8217;s CPU setup feels very density optimized compared to Strix Halo&#8217;s.GB10 has 20 CPU cores to Strix Halo&#8217;s 16, and gets there using a highly heterogeneous CPU configuration that&#8217;s light on cache. My initial thought is that all else being equal, I would prefer 32 MB of fast cache in a single level over 16 MB of L3 and 16 MB of slower system level cache. That said, performance is a complicated topic and I&#8217;m still working on benchmarking both Strix Halo and GB10. GB10&#8217;s memory subsystem has bright points too. Its DRAM latency is outstanding for a LPDDR5X implementation. Mediatek has also seen fit to give one cluster over 100 GB/s of external read bandwidth, which is something AMD hasn&#8217;t done on any client design to date.</p><p>CPU-side bandwidth is another interesting detail, and has a lot of common traits across both chips. CPU cores can&#8217;t access full LPDDR5X bandwidth on both GB10 and Strix Halo. The 256-bit memory bus is aimed at feeding the GPU, not the CPU. High GPU bandwidth demands can squeeze out the CPU in both memory subsystems. Perhaps Nvidia/Meditek and AMD both optimized for workloads that don&#8217;t simultaneously demand high CPU and GPU performance.</p><p>I hope to see Nvidia and AMD continue to iterate on large iGPU designs. Products like GB10 and Strix Halo allow smaller form factors, and sidestep VRAM capacity issues that plague current discrete GPUs. They&#8217;re fascinating from an enthusiast point of view. Hopefully both companies will improve their designs and make them more affordable going forward.</p><p>Again, we&#8217;d like to thank Zach at ZeroOne for providing SSH access to his Spark. If you like the content then consider heading over to the <a href="https://www.patreon.com/ChipsandCheese">Patreon</a> or <a href="https://www.paypal.com/donate/?hosted_button_id=4EMPH66SBGVSQ">PayPal</a> if you want to toss a few bucks to Chips and Cheese, also consider joining the <a href="https://discord.gg/TwVnRhxgY2">Discord</a>. </p>]]></content:encoded></item><item><title><![CDATA[SC25: The Present and Future of HPC Networking with Cornelis Networks CEO Lisa Spelman]]></title><description><![CDATA[Hello you fine Internet folks,]]></description><link>https://chipsandcheese.com/p/sc25-the-present-and-future-of-hpc</link><guid isPermaLink="false">https://chipsandcheese.com/p/sc25-the-present-and-future-of-hpc</guid><dc:creator><![CDATA[George Cozma]]></dc:creator><pubDate>Fri, 19 Dec 2025 22:21:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/_kIDsCyTjiA" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello you fine Internet folks,<br><br>Today we have an interview with the CEO of Cornelis Networks, Lisa Spelman, where we talk about what makes Omnipath different to other solutions on the market along with what steps has Cornelis taken in support of Ultra Ethernet.</p><p>Hope y&#8217;all enjoy!</p><div id="youtube2-_kIDsCyTjiA" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;_kIDsCyTjiA&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/_kIDsCyTjiA?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>If you like the content then consider heading over to the <a href="https://www.patreon.com/ChipsandCheese">Patreon</a> or <a href="https://www.paypal.com/donate/?hosted_button_id=4EMPH66SBGVSQ">PayPal</a> if you want to toss a few bucks to Chips and Cheese. Also consider joining the <a href="https://discord.gg/TwVnRhxgY2">Discord</a>.</p><p>This transcript has been lightly edited for readability and conciseness.  </p><p>George: Hello, you fine internet folks. We&#8217;re here at Supercomputing 2025 at the Cornelis Networks booth. So, with me I have, Lisa Spelman, CEO of Cornelis. Would you like to tell us about what Cornelis is and what you guys do?</p><p>Lisa: I would love to! So, thanks for having me, it&#8217;s always good, the only thing we&#8217;re missing is some cheese. We got lots of cheese.</p><p>George: There was cheese last night!</p><p>Lisa: Oh, man! Okay, well, yeah, we were here at the kickoff last night. It was a fun opening. So, Cornelis Networks is a company that is laser-focused on delivering the highest-performance networking solutions for the highest-performance applications in your data center. So that&#8217;s your HPC workloads, your AI workloads, and everything that just has intense data demands and benefits a lot from a parallel processing type of use case.</p><p>So that&#8217;s where all of our architecture, all of our differentiation, all of our work goes into.</p><p>George: Awesome. So, Cornelis Networks has their own networking, called OmniPath.</p><p>Lisa: Yes.</p><p>George: Now, some of you may know OmniPath used to be an Intel technology. But Cornelis, I believe, bought the IP from Intel. So could you go into a little bit about what OmniPath is and the difference between OmniPath, Ethernet, and InfiniBand.</p><p>Lisa: Yes, we can do that. So you&#8217;re right, Cornelis spun out of Intel with an OmniPath architecture.</p><p>George: Okay.</p><p>Lisa: And so this OmniPath architecture, I should maybe share, too, that we&#8217;re a full-stack, full-solution company. So we design both a NIC, a SuperNIC ASIC, we design a Switch ASIC... Look at you- He&#8217;s so good! He&#8217;s ready to go!</p><p>George: I have the showcases!</p><p>Lisa: Okay, so we have... we design our SuperNIC ASIC, we design our Switch ASIC, we design the card for the add-in card for the SuperNIC and the switchboard, all the way up, you know, top of rack, as well as all the way up to a big old director-class system that we have sitting here.</p><p>All of that is based on our OmniPath architecture, which was incubated and built at Intel then, like you said, spun out and acquired by Cornelis. So the foundational element of the OmniPath architecture is this lossless and congestion-free design.</p><p>So it was... it was built, you know, in the last decade focused on, how do you take all of the data movement happening in highly-parallel workloads and bring together a congestion-free environment that does not lose packets? So it was specifically built to address these modern workloads, the growth of AI, the massive scale of data being put together while letting go of learning from the past, but letting go of legacy, other networks that may be designed more for storage systems or for... you know, just other use cases. That&#8217;s not what they were inherently designed for.</p><p>George: The Internet.</p><p>Lisa: Yeah.</p><p>George: Such as Ethernet.</p><p>Lisa: A more modern development. I mean, Ethernet, I mean, amazing, right? But it&#8217;s 50 years old now. So what we did was, in this architecture, built in some really advanced capabilities and features like your credit-based flow controls and your dynamic lane scaling. So it&#8217;s the performance, as well as adding reliability to the system. And so the network plays a huge role, not only increasing your compute utilization of your GPU or your CPU, but it also can play a really big role in increasing the uptime of your overall system, which has huge economic value.</p><p>George: Yeah.</p><p>Lisa: So that&#8217;s the OmniPath architecture, and the way that it comes to life, and the way that people experience it, is lowest latency in the industry on all of these workloads. You know, we made sure on micro benchmarks, like ping pong latency, all the good micros. And then the highest message rates in the industry. We have two and a half X higher than the closest competitor. So that works really great for those, you know, message-dependent, really fast-rate workloads.</p><p>And then on top of that, we&#8217;re all going to operate at the same bandwidth. I mean, bandwidth is not really a differentiator anymore. And so we measure ourselves on how fast we get to half-bandwidth, and how quickly we can launch and start the data movement, the packet movement. So fastest to half-bandwidth is one of our points of pride for the architecture as well.</p><p>George: So, speaking of sort of Ethernet. I know there&#8217;s been the new UltraEthernet consortium in order to update Ethernet to a more... to the standard of today.</p><p>Lisa: Yeah.</p><p>George: What has Cornelis done to support that, especially with some of your technologies?</p><p>Lisa: So we think this move to UltraEthernet is really exciting for the industry. And it was obviously time, I mean, you know, it takes a big need and requirement to get this many companies to work together, and kind of put aside some differences, and come together to come up with a consortium and a capability, a definition that actually does serve the workloads of today.</p><p>So we&#8217;re we&#8217;re very excited and motivated towards it. And the reason that we are so is because we see so much of what we&#8217;ve already built in OmniPath being reflected through in the requirements of UltraEthernet. We also have a little bit of a point of pride in that the the software standard for UltraEthernet is built on top of LibFabric, which is an open source, you know, that we developed actually, and we&#8217;re maintainers of. So we&#8217;re we&#8217;re all in on the UltraEthernet, and in fact, we&#8217;ve just announced our next generation of products.</p><p>George: Speaking of; the CN6000, the successor to the CN5000: what exactly does it support in terms of networking protocols, and what do you sort of see in terms of like industry uptick?</p><p>Lisa: Yes. So this is really cool. We think it&#8217;s going to be super valuable for our customers. So with our next generation, our CN6000, that&#8217;s our 800 gig product, that one is going to be a multi-protocol NIC. So our super NIC there, it will have OmniPath native in it, and we have customers that they absolutely want that highest performance that they can get through OmniPath and it works great for them. But we&#8217;re also adding into it Rocky V2. So Ethernet performance as well as UltraEthernet, the 1.0, you know, the spec, the UltraEthernet compliance as well.</p><p>So you&#8217;re going to get this multi-modal NIC, and what we what we&#8217;re doing, what our differentiation is, is that you&#8217;re you&#8217;re moving to that Ethernet transport layer, but you&#8217;re still behind the scenes getting the benefits of the OmniPath architecture.</p><p>George: OK.</p><p>Lisa: So it&#8217;s not like it&#8217;s two totally separate things. We&#8217;re actually going to take your packet, run it through the OmniPath pipes and that architecture benefit, but spit it out as Ethernet, as a protocol or the transport layer.</p><p>George: Cool. And for UltraEthernet, I know there&#8217;s sort of two sort of specs. There&#8217;s what&#8217;s sort of colloquially known as the &#8220;AI spec&#8221; and the &#8220;HPC spec&#8221;, which have different requirements. For the CN6000, will it be sort of the AI spec or the HPC spec?</p><p>Lisa: Yeah. So we&#8217;re focusing on making sure the UltraEthernet transport layer absolutely works. But we are absolutely intending to deliver to both HPC performance and AI workload performance. And one of the things I like to kind of point out is, it&#8217;s not that they&#8217;re so different; AI workloads and HPC workloads have a lot of similar demands on the network. They just pull on them in different ways. So it&#8217;s like, message rate, for example: message rate is hugely important in things like computational fluid dynamics.</p><p>George: Absolutely.</p><p>Lisa: But it also plays a role in inference. Now, it might be the top determiner of performance in a CFD application, and it might be the third or fourth in an AI application. So you need that same, you know, the latency, the message rates, the bandwidth, the overlap, the communications, you know, all that type of stuff. Just workloads pull on them a little differently.</p><p>So we&#8217;ve built a well-rounded solution that addresses all. And then, by customer use case, you can pull on what you need.</p><p>George: Awesome. And as you can see here [points to network switch on table, you guys make your own switches.</p><p>Lisa: We do.</p><p>George: And you make your own NICs. But one of the questions I have is, can you use the CN6000 NIC with any switch?</p><p>Lisa: OK, so that&#8217;s a great point. And yes, you can. So one of our big focuses as we expand the company, the customer base, and serve more customers and workloads, is becoming much more broadly industry interoperable.</p><p>George: OK.</p><p>Lisa: So we think this is important for larger-scale customers that want to maybe run multi-vendor environments. So we&#8217;re already doing work on the 800 gig to ensure that it works across a variety of, in, you know, standard industry available switches. And that gives customers a lot of flexibility.</p><p>Of course, they can still choose to use both the super NIC and the switch from us. And that&#8217;s great, we love that, but we know that there&#8217;s going to be times when there&#8217;s like, a partner or a use case, where having our NIC paired with someone else&#8217;s switch is the right move. And we fully support it.</p><p>George: So then I guess sort of the flip-side of that is if I have, say, another NIC, could I attach that to an OmniPath switch?</p><p>Lisa: You will be able to, not in a CN6000, but stay tuned, I&#8217;ll have more breaking news! That&#8217;s just a little sneak peek of the future future.</p><p>George: Well, and sort of to round this off with the most important question. What&#8217;s your favorite type of cheese, Lisa?</p><p>Lisa: OK, I am from Portland, Oregon. So I have to go with our local, to the state of Oregon, our Rogue Creamery Blues.</p><p>George: Oh, OK.</p><p>Lisa: I had a chance this summer to go down to Grants Pass, where they&#8217;re from, and headquartered, and we did the whole cheese factory tour- I thought of you. We literally got to meet the cows! So it was very nice, it was very cool. And so that&#8217;s what I have to go with.</p><p>George: One of my favorite cheeses is Tillamook.</p><p>Lisa: OK, yes! Yes, another local favorite.</p><p>George: Thank you so much, Lisa!</p><p>Lisa: Thank you for having me!</p><p>George: Yep, have a good one, folks.</p>]]></content:encoded></item><item><title><![CDATA[Nvidia’s B200: Keeping the CUDA Juggernaut Rolling ft. Verda (formerly DataCrunch)]]></title><description><![CDATA[Nvidia has dominated the GPU compute scene ever since it became mainstream.]]></description><link>https://chipsandcheese.com/p/nvidias-b200-keeping-the-cuda-juggernaut</link><guid isPermaLink="false">https://chipsandcheese.com/p/nvidias-b200-keeping-the-cuda-juggernaut</guid><dc:creator><![CDATA[Chester Lam]]></dc:creator><pubDate>Mon, 15 Dec 2025 20:56:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!bkbH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c24bc7-2298-4ae3-8fa2-748f47f0af1b_1459x482.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Nvidia has dominated the GPU compute scene ever since it became mainstream. The company&#8217;s Blackwell B200 GPU is the next to take up the mantle of being the go-to compute GPU. Unlike prior generations, Blackwell can&#8217;t lean heavily on process node improvements. TSMC&#8217;s 4NP process likely provides something over the 4N process used in the older Hopper generation, but it&#8217;s unlikely to offer the same degree of improvement as prior full node shrinks. Blackwell therefore moves away from Nvidia&#8217;s tried-and-tested monolithic die approach, and uses two reticle sized dies. Both dies appear to software as a single GPU, making the B200 Nvidia&#8217;s first chiplet GPU. Each B200 die physically contains 80 Streaming Multiprocessors (SMs), which are analogous to cores on a CPU. B200 enables 74 SMs per die, giving 148 SMs across the GPU. Clock speeds are similar to the H100&#8217;s high power SXM5 variant.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bkbH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c24bc7-2298-4ae3-8fa2-748f47f0af1b_1459x482.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bkbH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c24bc7-2298-4ae3-8fa2-748f47f0af1b_1459x482.png 424w, https://substackcdn.com/image/fetch/$s_!bkbH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c24bc7-2298-4ae3-8fa2-748f47f0af1b_1459x482.png 848w, https://substackcdn.com/image/fetch/$s_!bkbH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c24bc7-2298-4ae3-8fa2-748f47f0af1b_1459x482.png 1272w, https://substackcdn.com/image/fetch/$s_!bkbH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c24bc7-2298-4ae3-8fa2-748f47f0af1b_1459x482.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bkbH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c24bc7-2298-4ae3-8fa2-748f47f0af1b_1459x482.png" width="1456" height="481" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/94c24bc7-2298-4ae3-8fa2-748f47f0af1b_1459x482.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:481,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bkbH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c24bc7-2298-4ae3-8fa2-748f47f0af1b_1459x482.png 424w, https://substackcdn.com/image/fetch/$s_!bkbH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c24bc7-2298-4ae3-8fa2-748f47f0af1b_1459x482.png 848w, https://substackcdn.com/image/fetch/$s_!bkbH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c24bc7-2298-4ae3-8fa2-748f47f0af1b_1459x482.png 1272w, https://substackcdn.com/image/fetch/$s_!bkbH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94c24bc7-2298-4ae3-8fa2-748f47f0af1b_1459x482.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">From Nvidia&#8217;s presentation at Hot Chips 2024</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!p-54!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e8a82b-13e4-45bc-bac9-cf25fd313fdf_1133x336.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!p-54!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e8a82b-13e4-45bc-bac9-cf25fd313fdf_1133x336.png 424w, https://substackcdn.com/image/fetch/$s_!p-54!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e8a82b-13e4-45bc-bac9-cf25fd313fdf_1133x336.png 848w, https://substackcdn.com/image/fetch/$s_!p-54!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e8a82b-13e4-45bc-bac9-cf25fd313fdf_1133x336.png 1272w, https://substackcdn.com/image/fetch/$s_!p-54!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e8a82b-13e4-45bc-bac9-cf25fd313fdf_1133x336.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!p-54!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e8a82b-13e4-45bc-bac9-cf25fd313fdf_1133x336.png" width="1133" height="336" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d3e8a82b-13e4-45bc-bac9-cf25fd313fdf_1133x336.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:336,&quot;width&quot;:1133,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!p-54!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e8a82b-13e4-45bc-bac9-cf25fd313fdf_1133x336.png 424w, https://substackcdn.com/image/fetch/$s_!p-54!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e8a82b-13e4-45bc-bac9-cf25fd313fdf_1133x336.png 848w, https://substackcdn.com/image/fetch/$s_!p-54!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e8a82b-13e4-45bc-bac9-cf25fd313fdf_1133x336.png 1272w, https://substackcdn.com/image/fetch/$s_!p-54!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e8a82b-13e4-45bc-bac9-cf25fd313fdf_1133x336.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I&#8217;ve listed H100 SXM5 specifications in the table above, but data in the article below will be from the H100 PCIe version unless otherwise specified.</p><h1>Acknowledgements</h1><p>A massive thank you goes to<a href="https://datacrunch.io/"> Verda (formerly DataCrunch)</a> for providing an instance with 8 B200s which are all connected to each other via NVLink. <a href="https://datacrunch.io/">Verda</a> gave us about 3 weeks with the instance to do with as we wished. We previously covered the CPU part of this VM if you want to check that part out as well.</p><h1>Cache and Memory Access</h1><p>B200&#8217;s cache hierarchy feels immediately familiar coming from the H100 and A100. L1 and Shared Memory are allocated out of the same SM-private pool. L1/Shared Memory capacity is unchanged from H100 and stays at 256 KB. Possible L1/Shared Memory splits have not changed either. Shared Memory is analogous to AMD&#8217;s Local Data Share (LDS) or Intel&#8217;s Shared Local Memory (SLM), and provides software managed on-chip storage local to a group of threads. Through Nvidia&#8217;s CUDA API, developers can advise on whether to prefer a larger L1 allocation, prefer equal, or prefer more Shared Memory. Those options appear to give 216, 112, and 16 KB of L1 cache capacity, respectively.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XsDI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe907f289-869e-4108-b000-e37228e13429_1600x725.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XsDI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe907f289-869e-4108-b000-e37228e13429_1600x725.png 424w, https://substackcdn.com/image/fetch/$s_!XsDI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe907f289-869e-4108-b000-e37228e13429_1600x725.png 848w, https://substackcdn.com/image/fetch/$s_!XsDI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe907f289-869e-4108-b000-e37228e13429_1600x725.png 1272w, https://substackcdn.com/image/fetch/$s_!XsDI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe907f289-869e-4108-b000-e37228e13429_1600x725.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XsDI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe907f289-869e-4108-b000-e37228e13429_1600x725.png" width="1456" height="660" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e907f289-869e-4108-b000-e37228e13429_1600x725.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:660,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XsDI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe907f289-869e-4108-b000-e37228e13429_1600x725.png 424w, https://substackcdn.com/image/fetch/$s_!XsDI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe907f289-869e-4108-b000-e37228e13429_1600x725.png 848w, https://substackcdn.com/image/fetch/$s_!XsDI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe907f289-869e-4108-b000-e37228e13429_1600x725.png 1272w, https://substackcdn.com/image/fetch/$s_!XsDI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe907f289-869e-4108-b000-e37228e13429_1600x725.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In other APIs, Shared Memory and L1 splits are completely up to Nvidia&#8217;s driver. OpenCL gets the largest 216 KB data cache allocation with a kernel that doesn&#8217;t use Shared Memory, which is quite sensible. Vulkan gets a slightly smaller 180 KB L1D allocation. L1D latency as tested with array indexing in OpenCL comes in at a brisk 19.6 ns, or 39 cycles.</p><p>As with A100 and H100, B200 uses a partitioned L2 cache. However, it&#8217;s now much larger with 126 MB of total capacity. For perspective, H100 had 50 MB of L2, and A100 had 40 MB. L2 latency to the directly attached L2 partition is similar to prior generations at about 150 ns. Latency dramatically increases as test sizes spill into the other L2 partition. B200&#8217;s cross-partition penalty is higher than its predecessors, but only slightly so. L2 partitions on B200 almost certainly correspond to its two dies. If so, the cross-die latency penalty is small, and outweighed by a single L2 partition having more capacity than H100&#8217;s entire L2.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IZ3f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fc4c41e-4c09-4078-8283-fd26b41abe17_1600x712.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IZ3f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fc4c41e-4c09-4078-8283-fd26b41abe17_1600x712.png 424w, https://substackcdn.com/image/fetch/$s_!IZ3f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fc4c41e-4c09-4078-8283-fd26b41abe17_1600x712.png 848w, https://substackcdn.com/image/fetch/$s_!IZ3f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fc4c41e-4c09-4078-8283-fd26b41abe17_1600x712.png 1272w, https://substackcdn.com/image/fetch/$s_!IZ3f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fc4c41e-4c09-4078-8283-fd26b41abe17_1600x712.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IZ3f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fc4c41e-4c09-4078-8283-fd26b41abe17_1600x712.png" width="1456" height="648" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8fc4c41e-4c09-4078-8283-fd26b41abe17_1600x712.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:648,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IZ3f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fc4c41e-4c09-4078-8283-fd26b41abe17_1600x712.png 424w, https://substackcdn.com/image/fetch/$s_!IZ3f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fc4c41e-4c09-4078-8283-fd26b41abe17_1600x712.png 848w, https://substackcdn.com/image/fetch/$s_!IZ3f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fc4c41e-4c09-4078-8283-fd26b41abe17_1600x712.png 1272w, https://substackcdn.com/image/fetch/$s_!IZ3f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fc4c41e-4c09-4078-8283-fd26b41abe17_1600x712.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>B200 acts like it has a triple level cache setup from a single thread&#8217;s perspective. The L2&#8217;s partitioned nature can be shown by segmenting the pointer chasing array and having different threads traverse each segment. Curiously, I need a large number of threads before I can access most of the 126 MB capacity without incurring cross-partition penalties. Perhaps Nvidia&#8217;s scheduler tries to fill one partition&#8217;s SMs before going to the other.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZgOD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70f994d4-8694-431f-9203-a16c0c22fa16_1600x779.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZgOD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70f994d4-8694-431f-9203-a16c0c22fa16_1600x779.png 424w, https://substackcdn.com/image/fetch/$s_!ZgOD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70f994d4-8694-431f-9203-a16c0c22fa16_1600x779.png 848w, https://substackcdn.com/image/fetch/$s_!ZgOD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70f994d4-8694-431f-9203-a16c0c22fa16_1600x779.png 1272w, https://substackcdn.com/image/fetch/$s_!ZgOD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70f994d4-8694-431f-9203-a16c0c22fa16_1600x779.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZgOD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70f994d4-8694-431f-9203-a16c0c22fa16_1600x779.png" width="1456" height="709" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/70f994d4-8694-431f-9203-a16c0c22fa16_1600x779.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:709,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZgOD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70f994d4-8694-431f-9203-a16c0c22fa16_1600x779.png 424w, https://substackcdn.com/image/fetch/$s_!ZgOD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70f994d4-8694-431f-9203-a16c0c22fa16_1600x779.png 848w, https://substackcdn.com/image/fetch/$s_!ZgOD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70f994d4-8694-431f-9203-a16c0c22fa16_1600x779.png 1272w, https://substackcdn.com/image/fetch/$s_!ZgOD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70f994d4-8694-431f-9203-a16c0c22fa16_1600x779.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>AMD&#8217;s Radeon Instinct MI300X has a true triple-level cache setup, which trades blows with B200&#8217;s. Nvidia has larger and faster L1 caches. AMD&#8217;s L2 trades capacity for a latency advantage compared to Nvidia. Finally, AMD&#8217;s 256 MB last level cache offers an impressive combination of latency and high capacity. Latency is lower than Nvidia&#8217;s &#8220;far&#8221; L2 partition.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l_36!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f1c6bf-929a-498d-ae97-cc13424258fa_1600x661.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l_36!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f1c6bf-929a-498d-ae97-cc13424258fa_1600x661.png 424w, https://substackcdn.com/image/fetch/$s_!l_36!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f1c6bf-929a-498d-ae97-cc13424258fa_1600x661.png 848w, https://substackcdn.com/image/fetch/$s_!l_36!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f1c6bf-929a-498d-ae97-cc13424258fa_1600x661.png 1272w, https://substackcdn.com/image/fetch/$s_!l_36!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f1c6bf-929a-498d-ae97-cc13424258fa_1600x661.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!l_36!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f1c6bf-929a-498d-ae97-cc13424258fa_1600x661.png" width="1456" height="602" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05f1c6bf-929a-498d-ae97-cc13424258fa_1600x661.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:602,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l_36!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f1c6bf-929a-498d-ae97-cc13424258fa_1600x661.png 424w, https://substackcdn.com/image/fetch/$s_!l_36!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f1c6bf-929a-498d-ae97-cc13424258fa_1600x661.png 848w, https://substackcdn.com/image/fetch/$s_!l_36!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f1c6bf-929a-498d-ae97-cc13424258fa_1600x661.png 1272w, https://substackcdn.com/image/fetch/$s_!l_36!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05f1c6bf-929a-498d-ae97-cc13424258fa_1600x661.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>One curiosity is that both the MI300X and B200 show more uniform latency across the last level cache when I run multiple threads hitting a segmented pointer chasing array. However, the reasons behind that latency increase are different. The latency increase past 64 MB on AMD appears to be caused by TLB misses, because testing with a 4 KB stride shows a latency increase at the same point. Launching more threads brings more TLB instances into play, mitigating address translation penalties. Cutting out TLB miss penalties also lowers measured VRAM latency on the MI300X. On B200, splitting the array didn&#8217;t lower measured VRAM latency, suggesting TLB misses either weren&#8217;t a significant factor with a single thread, or bringing on more threads didn&#8217;t reduce TLB misses. B200 thus appears to have higher VRAM latency than the MI300X, as well as the older H100 and A100. Just as with the L2 cross-partition penalty, the modest nature of the latency regression versus H100/A100 suggests Nvidia&#8217;s multi-die design is working well.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4J4W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd083f1d3-322d-4880-8bc7-0933a90e5fff_960x533.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4J4W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd083f1d3-322d-4880-8bc7-0933a90e5fff_960x533.png 424w, https://substackcdn.com/image/fetch/$s_!4J4W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd083f1d3-322d-4880-8bc7-0933a90e5fff_960x533.png 848w, https://substackcdn.com/image/fetch/$s_!4J4W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd083f1d3-322d-4880-8bc7-0933a90e5fff_960x533.png 1272w, https://substackcdn.com/image/fetch/$s_!4J4W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd083f1d3-322d-4880-8bc7-0933a90e5fff_960x533.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4J4W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd083f1d3-322d-4880-8bc7-0933a90e5fff_960x533.png" width="960" height="533" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d083f1d3-322d-4880-8bc7-0933a90e5fff_960x533.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:533,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4J4W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd083f1d3-322d-4880-8bc7-0933a90e5fff_960x533.png 424w, https://substackcdn.com/image/fetch/$s_!4J4W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd083f1d3-322d-4880-8bc7-0933a90e5fff_960x533.png 848w, https://substackcdn.com/image/fetch/$s_!4J4W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd083f1d3-322d-4880-8bc7-0933a90e5fff_960x533.png 1272w, https://substackcdn.com/image/fetch/$s_!4J4W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd083f1d3-322d-4880-8bc7-0933a90e5fff_960x533.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>OpenCL&#8217;s local memory space is backed by Nvidia&#8217;s Shared Memory, AMD&#8217;s LDS, or Intel&#8217;s SLM. Testing local memory latency with array accesses shows B200 continuing the tradition of offering excellent Shared Memory latency. Accesses are faster than on any AMD GPU I&#8217;ve tested so far, including high-clocked members of the RDNA line. AMD&#8217;s CDNA-based GPUs have much higher local memory latency.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vbfB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e9adca-3fa6-4b97-829b-6d3c122ba2de_1080x513.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vbfB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e9adca-3fa6-4b97-829b-6d3c122ba2de_1080x513.png 424w, https://substackcdn.com/image/fetch/$s_!vbfB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e9adca-3fa6-4b97-829b-6d3c122ba2de_1080x513.png 848w, https://substackcdn.com/image/fetch/$s_!vbfB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e9adca-3fa6-4b97-829b-6d3c122ba2de_1080x513.png 1272w, https://substackcdn.com/image/fetch/$s_!vbfB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e9adca-3fa6-4b97-829b-6d3c122ba2de_1080x513.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vbfB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e9adca-3fa6-4b97-829b-6d3c122ba2de_1080x513.png" width="1080" height="513" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97e9adca-3fa6-4b97-829b-6d3c122ba2de_1080x513.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:513,&quot;width&quot;:1080,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vbfB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e9adca-3fa6-4b97-829b-6d3c122ba2de_1080x513.png 424w, https://substackcdn.com/image/fetch/$s_!vbfB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e9adca-3fa6-4b97-829b-6d3c122ba2de_1080x513.png 848w, https://substackcdn.com/image/fetch/$s_!vbfB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e9adca-3fa6-4b97-829b-6d3c122ba2de_1080x513.png 1272w, https://substackcdn.com/image/fetch/$s_!vbfB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97e9adca-3fa6-4b97-829b-6d3c122ba2de_1080x513.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Atomic operations on local memory can be used to exchange data between threads in the same workgroup. On Nvidia, that means threads running on the same SM. Bouncing data between threads using atomic_cmpxchg shows latency on par with AMD&#8217;s MI300X. As with pointer chasing latency, B200 gives small incremental improvements over prior generations. AMD&#8217;s RDNA line does very well in this test compared to big compute GPUs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0qZS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9eb72a2-b13f-4021-8c01-0b8e7ddca51e_1176x610.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0qZS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9eb72a2-b13f-4021-8c01-0b8e7ddca51e_1176x610.png 424w, https://substackcdn.com/image/fetch/$s_!0qZS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9eb72a2-b13f-4021-8c01-0b8e7ddca51e_1176x610.png 848w, https://substackcdn.com/image/fetch/$s_!0qZS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9eb72a2-b13f-4021-8c01-0b8e7ddca51e_1176x610.png 1272w, https://substackcdn.com/image/fetch/$s_!0qZS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9eb72a2-b13f-4021-8c01-0b8e7ddca51e_1176x610.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0qZS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9eb72a2-b13f-4021-8c01-0b8e7ddca51e_1176x610.png" width="1176" height="610" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d9eb72a2-b13f-4021-8c01-0b8e7ddca51e_1176x610.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:610,&quot;width&quot;:1176,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0qZS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9eb72a2-b13f-4021-8c01-0b8e7ddca51e_1176x610.png 424w, https://substackcdn.com/image/fetch/$s_!0qZS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9eb72a2-b13f-4021-8c01-0b8e7ddca51e_1176x610.png 848w, https://substackcdn.com/image/fetch/$s_!0qZS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9eb72a2-b13f-4021-8c01-0b8e7ddca51e_1176x610.png 1272w, https://substackcdn.com/image/fetch/$s_!0qZS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd9eb72a2-b13f-4021-8c01-0b8e7ddca51e_1176x610.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Modern GPUs use dedicated atomic ALUs to handle operations like atomic adds and increments. Testing with atomic_add gives a throughput of 32 operations per cycle, per SM on the B200. I wrote this test after MI300X testing was concluded, so I only have data from the MI300A. Like GCN, AMD&#8217;s CDNA3 Compute Units can sustain 16 atomic adds per cycle. That lets the B200 pull ahead despite having a lower core count.</p><h1>Bandwidth Measurements</h1><p>A higher SM count gives B200 a large L1 cache bandwidth advantage over its predecessors. It also catches up to AMD&#8217;s MI300X from OpenCL testing. Older, smaller consumer GPUs like the RX 6900XT are left in the dust.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pY4x!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa321e8a4-a897-40cf-b291-a152ce5994e8_1128x518.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pY4x!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa321e8a4-a897-40cf-b291-a152ce5994e8_1128x518.png 424w, https://substackcdn.com/image/fetch/$s_!pY4x!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa321e8a4-a897-40cf-b291-a152ce5994e8_1128x518.png 848w, https://substackcdn.com/image/fetch/$s_!pY4x!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa321e8a4-a897-40cf-b291-a152ce5994e8_1128x518.png 1272w, https://substackcdn.com/image/fetch/$s_!pY4x!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa321e8a4-a897-40cf-b291-a152ce5994e8_1128x518.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pY4x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa321e8a4-a897-40cf-b291-a152ce5994e8_1128x518.png" width="1128" height="518" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a321e8a4-a897-40cf-b291-a152ce5994e8_1128x518.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:518,&quot;width&quot;:1128,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pY4x!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa321e8a4-a897-40cf-b291-a152ce5994e8_1128x518.png 424w, https://substackcdn.com/image/fetch/$s_!pY4x!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa321e8a4-a897-40cf-b291-a152ce5994e8_1128x518.png 848w, https://substackcdn.com/image/fetch/$s_!pY4x!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa321e8a4-a897-40cf-b291-a152ce5994e8_1128x518.png 1272w, https://substackcdn.com/image/fetch/$s_!pY4x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa321e8a4-a897-40cf-b291-a152ce5994e8_1128x518.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Local memory offers the same bandwidth as L1 cache hits on the B200, because both are backed by the same block of storage. That leaves AMD&#8217;s MI300X with a large bandwidth lead. Local memory is more difficult to take advantage of because developers must explicitly manage data movement, while a cache automatically takes advantage of locality. But it is an area where the huge MI300X continues to hold a lead.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ULep!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bba6f9c-172f-4bce-8f15-e8d274503912_1075x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ULep!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bba6f9c-172f-4bce-8f15-e8d274503912_1075x500.png 424w, https://substackcdn.com/image/fetch/$s_!ULep!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bba6f9c-172f-4bce-8f15-e8d274503912_1075x500.png 848w, https://substackcdn.com/image/fetch/$s_!ULep!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bba6f9c-172f-4bce-8f15-e8d274503912_1075x500.png 1272w, https://substackcdn.com/image/fetch/$s_!ULep!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bba6f9c-172f-4bce-8f15-e8d274503912_1075x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ULep!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bba6f9c-172f-4bce-8f15-e8d274503912_1075x500.png" width="1075" height="500" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4bba6f9c-172f-4bce-8f15-e8d274503912_1075x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:1075,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ULep!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bba6f9c-172f-4bce-8f15-e8d274503912_1075x500.png 424w, https://substackcdn.com/image/fetch/$s_!ULep!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bba6f9c-172f-4bce-8f15-e8d274503912_1075x500.png 848w, https://substackcdn.com/image/fetch/$s_!ULep!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bba6f9c-172f-4bce-8f15-e8d274503912_1075x500.png 1272w, https://substackcdn.com/image/fetch/$s_!ULep!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bba6f9c-172f-4bce-8f15-e8d274503912_1075x500.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Nemez&#8217;s Vulkan-based benchmark provides an idea of what L2 bandwidth is like on the B200. Smaller data footprints contained within the local L2 partition achieve 21 TB/s of bandwidth. That drops to 16.8 TB/s when data starts crossing between the two partitions. AMD&#8217;s MI300X has no graphics API support and can&#8217;t run Vulkan compute. However, AMD has specified that the 256 MB Infinity Cache provides 14.7 TB/s of bandwidth. The MI300X doesn&#8217;t need the same degree of bandwidth from Infinity Cache because the 4 MB L2 instances in front of it should absorb a good chunk of L1 miss traffic.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9yCc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e04028e-9ce9-4bba-986f-6b3324ed5a2b_1600x739.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9yCc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e04028e-9ce9-4bba-986f-6b3324ed5a2b_1600x739.png 424w, https://substackcdn.com/image/fetch/$s_!9yCc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e04028e-9ce9-4bba-986f-6b3324ed5a2b_1600x739.png 848w, https://substackcdn.com/image/fetch/$s_!9yCc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e04028e-9ce9-4bba-986f-6b3324ed5a2b_1600x739.png 1272w, https://substackcdn.com/image/fetch/$s_!9yCc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e04028e-9ce9-4bba-986f-6b3324ed5a2b_1600x739.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9yCc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e04028e-9ce9-4bba-986f-6b3324ed5a2b_1600x739.png" width="1456" height="672" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9e04028e-9ce9-4bba-986f-6b3324ed5a2b_1600x739.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:672,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9yCc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e04028e-9ce9-4bba-986f-6b3324ed5a2b_1600x739.png 424w, https://substackcdn.com/image/fetch/$s_!9yCc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e04028e-9ce9-4bba-986f-6b3324ed5a2b_1600x739.png 848w, https://substackcdn.com/image/fetch/$s_!9yCc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e04028e-9ce9-4bba-986f-6b3324ed5a2b_1600x739.png 1272w, https://substackcdn.com/image/fetch/$s_!9yCc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e04028e-9ce9-4bba-986f-6b3324ed5a2b_1600x739.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The B200 offers a large bandwidth advantage over the outgoing H100 at all levels of the cache hierarchy. Thanks to HBM3E, B200 also gains a VRAM bandwidth lead over the MI300X. While the MI300X also has eight HBM stacks, it uses older HBM3 and tops out at 5.3 TB/s.</p><h1>Global Memory Atomics</h1><p>AMD&#8217;s MI300X showed varying latency when using atomic_cmpxchg to bounce values between threads. Its complex multi-die setup likely contributed to this. The same applies to B200. Here, I&#8217;m launching as many single-thread workgroups as GPU cores (SMs or CUs), and selecting different thread pairs to test. I&#8217;m using an access pattern similar to that of a CPU-side core to core latency test, but there&#8217;s no way to influence where each thread gets placed. Therefore, this isn&#8217;t a proper core to core latency test and results are not consistent between runs. But it is enough to display latency variation, and show that the B200 has a bimodal latency distribution.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!giIX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6aca5a-a1ad-446d-9fef-39cdeedc96da_1600x805.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!giIX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6aca5a-a1ad-446d-9fef-39cdeedc96da_1600x805.png 424w, https://substackcdn.com/image/fetch/$s_!giIX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6aca5a-a1ad-446d-9fef-39cdeedc96da_1600x805.png 848w, https://substackcdn.com/image/fetch/$s_!giIX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6aca5a-a1ad-446d-9fef-39cdeedc96da_1600x805.png 1272w, https://substackcdn.com/image/fetch/$s_!giIX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6aca5a-a1ad-446d-9fef-39cdeedc96da_1600x805.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!giIX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6aca5a-a1ad-446d-9fef-39cdeedc96da_1600x805.png" width="1456" height="733" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b6aca5a-a1ad-446d-9fef-39cdeedc96da_1600x805.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:733,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!giIX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6aca5a-a1ad-446d-9fef-39cdeedc96da_1600x805.png 424w, https://substackcdn.com/image/fetch/$s_!giIX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6aca5a-a1ad-446d-9fef-39cdeedc96da_1600x805.png 848w, https://substackcdn.com/image/fetch/$s_!giIX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6aca5a-a1ad-446d-9fef-39cdeedc96da_1600x805.png 1272w, https://substackcdn.com/image/fetch/$s_!giIX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b6aca5a-a1ad-446d-9fef-39cdeedc96da_1600x805.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Latency for bouncing a value between thread pairs. Because there&#8217;s no good way to pin threads to specific SMs, results vary across runs. However, the bimodal distribution remains.</figcaption></figure></div><p>Latency is 90-100 ns in good cases, likely when threads are on the same L2 partition. Bad cases land in the 190-220 ns range, and likely represents cases when communication crosses L2 partition boundaries. Results on AMD&#8217;s MI300X range from ~116 to ~202 ns. B200&#8217;s good case is slightly better than AMD&#8217;s, while the bad case is slightly worse.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-Ahd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e4cb73-cc9e-4f48-b447-21ee921e3a94_1236x639.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-Ahd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e4cb73-cc9e-4f48-b447-21ee921e3a94_1236x639.png 424w, https://substackcdn.com/image/fetch/$s_!-Ahd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e4cb73-cc9e-4f48-b447-21ee921e3a94_1236x639.png 848w, https://substackcdn.com/image/fetch/$s_!-Ahd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e4cb73-cc9e-4f48-b447-21ee921e3a94_1236x639.png 1272w, https://substackcdn.com/image/fetch/$s_!-Ahd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e4cb73-cc9e-4f48-b447-21ee921e3a94_1236x639.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-Ahd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e4cb73-cc9e-4f48-b447-21ee921e3a94_1236x639.png" width="1236" height="639" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49e4cb73-cc9e-4f48-b447-21ee921e3a94_1236x639.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:639,&quot;width&quot;:1236,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-Ahd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e4cb73-cc9e-4f48-b447-21ee921e3a94_1236x639.png 424w, https://substackcdn.com/image/fetch/$s_!-Ahd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e4cb73-cc9e-4f48-b447-21ee921e3a94_1236x639.png 848w, https://substackcdn.com/image/fetch/$s_!-Ahd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e4cb73-cc9e-4f48-b447-21ee921e3a94_1236x639.png 1272w, https://substackcdn.com/image/fetch/$s_!-Ahd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e4cb73-cc9e-4f48-b447-21ee921e3a94_1236x639.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">I only ran a simple two thread test on H100 and A100 because testing all thread pairs takes a long time and cloud instances are expensive</figcaption></figure></div><p>Thread to thread latency is generally higher on datacenter GPUs compared to high clocked consumer GPUs like the RX 6900XT. Exchanging data across a GPU with hundreds of SMs or CUs is challenging, even in the best cases.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JHY6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9360e9d7-d7f8-44d7-8027-378d538abd75_1181x610.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JHY6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9360e9d7-d7f8-44d7-8027-378d538abd75_1181x610.png 424w, https://substackcdn.com/image/fetch/$s_!JHY6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9360e9d7-d7f8-44d7-8027-378d538abd75_1181x610.png 848w, https://substackcdn.com/image/fetch/$s_!JHY6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9360e9d7-d7f8-44d7-8027-378d538abd75_1181x610.png 1272w, https://substackcdn.com/image/fetch/$s_!JHY6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9360e9d7-d7f8-44d7-8027-378d538abd75_1181x610.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JHY6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9360e9d7-d7f8-44d7-8027-378d538abd75_1181x610.png" width="1181" height="610" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9360e9d7-d7f8-44d7-8027-378d538abd75_1181x610.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:610,&quot;width&quot;:1181,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JHY6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9360e9d7-d7f8-44d7-8027-378d538abd75_1181x610.png 424w, https://substackcdn.com/image/fetch/$s_!JHY6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9360e9d7-d7f8-44d7-8027-378d538abd75_1181x610.png 848w, https://substackcdn.com/image/fetch/$s_!JHY6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9360e9d7-d7f8-44d7-8027-378d538abd75_1181x610.png 1272w, https://substackcdn.com/image/fetch/$s_!JHY6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9360e9d7-d7f8-44d7-8027-378d538abd75_1181x610.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Atomic adds on global memory are usually handled by dedicated ALUs at a GPU-wide shared cache level. Nvidia&#8217;s B200 can sustain just short of 512 such operations per cycle across the GPU. AMD&#8217;s MI300A does poorly in this test, achieving lower throughput than the consumer oriented RX 6900XT.</p><h1>Compute Throughput</h1><p>Increased SM count gives the B200 higher compute throughput than the H100 across most vector operations. However, FP16 is an exception. Nvidia&#8217;s older GPUs could do FP16 operations at twice the FP32 rate. B200 cannot.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dF8w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506938bc-ade5-4faa-b9b1-d72ee71cd767_1600x1146.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dF8w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506938bc-ade5-4faa-b9b1-d72ee71cd767_1600x1146.png 424w, https://substackcdn.com/image/fetch/$s_!dF8w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506938bc-ade5-4faa-b9b1-d72ee71cd767_1600x1146.png 848w, https://substackcdn.com/image/fetch/$s_!dF8w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506938bc-ade5-4faa-b9b1-d72ee71cd767_1600x1146.png 1272w, https://substackcdn.com/image/fetch/$s_!dF8w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506938bc-ade5-4faa-b9b1-d72ee71cd767_1600x1146.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dF8w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506938bc-ade5-4faa-b9b1-d72ee71cd767_1600x1146.png" width="1456" height="1043" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/506938bc-ade5-4faa-b9b1-d72ee71cd767_1600x1146.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1043,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dF8w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506938bc-ade5-4faa-b9b1-d72ee71cd767_1600x1146.png 424w, https://substackcdn.com/image/fetch/$s_!dF8w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506938bc-ade5-4faa-b9b1-d72ee71cd767_1600x1146.png 848w, https://substackcdn.com/image/fetch/$s_!dF8w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506938bc-ade5-4faa-b9b1-d72ee71cd767_1600x1146.png 1272w, https://substackcdn.com/image/fetch/$s_!dF8w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506938bc-ade5-4faa-b9b1-d72ee71cd767_1600x1146.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>AMD&#8217;s MI300X can also do double rate FP16 compute. Likely, Nvidia decided to focus on handling FP16 with the Tensor Cores, or matrix multiplication units. Stepping back, the MI300X&#8217;s massive scale blows out both the H100 and B200 for most vector operations. Despite using older process nodes, AMD&#8217;s aggressive chiplet setup still holds advantages.</p><h1>Tensor Memory</h1><p>B200 targets AI applications, and no discussion would be complete without covering its machine learning optimizations. Nvidia has used Tensor Cores, or dedicated matrix multiplication units, since the Turing/Volta generation years ago. GPUs expose a SIMT programming model where developers can treat each lane as an independent thread, at least from a correctness perspective. Tensor Cores break the SIMT abstraction by requiring a certain matrix layout across a wave, or vector. Blackwell&#8217;s 5th generation Tensor Cores go one step further, and have a matrix span multiple waves in a workgroup (CTA).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DFh1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9add64a-11a1-4790-a301-202fd7a6fced_1149x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DFh1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9add64a-11a1-4790-a301-202fd7a6fced_1149x1600.png 424w, https://substackcdn.com/image/fetch/$s_!DFh1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9add64a-11a1-4790-a301-202fd7a6fced_1149x1600.png 848w, https://substackcdn.com/image/fetch/$s_!DFh1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9add64a-11a1-4790-a301-202fd7a6fced_1149x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!DFh1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9add64a-11a1-4790-a301-202fd7a6fced_1149x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DFh1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9add64a-11a1-4790-a301-202fd7a6fced_1149x1600.png" width="1149" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c9add64a-11a1-4790-a301-202fd7a6fced_1149x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1149,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DFh1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9add64a-11a1-4790-a301-202fd7a6fced_1149x1600.png 424w, https://substackcdn.com/image/fetch/$s_!DFh1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9add64a-11a1-4790-a301-202fd7a6fced_1149x1600.png 848w, https://substackcdn.com/image/fetch/$s_!DFh1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9add64a-11a1-4790-a301-202fd7a6fced_1149x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!DFh1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9add64a-11a1-4790-a301-202fd7a6fced_1149x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Diagram of a B200 Streaming Multiprocessor from Nvidia&#8217;s blog post. Note the TMEM blocks</figcaption></figure></div><p>Blackwell also introduces Tensor Memory, or TMEM. TMEM acts like a register file dedicated to the Tensor Cores. Developers can store matrix data in TMEM, and Blackwell&#8217;s workgroup level matrix multiplication instructions use TMEM rather than the register file. TMEM is organized as 512 columns by 128 rows with 32-bit cells. Each wave can only access 32 TMEM rows, determined by its wave index. That implies each SM sub-partition has a 512 column by 32 row TMEM partition. A &#8220;TensorCore Collector Buffer&#8221; can take advantage of matrix data reuse, taking the role of a register reuse cache for TMEM.</p><p>TMEM therefore works like the accumulator register file (Acc VGPRs) on AMD&#8217;s CDNA architecture. CDNA&#8217;s MFMA matrix instructions similarly operate on data in the Acc VGPRs, though MFMA can also take source matrices from regular VGPRs. On Blackwell, only the older wave-level matrix multiplication instructions take regular register inputs. Both TMEM and CDNA&#8217;s Acc VGPRs have 64 KB of capacity, giving both architectures a 64+64 KB split register file per execution unit partition. The regular vector execution units cannot take inputs from TMEM on Nvidia, or the Acc VGPRs on AMD.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SkzK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3654aa7-cb5c-4c35-82e8-7a2d7ca9c54e_1143x805.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SkzK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3654aa7-cb5c-4c35-82e8-7a2d7ca9c54e_1143x805.png 424w, https://substackcdn.com/image/fetch/$s_!SkzK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3654aa7-cb5c-4c35-82e8-7a2d7ca9c54e_1143x805.png 848w, https://substackcdn.com/image/fetch/$s_!SkzK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3654aa7-cb5c-4c35-82e8-7a2d7ca9c54e_1143x805.png 1272w, https://substackcdn.com/image/fetch/$s_!SkzK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3654aa7-cb5c-4c35-82e8-7a2d7ca9c54e_1143x805.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SkzK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3654aa7-cb5c-4c35-82e8-7a2d7ca9c54e_1143x805.png" width="1143" height="805" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e3654aa7-cb5c-4c35-82e8-7a2d7ca9c54e_1143x805.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:805,&quot;width&quot;:1143,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SkzK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3654aa7-cb5c-4c35-82e8-7a2d7ca9c54e_1143x805.png 424w, https://substackcdn.com/image/fetch/$s_!SkzK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3654aa7-cb5c-4c35-82e8-7a2d7ca9c54e_1143x805.png 848w, https://substackcdn.com/image/fetch/$s_!SkzK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3654aa7-cb5c-4c35-82e8-7a2d7ca9c54e_1143x805.png 1272w, https://substackcdn.com/image/fetch/$s_!SkzK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3654aa7-cb5c-4c35-82e8-7a2d7ca9c54e_1143x805.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">AMD CDNA&#8217;s split register file layout. Accumulation VGPRs take a similar role to TMEM, while the miSIMD block is analogous to a Tensor Core</figcaption></figure></div><p>While Blackwell&#8217;s TMEM and CDNA&#8217;s Acc VGPRs have similar high level goals, TMEM is a more capable and mature implementation of the split register file idea. CDNA had to allocate the same number of Acc and regular VGPRs for each wave. Doing so likely simplified bookkeeping, but creates an inflexible arrangement where mixing matrix and non-matrix waves would make inefficient use of register file capacity. TMEM in contrast uses a dynamic allocation scheme similar in principle to dynamic VGPR allocation on AMD&#8217;s RDNA4. Each wave starts with no TMEM allocated, and can allocate 32 to 512 columns in powers of two. All rows are allocated at the same time, and waves must explicitly release allocated TMEM before exiting. TMEM can also be loaded from Shared Memory or the regular register file, while CDNA&#8217;s Acc VGPRs can only be loaded through regular VGPRs. Finally, TMEM can optionally &#8220;decompress&#8221; 4 or 6-bit data types to 8 bits as data is loaded in.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8btM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59d1c549-6587-4857-8c0c-817176fba372_1600x206.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8btM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59d1c549-6587-4857-8c0c-817176fba372_1600x206.png 424w, https://substackcdn.com/image/fetch/$s_!8btM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59d1c549-6587-4857-8c0c-817176fba372_1600x206.png 848w, https://substackcdn.com/image/fetch/$s_!8btM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59d1c549-6587-4857-8c0c-817176fba372_1600x206.png 1272w, https://substackcdn.com/image/fetch/$s_!8btM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59d1c549-6587-4857-8c0c-817176fba372_1600x206.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8btM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59d1c549-6587-4857-8c0c-817176fba372_1600x206.png" width="1456" height="187" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/59d1c549-6587-4857-8c0c-817176fba372_1600x206.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:187,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8btM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59d1c549-6587-4857-8c0c-817176fba372_1600x206.png 424w, https://substackcdn.com/image/fetch/$s_!8btM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59d1c549-6587-4857-8c0c-817176fba372_1600x206.png 848w, https://substackcdn.com/image/fetch/$s_!8btM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59d1c549-6587-4857-8c0c-817176fba372_1600x206.png 1272w, https://substackcdn.com/image/fetch/$s_!8btM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59d1c549-6587-4857-8c0c-817176fba372_1600x206.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">How Shared Memory can provide enough bandwidth to feed all four execution unit partitions is beyond me</figcaption></figure></div><p>Compared to prior Nvidia generations, adding TMEM helps reduce capacity and bandwidth pressure on the regular register file. Introducing TMEM is likely easier than expanding the regular register file. Blackwell&#8217;s CTA-level matrix instructions can sustain 1024 16-bit MAC operations per cycle, per partition. Because one matrix input always comes from Shared Memory, TMEM only has to read one row and accumulate into another row every cycle. The regular vector registers would have to sustain three reads and one write per cycle for FMA instructions. Finally, TMEM doesn&#8217;t have to be wired to the vector units. All of that lets Blackwell act like it has a larger register file for AI applications, while allowing hardware simplifications. Nvidia has used 64 KB register files since the Kepler architecture from 2012, so a register file capacity increase feels overdue. TMEM delivers that in a way.</p><p>On AMD&#8217;s side, CDNA2 abandoned dedicated Acc VGPRs and merged all of the VGPRs into a unified 128 KB pool. Going for a larger unified pool of registers can benefit a wider range of applications, at the cost of not allowing certain hardware simplifications.</p><h1>Some Light Benchmarking</h1><p>Datacenter GPUs traditionally have strong FP64 performance, and that continues to be the case for B200. Basic FP64 operations execute at half the FP32 rate, making it much faster than consumer GPUs. In a self penned benchmark, B200 continues to do well compared to both consumer GPUs and H100. However, the MI300X&#8217;s massive size shows through even though it&#8217;s an outgoing GPU.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EG4a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3675a3ab-8c64-4a46-9cca-79314971386a_1360x842.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EG4a!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3675a3ab-8c64-4a46-9cca-79314971386a_1360x842.png 424w, https://substackcdn.com/image/fetch/$s_!EG4a!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3675a3ab-8c64-4a46-9cca-79314971386a_1360x842.png 848w, https://substackcdn.com/image/fetch/$s_!EG4a!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3675a3ab-8c64-4a46-9cca-79314971386a_1360x842.png 1272w, https://substackcdn.com/image/fetch/$s_!EG4a!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3675a3ab-8c64-4a46-9cca-79314971386a_1360x842.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EG4a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3675a3ab-8c64-4a46-9cca-79314971386a_1360x842.png" width="1360" height="842" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3675a3ab-8c64-4a46-9cca-79314971386a_1360x842.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:842,&quot;width&quot;:1360,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EG4a!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3675a3ab-8c64-4a46-9cca-79314971386a_1360x842.png 424w, https://substackcdn.com/image/fetch/$s_!EG4a!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3675a3ab-8c64-4a46-9cca-79314971386a_1360x842.png 848w, https://substackcdn.com/image/fetch/$s_!EG4a!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3675a3ab-8c64-4a46-9cca-79314971386a_1360x842.png 1272w, https://substackcdn.com/image/fetch/$s_!EG4a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3675a3ab-8c64-4a46-9cca-79314971386a_1360x842.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In the workload above, I take a 2360x2250 FITS file with column density values and output gravitational potential values with the same dimensions. The data footprint is therefore 85 MB. Even without performance counter data, it&#8217;s safe to assume it fits in the last level cache on both MI300X and B200.</p><p>FluidX3D is a different matter. Its benchmark uses a 256x256x256 cell configuration and 93 bytes per cell in FP32 mode, for a 1.5 GB memory footprint. Its access patterns aren&#8217;t cache friendly, based on testing with performance counters on Strix Halo. FluidX3D plays well into B200&#8217;s VRAM bandwidth advantage, and the B200 now pulls ahead of the MI300X.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6VCy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F595807dd-ba6c-4ead-bf6d-c80ad4a22137_1408x812.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6VCy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F595807dd-ba6c-4ead-bf6d-c80ad4a22137_1408x812.png 424w, https://substackcdn.com/image/fetch/$s_!6VCy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F595807dd-ba6c-4ead-bf6d-c80ad4a22137_1408x812.png 848w, https://substackcdn.com/image/fetch/$s_!6VCy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F595807dd-ba6c-4ead-bf6d-c80ad4a22137_1408x812.png 1272w, https://substackcdn.com/image/fetch/$s_!6VCy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F595807dd-ba6c-4ead-bf6d-c80ad4a22137_1408x812.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6VCy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F595807dd-ba6c-4ead-bf6d-c80ad4a22137_1408x812.png" width="1408" height="812" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/595807dd-ba6c-4ead-bf6d-c80ad4a22137_1408x812.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:812,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6VCy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F595807dd-ba6c-4ead-bf6d-c80ad4a22137_1408x812.png 424w, https://substackcdn.com/image/fetch/$s_!6VCy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F595807dd-ba6c-4ead-bf6d-c80ad4a22137_1408x812.png 848w, https://substackcdn.com/image/fetch/$s_!6VCy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F595807dd-ba6c-4ead-bf6d-c80ad4a22137_1408x812.png 1272w, https://substackcdn.com/image/fetch/$s_!6VCy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F595807dd-ba6c-4ead-bf6d-c80ad4a22137_1408x812.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>FluidX3D can also use 16-bit floating point formats for storage, reducing memory capacity and bandwidth requirements. Computations still use FP32 and format conversion costs extra compute, so the FP16 formats result in a higher compute to bandwidth ratio. That typically improves performance because FluidX3D is so bandwidth bound. When using IEEE FP16 for storage, AMD&#8217;s MI300A catches up slightly but leaves the B200 ahead by a large margin.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bdEy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7444b5-397a-42ce-906b-61c6c383b136_929x547.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bdEy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7444b5-397a-42ce-906b-61c6c383b136_929x547.png 424w, https://substackcdn.com/image/fetch/$s_!bdEy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7444b5-397a-42ce-906b-61c6c383b136_929x547.png 848w, https://substackcdn.com/image/fetch/$s_!bdEy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7444b5-397a-42ce-906b-61c6c383b136_929x547.png 1272w, https://substackcdn.com/image/fetch/$s_!bdEy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7444b5-397a-42ce-906b-61c6c383b136_929x547.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bdEy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7444b5-397a-42ce-906b-61c6c383b136_929x547.png" width="929" height="547" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ce7444b5-397a-42ce-906b-61c6c383b136_929x547.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:547,&quot;width&quot;:929,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bdEy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7444b5-397a-42ce-906b-61c6c383b136_929x547.png 424w, https://substackcdn.com/image/fetch/$s_!bdEy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7444b5-397a-42ce-906b-61c6c383b136_929x547.png 848w, https://substackcdn.com/image/fetch/$s_!bdEy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7444b5-397a-42ce-906b-61c6c383b136_929x547.png 1272w, https://substackcdn.com/image/fetch/$s_!bdEy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce7444b5-397a-42ce-906b-61c6c383b136_929x547.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Another FP16C format reduces the accuracy penalty associated with using 16 bit storage formats. It&#8217;s a custom floating point format without hardware support, which further drives up the compute to bandwidth ratio.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zaTh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1322807d-ae65-4d9d-bcce-d30df84cb152_946x542.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zaTh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1322807d-ae65-4d9d-bcce-d30df84cb152_946x542.png 424w, https://substackcdn.com/image/fetch/$s_!zaTh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1322807d-ae65-4d9d-bcce-d30df84cb152_946x542.png 848w, https://substackcdn.com/image/fetch/$s_!zaTh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1322807d-ae65-4d9d-bcce-d30df84cb152_946x542.png 1272w, https://substackcdn.com/image/fetch/$s_!zaTh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1322807d-ae65-4d9d-bcce-d30df84cb152_946x542.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zaTh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1322807d-ae65-4d9d-bcce-d30df84cb152_946x542.png" width="946" height="542" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1322807d-ae65-4d9d-bcce-d30df84cb152_946x542.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:542,&quot;width&quot;:946,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zaTh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1322807d-ae65-4d9d-bcce-d30df84cb152_946x542.png 424w, https://substackcdn.com/image/fetch/$s_!zaTh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1322807d-ae65-4d9d-bcce-d30df84cb152_946x542.png 848w, https://substackcdn.com/image/fetch/$s_!zaTh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1322807d-ae65-4d9d-bcce-d30df84cb152_946x542.png 1272w, https://substackcdn.com/image/fetch/$s_!zaTh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1322807d-ae65-4d9d-bcce-d30df84cb152_946x542.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>With compute front and center again, AMD&#8217;s MI300A pulls ahead. The B200 doesn&#8217;t do badly, but it can&#8217;t compete with the massive compute throughput that AMD&#8217;s huge chiplet GPU provides.</p><h1>Teething Issues</h1><p>We encountered three GPU hangs over several weeks of testing. The issue manifested with a GPU process getting stuck. Then, any process trying to use any of the system&#8217;s eight GPUs would also hang. None of the hung processes could be terminated, even with SIGKILL. Attaching GDB to one of them would cause GDB to freeze as well. The system remained responsive for CPU-only applications, but only a reboot restored GPU functionality. nvidia-smi would also get stuck. Kernel messages indicated the Nvidia unified memory kernel module, or nvidia_uvm, took a lock with preemption disabled.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7xY2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33291205-ceb8-456e-8462-24a6225c2791_866x739.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7xY2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33291205-ceb8-456e-8462-24a6225c2791_866x739.png 424w, https://substackcdn.com/image/fetch/$s_!7xY2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33291205-ceb8-456e-8462-24a6225c2791_866x739.png 848w, https://substackcdn.com/image/fetch/$s_!7xY2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33291205-ceb8-456e-8462-24a6225c2791_866x739.png 1272w, https://substackcdn.com/image/fetch/$s_!7xY2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33291205-ceb8-456e-8462-24a6225c2791_866x739.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7xY2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33291205-ceb8-456e-8462-24a6225c2791_866x739.png" width="866" height="739" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/33291205-ceb8-456e-8462-24a6225c2791_866x739.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:739,&quot;width&quot;:866,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7xY2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33291205-ceb8-456e-8462-24a6225c2791_866x739.png 424w, https://substackcdn.com/image/fetch/$s_!7xY2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33291205-ceb8-456e-8462-24a6225c2791_866x739.png 848w, https://substackcdn.com/image/fetch/$s_!7xY2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33291205-ceb8-456e-8462-24a6225c2791_866x739.png 1272w, https://substackcdn.com/image/fetch/$s_!7xY2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33291205-ceb8-456e-8462-24a6225c2791_866x739.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The stack trace suggests Nvidia might be trying to free allocated virtual memory, possibly on the GPU. Taking a lock makes sense because Nvidia probably doesn&#8217;t want other threads accessing a page freelist while it&#8217;s being modified. Why it never leaves the critical section is anyone&#8217;s guess. Perhaps it makes a request to the GPU and never gets a response. Or perhaps it&#8217;s purely a software deadlock bug on the host side.</p><p><code># nvidia-smi -r</code></p><p><code>The following GPUs could not be reset:</code></p><p><code>  GPU 00000000:03:00.0: In use by another client</code></p><p><code>  GPU 00000000:04:00.0: In use by another client</code></p><p><code>  GPU 00000000:05:00.0: In use by another client</code></p><p><code>  GPU 00000000:06:00.0: In use by another client</code></p><p><code>  GPU 00000000:07:00.0: In use by another client</code></p><p><code>  GPU 00000000:08:00.0: In use by another client</code></p><p><code>  GPU 00000000:09:00.0: In use by another client</code></p><p><code>  GPU 00000000:0A:00.0: In use by another client</code></p><p>Hangs like this aren&#8217;t surprising. Hardware acceleration adds complexity, which translates to more failure points. But modern hardware stacks have evolved to handle GPU issues without a reboot. Windows&#8217;s Timeout Detection and Recovery (TDR) mechanism for example can ask the driver to reset a hung GPU. nvidia-smi does offer a reset option. But frustratingly, it doesn&#8217;t work if the GPUs are in use. That defeats the purpose of offering a reset option. I expect Nvidia to iron out these issues over time, especially if the root cause lies purely with software or firmware. But hitting this issue several times within such a short timespan isn&#8217;t a good sign, and it would be good if Nvidia could offer ways to recover from such issues without a system reboot.</p><h1>Final Words</h1><p>Nvidia has made the chiplet jump without any major performance concessions. B200 is a straightforward successor to H100 and A100, and software doesn&#8217;t have to care about the multi-die setup. Nvidia&#8217;s multi-die strategy is conservative next to the 12-die monster that is AMD&#8217;s MI300X. MI300X retains some surprising advantages over Nvidia&#8217;s latest GPU, despite being an outgoing product. AMD&#8217;s incoming datacenter GPUs will likely retain those advantages, while catching up in areas that the B200 has pulled ahead in. The MI350X for example will bring VRAM bandwidth to 8 TB/s.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tZ1K!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d54c39b-b7ab-4a1b-b347-ec61b815b5da_1600x907.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tZ1K!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d54c39b-b7ab-4a1b-b347-ec61b815b5da_1600x907.png 424w, https://substackcdn.com/image/fetch/$s_!tZ1K!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d54c39b-b7ab-4a1b-b347-ec61b815b5da_1600x907.png 848w, https://substackcdn.com/image/fetch/$s_!tZ1K!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d54c39b-b7ab-4a1b-b347-ec61b815b5da_1600x907.png 1272w, https://substackcdn.com/image/fetch/$s_!tZ1K!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d54c39b-b7ab-4a1b-b347-ec61b815b5da_1600x907.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tZ1K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d54c39b-b7ab-4a1b-b347-ec61b815b5da_1600x907.png" width="1456" height="825" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1d54c39b-b7ab-4a1b-b347-ec61b815b5da_1600x907.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:825,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tZ1K!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d54c39b-b7ab-4a1b-b347-ec61b815b5da_1600x907.png 424w, https://substackcdn.com/image/fetch/$s_!tZ1K!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d54c39b-b7ab-4a1b-b347-ec61b815b5da_1600x907.png 848w, https://substackcdn.com/image/fetch/$s_!tZ1K!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d54c39b-b7ab-4a1b-b347-ec61b815b5da_1600x907.png 1272w, https://substackcdn.com/image/fetch/$s_!tZ1K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d54c39b-b7ab-4a1b-b347-ec61b815b5da_1600x907.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">From Nvidia&#8217;s presentation at Hot Chips 2024</figcaption></figure></div><p>But Nvidia&#8217;s conservative approach is understandable. Their strength doesn&#8217;t lie in being able to build the biggest, baddest GPU around the block. Rather, Nvidia benefits from their CUDA software ecosystem. GPU compute code is typically written for Nvidia GPUs first. Non-Nvidia GPUs are an afterthought, if they&#8217;re thought about at all. Hardware isn&#8217;t useful without software to run on it, and quick ports won&#8217;t benefit from the same degree of optimization. Nvidia doesn&#8217;t need to match the MI300X or its successors in every area. They just have to be good enough to prevent people from filling in the metaphorical CUDA moat. Trying to build a monster to match the MI300X is risky, and Nvidia has every reason to avoid risk when they have a dominant market position.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nEcW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d29c6c-b802-4601-ac71-d50ef018e684_1536x964.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nEcW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d29c6c-b802-4601-ac71-d50ef018e684_1536x964.png 424w, https://substackcdn.com/image/fetch/$s_!nEcW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d29c6c-b802-4601-ac71-d50ef018e684_1536x964.png 848w, https://substackcdn.com/image/fetch/$s_!nEcW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d29c6c-b802-4601-ac71-d50ef018e684_1536x964.png 1272w, https://substackcdn.com/image/fetch/$s_!nEcW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d29c6c-b802-4601-ac71-d50ef018e684_1536x964.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nEcW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d29c6c-b802-4601-ac71-d50ef018e684_1536x964.png" width="1456" height="914" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/73d29c6c-b802-4601-ac71-d50ef018e684_1536x964.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:914,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nEcW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d29c6c-b802-4601-ac71-d50ef018e684_1536x964.png 424w, https://substackcdn.com/image/fetch/$s_!nEcW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d29c6c-b802-4601-ac71-d50ef018e684_1536x964.png 848w, https://substackcdn.com/image/fetch/$s_!nEcW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d29c6c-b802-4601-ac71-d50ef018e684_1536x964.png 1272w, https://substackcdn.com/image/fetch/$s_!nEcW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d29c6c-b802-4601-ac71-d50ef018e684_1536x964.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">From Nvidia&#8217;s blog post</figcaption></figure></div><p>Still, Nvidia&#8217;s strategy leaves AMD with an opportunity. AMD has everything to gain from being ambitious and taking risks. GPUs like the MI300X are impressive showcases of hardware engineering, and demonstrate AMD&#8217;s ability to take on ambitious design goals. It&#8217;ll be interesting to see whether Nvidia&#8217;s conservative hardware strategy and software strength will result in its continued dominance.</p><p>Again, a massive thank you goes to <a href="https://verda.com/?utm_source=chips-and-cheese&amp;utm_medium=referral&amp;utm_campaign=sponsorships&amp;utm_content=chips-and-cheese-b200">Verda (formerly DataCrunch)</a> without which this article would not be possible! If you want to try out the B200 or test other NVIDIA GPUs yourself, Verda is offering free trial credits specifically for Chips &amp; Cheese readers. Simply enter the code &#8220;CHEESE-B200&#8221; and redeem 50$ worth of credit and follow these <a href="https://docs.verda.com/resources/obtaining-free-credits/how-to-redeem-credits">instructions on how to redeem coupon credits</a>.</p><p>If you like the content then consider heading over to the <a href="https://www.patreon.com/ChipsandCheese">Patreon</a> or <a href="https://www.paypal.com/donate/?hosted_button_id=4EMPH66SBGVSQ">PayPal</a> if you want to toss a few bucks to Chips and Cheese. Also consider joining the <a href="https://discord.gg/TwVnRhxgY2">Discord</a>.</p><h2><strong>References</strong></h2><ol><li><p><a href="https://docs.nvidia.com/cuda/blackwell-tuning-guide/index.html">Blackwell Tuning Guide, indicating 126 MB of L2 capacity</a></p></li><li><p><a href="https://developer.nvidia.com/blog/inside-nvidia-blackwell-ultra-the-chip-powering-the-ai-factory-era/">Inside Blackwell (Nvidia site)</a></p></li><li><p><a href="https://docs.nvidia.com/cuda/parallel-thread-execution/#tcgen05-memory-alloc-manage-instructions">Tensor Memory Documentation</a></p></li><li><p><a href="https://resources.nvidia.com/en-us-hopper-architecture/nvidia-h100-tensor-c">Hopper Whitepaper</a></p></li><li><p><a href="https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/instinct-mi100-cdna1-shader-instruction-set-architecture.pdf">AMD CDNA (MI100) ISA Manual</a></p></li></ol>]]></content:encoded></item><item><title><![CDATA[SC25: Estimating AMD’s Upcoming MI430X’s FP64 and the Discovery Supercomputer]]></title><description><![CDATA[Hello, you fine Internet folks,]]></description><link>https://chipsandcheese.com/p/sc25-estimating-amds-upcoming-mi430xs</link><guid isPermaLink="false">https://chipsandcheese.com/p/sc25-estimating-amds-upcoming-mi430xs</guid><dc:creator><![CDATA[George Cozma]]></dc:creator><pubDate>Wed, 10 Dec 2025 16:08:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!fyww!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767a296f-70b4-4732-a7dd-c1cc8f07aaa4_752x437.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello, you fine Internet folks,</p><p>At Supercomputing 2025, EuroHPC, AMD, and Eviden announced that the second Exascale system in Europe was going to be called Alice Recoque. Powered by Eviden&#8217;s BullSequena XH3500 platform using AMD&#8217;s upcoming Instinct MI430X as most of the compute for the supercomputer. Alice Recoque is going &#8220;exceed a sustained performance of 1 Exaflop/s HPL performance&#8230; with less than 15 Megawatts of electrical power.&#8221;</p><p>Now, AMD hasn&#8217;t said what the FLOPs of MI430X is but there is likely enough to speculate as to the potential FP64 FLOPs of MI430X as a thought experiment.</p><p>In terms of what we know:</p><p>-          Alice Recoque is going to be made of 94 XH3500 racks</p><p>-          Alice Recoque uses less than 15 Megawatts of power in actual usage, but the facility can provide 24 Megawatts of power and 20 Megawatts of cooling</p><p>-          BullSequena XH3500 has a cap of 264 Kilowatts per rack</p><p>-          BullSequena XH3500 is 38U per rack</p><p>-          BullSequena XH3500&#8217;s Sleds are 2U for Switch Sleds and 8 Compute Sleds and 1U for 4 Compute Sleds</p><p>What we also know is that HPE rates a single MI430X between 2000 watts and 2500 watts of power draw.</p><p>Alice Recoque has 3 different possible energy consumption numbers:</p><ol><li><p>Less than 15 Megawatts</p></li><li><p>The facility thermal limit of 20 Megawatts of cooling</p></li><li><p>Facility power limit of 24 Megawatts</p></li></ol><p>This allows for 3 possible configurations of the XH3500 racks:</p><ol><li><p> For the sub 15 Megawatt system, 16 Compute nodes each with 1 Venice CPU and 4 MI430X along with 8 switch blades</p></li><li><p>For the 20 Megawatt system, 18 Compute nodes each with 1 Venice CPU and 4 MI430X along with 9 switch blades</p></li><li><p>For the 24 Megawatt system, 20 Compute nodes each with 1 Venice CPU and 4 MI430X along with 8 switch blades</p></li></ol><p>For this calculation, I am going to run with the middle configuration of 18 Compute nodes and 9 switch nodes per rack along with a maximum sustained energy consumption of approximately 20 Megawatts for the full Alice Recoque Supercomputer. This puts the maximum power consumption of a single XH3500 rack at about 200 Kilowatts. As such I am going to run with the assumption that each of the Compute nodes can pull approximately 10.5 Kilowatts and as a result each of the MI430Xs will be estimated to have a TDP of about 2300 watts. This gives 1300 watts for the remainder of the Compute blade including the Venice CPU which has a TDP of up to 600 watts.</p><p>With 94 racks, 18 Compute nodes, and 4 MI430Xs per blade, this gives a total of 6768 GPUs for the whole Alice Recoque Supercomputer. Assuming the &#8220;exceed a sustained performance of 1 Exaflop/s HPL performance&#8221; is the HPL Rmax value of Alice Recoque and that the Rmax to Rpeak ratio is approximately 70% (similar to Frontier&#8217;s ratio), this puts Alice Recoque&#8217;s HPL Rpeak at a minimum of 1.43 Exaflops. Now dividing the HPL Rpeak number by the number of MI430Xs get you a FP64 Vector FLOPs number for MI430X of approximately 211 Teraflops.</p><p>MI430X pairs this approximate 211 Teraflops of FP64 Vector with 432 Gigabytes of HBM4 providing the MI430X with 19.6 TB/s of memory bandwidth; this is unsurprisingly the same memory subsystem as MI450X. An important metric in HPC is the amount of compute for the given memory bandwidth which is usually denoted as FLOPs per Byte or F/B. For many HPC tasks, the arithmetic intensity for those problems is low, a lower FLOPs per Byte number is preferred due to the majority of HPC code being memory bandwidth bound.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fyww!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767a296f-70b4-4732-a7dd-c1cc8f07aaa4_752x437.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fyww!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767a296f-70b4-4732-a7dd-c1cc8f07aaa4_752x437.png 424w, https://substackcdn.com/image/fetch/$s_!fyww!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767a296f-70b4-4732-a7dd-c1cc8f07aaa4_752x437.png 848w, https://substackcdn.com/image/fetch/$s_!fyww!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767a296f-70b4-4732-a7dd-c1cc8f07aaa4_752x437.png 1272w, https://substackcdn.com/image/fetch/$s_!fyww!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767a296f-70b4-4732-a7dd-c1cc8f07aaa4_752x437.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fyww!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767a296f-70b4-4732-a7dd-c1cc8f07aaa4_752x437.png" width="752" height="437" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/767a296f-70b4-4732-a7dd-c1cc8f07aaa4_752x437.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:437,&quot;width&quot;:752,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fyww!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767a296f-70b4-4732-a7dd-c1cc8f07aaa4_752x437.png 424w, https://substackcdn.com/image/fetch/$s_!fyww!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767a296f-70b4-4732-a7dd-c1cc8f07aaa4_752x437.png 848w, https://substackcdn.com/image/fetch/$s_!fyww!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767a296f-70b4-4732-a7dd-c1cc8f07aaa4_752x437.png 1272w, https://substackcdn.com/image/fetch/$s_!fyww!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767a296f-70b4-4732-a7dd-c1cc8f07aaa4_752x437.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>So assuming 211TF of FP64 Vector for MI430X, this puts MI430X ahead of both of AMD&#8217;s prior HPC focused accelerators in terms of FLOPs per Byte. However, MI430X does still have a higher FP64 FLOPs per Byte ratio compared to Nvidia&#8217;s offerings. But MI430X does have two aces up its sleeve compared to Nvidia&#8217;s latest and upcoming accelerators.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!J50u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe9ce509-5e6b-439d-89f4-2cc1c5bd5942_752x452.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!J50u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe9ce509-5e6b-439d-89f4-2cc1c5bd5942_752x452.png 424w, https://substackcdn.com/image/fetch/$s_!J50u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe9ce509-5e6b-439d-89f4-2cc1c5bd5942_752x452.png 848w, https://substackcdn.com/image/fetch/$s_!J50u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe9ce509-5e6b-439d-89f4-2cc1c5bd5942_752x452.png 1272w, https://substackcdn.com/image/fetch/$s_!J50u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe9ce509-5e6b-439d-89f4-2cc1c5bd5942_752x452.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!J50u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe9ce509-5e6b-439d-89f4-2cc1c5bd5942_752x452.png" width="752" height="452" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fe9ce509-5e6b-439d-89f4-2cc1c5bd5942_752x452.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:452,&quot;width&quot;:752,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!J50u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe9ce509-5e6b-439d-89f4-2cc1c5bd5942_752x452.png 424w, https://substackcdn.com/image/fetch/$s_!J50u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe9ce509-5e6b-439d-89f4-2cc1c5bd5942_752x452.png 848w, https://substackcdn.com/image/fetch/$s_!J50u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe9ce509-5e6b-439d-89f4-2cc1c5bd5942_752x452.png 1272w, https://substackcdn.com/image/fetch/$s_!J50u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe9ce509-5e6b-439d-89f4-2cc1c5bd5942_752x452.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The first ace is that MI430X has significantly more bandwidth compared to AMD&#8217;s prior offering along with having even more memory bandwidth than Nvidia&#8217;s upcoming Rubin accelerator which is important due to the number of memory bandwidth bound tasks in HPC.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YTOM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122831ee-7c2a-4794-a872-a95042cefc05_752x452.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YTOM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122831ee-7c2a-4794-a872-a95042cefc05_752x452.png 424w, https://substackcdn.com/image/fetch/$s_!YTOM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122831ee-7c2a-4794-a872-a95042cefc05_752x452.png 848w, https://substackcdn.com/image/fetch/$s_!YTOM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122831ee-7c2a-4794-a872-a95042cefc05_752x452.png 1272w, https://substackcdn.com/image/fetch/$s_!YTOM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122831ee-7c2a-4794-a872-a95042cefc05_752x452.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YTOM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122831ee-7c2a-4794-a872-a95042cefc05_752x452.png" width="752" height="452" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/122831ee-7c2a-4794-a872-a95042cefc05_752x452.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:452,&quot;width&quot;:752,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YTOM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122831ee-7c2a-4794-a872-a95042cefc05_752x452.png 424w, https://substackcdn.com/image/fetch/$s_!YTOM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122831ee-7c2a-4794-a872-a95042cefc05_752x452.png 848w, https://substackcdn.com/image/fetch/$s_!YTOM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122831ee-7c2a-4794-a872-a95042cefc05_752x452.png 1272w, https://substackcdn.com/image/fetch/$s_!YTOM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122831ee-7c2a-4794-a872-a95042cefc05_752x452.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The second ace is that MI430X has nearly 3.5 times the HBM capacity of AMD&#8217;s prior accelerators and 50% more HBM capacity than Nvidia&#8217;s upcoming Rubin which means that a larger dataset can be fit on a single MI430X.</p><h3>The Upcoming Discovery Supercomputer at ORNL</h3><p>Just prior to Supercomputing 2025, AMD, HPE, and the Department of Energy announced the replacement of Frontier, codenamed Discovery, which is due to be delivered in 2028 and turned on in 2029 at the Oak Ridge National Laboratory in Oak Ridge, Tennessee.</p><p>Again, we don&#8217;t know much about Discovery other than it is going to use HPE&#8217;s new GX5000 platform and that it will use AMD&#8217;s Venice CPUs and MI430X accelerators.</p><p>Speaking of HPE&#8217;s GX5000 platform, it has 3 initial compute blade configurations:</p><ol><li><p>GX250: The GX250 blade has 8 Venice CPUs with up to 40 blades per rack for a total of up to 320 Venice CPUs per GX5000 rack</p></li><li><p>GX350a: The GX350a blade has 1 Venice CPU and 4 MI430X Accelerators with up to 28 blades per rack for a total of 28 Venice CPUs and 112 MI430X Accelerators per GX5000 rack</p></li><li><p>GX440n: The GX440n blade had 4 Nvidia Vera CPUs and 8 Rubin Accelerators per blade with up to 24 blades per rack for a total of 96 Vera CPUs and 192 Rubin Accelerators per GX5000 rack</p></li></ol><p>The current GX5000 platform can supply up to 400 kilowatts per rack which is likely for the full GX440n configuration where the 192 Rubins, each rated for 1800 watts, alone pull about 350 kilowatts let alone the CPUs, memory, etc. The GX5000 is also about half the floor area compared to the prior generation of EX4000 (1.08 m^2 vs 2.055 m^2). This means that you can fit 2 GX5000 racks into the area of a single EX4000 rack.</p><p>For Discovery, the configuration that we are interested in is the GX350a configuration of the GX5000. Now, what hasn&#8217;t been announced yet is the HPL speedup goal but it is expected that Discovery will deliver &#8220;three to five times more computational throughput for benchmark and scientific applications than Frontier.&#8221;</p><p>Due to the vagueness of what exactly the three to five times performance increase refers to, whether this is three to five times faster in actual HPC workloads or if this is three to five times faster in LINPACK, I am going to propose 2 different configurations for Discovery:</p><ol><li><p>A configuration that can fit into the current power and floor footprint of Frontier&#8217;s building</p></li></ol><ol start="2"><li><p>A configuration that is approximately 4 times the Rpeak of Frontier at the time of the RFP was closed on August 30th, 2024 which was approximately 1.714 Exaflops</p></li></ol><p>Now for the first configuration, Frontier has 74 EX4000 Racks for its compute system which means that approximately 140 GX5000 racks can fit into that floor space. This means that Discovery would have a total of 3,920 Venice CPUs and 15,680 MI430Xs Accelerators for a rough total of 3.3 Exaflops of HPL Rpeak.</p><p>This Rpeak of 3 Exaflops would have a rough power draw of about 35 Megawatts assuming each of the GX5000 racks uses approximately 250 Kilowatts per rack. While this is 10 Megawatts more than what Frontier draws in HPL, the building for Frontier is designed to be able to distribute up to 40 Megawatts so this 140 rack configuration does just fit into the power and floor space of Frontier&#8217;s building. However, if you decrease the per-rack power to 160 Kilowatts then Discovery could easily fit into Frontier&#8217;s power footprint.</p><p>For the second configuration, I am taking the top end estimate of Discovery as five times faster than Frontier and running with a configuration that is approximately five times the Rpeak of Frontier ca. August 2024 which would make Discovery a ~8.5 Exaflop system. This would need approximately 360 GX5000 racks for a total of 10,080 Venice CPUs and 40,320 MI430X Accelerators.</p><p>This configuration would likely need a facilities upgrade for both power and floor space to accommodate the system. For the floor space, this configuration may require over 1,600 m^2. And for the power, assuming 250 Kilowatts per GX5000 rack, this configuration would consume over 90 Megawatts; however, toning down the per-rack power to around 160 Kilowatts would put Discovery into the 55 to 60 Megawatt range.</p><p>The most likely configuration of Discovery is probably somewhere in-between these two configurations. However, it is rumored that China may have a Supercomputer that is able to get over 2 Exaflops of HPL Rmax at 300 megawatts with a potential companion that uses 2.5 times the power at 800 megawatts which could get over 5 Exaflops of HPL Rmax. So this may drive the Department of Energy to have chosen the largest configuration or possibly a configuration that is even larger than five times Frontier.</p><p>Regardless of what Discovery ends up being in terms of the final configuration, it is undeniably an exciting time in the World of HPC!</p><p>If you like the content then consider heading over to the <a href="https://www.patreon.com/ChipsandCheese">Patreon</a> or <a href="https://www.paypal.com/donate/?hosted_button_id=4EMPH66SBGVSQ">PayPal</a> if you want to toss a few bucks to Chips and Cheese. Also consider joining the <a href="https://discord.gg/TwVnRhxgY2">Discord</a>.</p>]]></content:encoded></item><item><title><![CDATA[Evaluating Uniform Memory Access Mode on AMD's Turin ft. Verda (formerly DataCrunch.io)]]></title><description><![CDATA[How does uniform memory access play out as interconnects get increasingly non-uniform?]]></description><link>https://chipsandcheese.com/p/evaluating-uniform-memory-access</link><guid isPermaLink="false">https://chipsandcheese.com/p/evaluating-uniform-memory-access</guid><dc:creator><![CDATA[Chester Lam]]></dc:creator><pubDate>Wed, 26 Nov 2025 20:46:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!mCZP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b11778f-12bc-412e-accb-deb2d417a214_721x318.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>NUMA, or Non-Uniform Memory Access, lets hardware expose affinity between cores and memory controllers to software. NUMA nodes traditionally aligned with socket boundaries, but modern server chips can subdivide a socket into multiple NUMA nodes. It&#8217;s a reflection of how non-uniform interconnects get as core and memory controller counts keep going up. AMD designates their NUMA modes with the NPS (Nodes Per Socket) prefix.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_IKK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc7d1b01-cd33-4573-a062-d1ceca102947_598x212.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_IKK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc7d1b01-cd33-4573-a062-d1ceca102947_598x212.png 424w, https://substackcdn.com/image/fetch/$s_!_IKK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc7d1b01-cd33-4573-a062-d1ceca102947_598x212.png 848w, https://substackcdn.com/image/fetch/$s_!_IKK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc7d1b01-cd33-4573-a062-d1ceca102947_598x212.png 1272w, https://substackcdn.com/image/fetch/$s_!_IKK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc7d1b01-cd33-4573-a062-d1ceca102947_598x212.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_IKK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc7d1b01-cd33-4573-a062-d1ceca102947_598x212.png" width="598" height="212" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dc7d1b01-cd33-4573-a062-d1ceca102947_598x212.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:212,&quot;width&quot;:598,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:39680,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/177853996?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc7d1b01-cd33-4573-a062-d1ceca102947_598x212.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_IKK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc7d1b01-cd33-4573-a062-d1ceca102947_598x212.png 424w, https://substackcdn.com/image/fetch/$s_!_IKK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc7d1b01-cd33-4573-a062-d1ceca102947_598x212.png 848w, https://substackcdn.com/image/fetch/$s_!_IKK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc7d1b01-cd33-4573-a062-d1ceca102947_598x212.png 1272w, https://substackcdn.com/image/fetch/$s_!_IKK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc7d1b01-cd33-4573-a062-d1ceca102947_598x212.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>NPS0 is a special NUMA mode that goes in the other direction. Rather than subdivide the system, NPS0 exposes a dual socket system as a single monolithic entity. It evenly distributes memory accesses across all memory controller channels, providing uniform memory access like in a desktop system. NPS0 and similar modes exist because optimizing for NUMA can be complicated and time intensive. Programmers have to specify a NUMA node for each memory allocation, and take are to minimize cross-node memory accesses. Each NUMA node only represents a fraction of system resources, so code pinned to a NUMA node will be constrained by that node&#8217;s CPU core count, memory bandwidth, and memory capacity. Effort spent getting an application to scale across NUMA nodes might be effort not spent on a software project&#8217;s other goals.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mCZP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b11778f-12bc-412e-accb-deb2d417a214_721x318.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mCZP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b11778f-12bc-412e-accb-deb2d417a214_721x318.png 424w, https://substackcdn.com/image/fetch/$s_!mCZP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b11778f-12bc-412e-accb-deb2d417a214_721x318.png 848w, https://substackcdn.com/image/fetch/$s_!mCZP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b11778f-12bc-412e-accb-deb2d417a214_721x318.png 1272w, https://substackcdn.com/image/fetch/$s_!mCZP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b11778f-12bc-412e-accb-deb2d417a214_721x318.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mCZP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b11778f-12bc-412e-accb-deb2d417a214_721x318.png" width="721" height="318" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b11778f-12bc-412e-accb-deb2d417a214_721x318.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:318,&quot;width&quot;:721,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:49033,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/177853996?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b11778f-12bc-412e-accb-deb2d417a214_721x318.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mCZP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b11778f-12bc-412e-accb-deb2d417a214_721x318.png 424w, https://substackcdn.com/image/fetch/$s_!mCZP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b11778f-12bc-412e-accb-deb2d417a214_721x318.png 848w, https://substackcdn.com/image/fetch/$s_!mCZP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b11778f-12bc-412e-accb-deb2d417a214_721x318.png 1272w, https://substackcdn.com/image/fetch/$s_!mCZP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b11778f-12bc-412e-accb-deb2d417a214_721x318.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">From AMD&#8217;s EPYC 9005 Series Architecture Overview, showing a dual socket Zen 5 (Turin) setup in NPS1 mode</figcaption></figure></div><h1>Acknowledgements</h1><p>A massive thank you goes to <a href="https://verda.com/">Verda (formerly DataCrunch)</a> for proving an instance with 2 AMD EPYC 9575Fs and 8 Nvidia B200 GPUs. <a href="https://verda.com/">Verda</a> gave us about 3 weeks with the instance to do with as we wished. While this article looks at the AMD EPYC 9575Fs, there will be upcoming coverage of the B200s found in the VM.</p><p>This system appears to be running in NPS0 mode, giving an opportunity to see how a modern server acts with 24 memory controllers providing uniform memory access.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!j8tU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c23e668-1aef-49f7-9281-eaef32d6cbe2_1644x761.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!j8tU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c23e668-1aef-49f7-9281-eaef32d6cbe2_1644x761.png 424w, https://substackcdn.com/image/fetch/$s_!j8tU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c23e668-1aef-49f7-9281-eaef32d6cbe2_1644x761.png 848w, https://substackcdn.com/image/fetch/$s_!j8tU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c23e668-1aef-49f7-9281-eaef32d6cbe2_1644x761.png 1272w, https://substackcdn.com/image/fetch/$s_!j8tU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c23e668-1aef-49f7-9281-eaef32d6cbe2_1644x761.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!j8tU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c23e668-1aef-49f7-9281-eaef32d6cbe2_1644x761.png" width="1456" height="674" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c23e668-1aef-49f7-9281-eaef32d6cbe2_1644x761.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:674,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:53211,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/177853996?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c23e668-1aef-49f7-9281-eaef32d6cbe2_1644x761.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!j8tU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c23e668-1aef-49f7-9281-eaef32d6cbe2_1644x761.png 424w, https://substackcdn.com/image/fetch/$s_!j8tU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c23e668-1aef-49f7-9281-eaef32d6cbe2_1644x761.png 848w, https://substackcdn.com/image/fetch/$s_!j8tU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c23e668-1aef-49f7-9281-eaef32d6cbe2_1644x761.png 1272w, https://substackcdn.com/image/fetch/$s_!j8tU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c23e668-1aef-49f7-9281-eaef32d6cbe2_1644x761.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A simple latency test immediately shows the cost of providing uniform memory access. DRAM latency rises to over 220 ns, giving a nearly 90 ns penalty over the EPYC 9355P running in NPS1 mode. It&#8217;s a high penalty compared to using the equivalent of NPS0 on older systems. For example, a dual socket Broadwell system has 75.8 ns of DRAM latency when each socket is treated as a NUMA node, and 104.6 ns with uniform memory access[1].</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PbE2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac990f26-af46-4543-bb95-1a523c0f350a_1347x668.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PbE2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac990f26-af46-4543-bb95-1a523c0f350a_1347x668.png 424w, https://substackcdn.com/image/fetch/$s_!PbE2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac990f26-af46-4543-bb95-1a523c0f350a_1347x668.png 848w, https://substackcdn.com/image/fetch/$s_!PbE2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac990f26-af46-4543-bb95-1a523c0f350a_1347x668.png 1272w, https://substackcdn.com/image/fetch/$s_!PbE2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac990f26-af46-4543-bb95-1a523c0f350a_1347x668.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PbE2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac990f26-af46-4543-bb95-1a523c0f350a_1347x668.png" width="1347" height="668" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ac990f26-af46-4543-bb95-1a523c0f350a_1347x668.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:668,&quot;width&quot;:1347,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25795,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/177853996?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac990f26-af46-4543-bb95-1a523c0f350a_1347x668.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PbE2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac990f26-af46-4543-bb95-1a523c0f350a_1347x668.png 424w, https://substackcdn.com/image/fetch/$s_!PbE2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac990f26-af46-4543-bb95-1a523c0f350a_1347x668.png 848w, https://substackcdn.com/image/fetch/$s_!PbE2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac990f26-af46-4543-bb95-1a523c0f350a_1347x668.png 1272w, https://substackcdn.com/image/fetch/$s_!PbE2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac990f26-af46-4543-bb95-1a523c0f350a_1347x668.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>NPS0 mode does have a bandwidth advantage from bringing twice as many memory controllers into play. But the extra bandwidth doesn&#8217;t translate to a latency advantage until bandwidth demands reach nearly 400 GB/s. The EPYC 9355P seems to suffer when a latency test thread is mixed with bandwidth heavy ones. A bandwidth test thread with just linear read patterns <a href="https://chipsandcheese.com/p/amds-epyc-9355p-inside-a-32-core">can achieve 479 GB/s in NPS1 mode</a>. However, my bandwidth test produces low values on the EPYC 9575F because not all test threads finish at the same time. I avoid this problem in the loaded memory latency test, because I have bandwidth load threads check a flag. That lets me stop all threads at approximately the same time.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nzZi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd90d10-57d4-474d-b0dc-b5e82da589a2_597x101.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nzZi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd90d10-57d4-474d-b0dc-b5e82da589a2_597x101.png 424w, https://substackcdn.com/image/fetch/$s_!nzZi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd90d10-57d4-474d-b0dc-b5e82da589a2_597x101.png 848w, https://substackcdn.com/image/fetch/$s_!nzZi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd90d10-57d4-474d-b0dc-b5e82da589a2_597x101.png 1272w, https://substackcdn.com/image/fetch/$s_!nzZi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd90d10-57d4-474d-b0dc-b5e82da589a2_597x101.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nzZi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd90d10-57d4-474d-b0dc-b5e82da589a2_597x101.png" width="597" height="101" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9bd90d10-57d4-474d-b0dc-b5e82da589a2_597x101.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:101,&quot;width&quot;:597,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:18532,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/177853996?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd90d10-57d4-474d-b0dc-b5e82da589a2_597x101.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nzZi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd90d10-57d4-474d-b0dc-b5e82da589a2_597x101.png 424w, https://substackcdn.com/image/fetch/$s_!nzZi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd90d10-57d4-474d-b0dc-b5e82da589a2_597x101.png 848w, https://substackcdn.com/image/fetch/$s_!nzZi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd90d10-57d4-474d-b0dc-b5e82da589a2_597x101.png 1272w, https://substackcdn.com/image/fetch/$s_!nzZi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bd90d10-57d4-474d-b0dc-b5e82da589a2_597x101.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Per-CCD bandwidth is barely affected by the different NPS modes. Both the EPYC 9355P and 9575F use &#8220;GMI-Wide&#8221; links for their Core Complex Dies, or CCDs. GMI-Wide provides 64B/cycle of read and write bandwidth at the Infinity Fabric clock. On both chips, each CCD enjoys more bandwidth to the system compared to standard &#8220;GMI-Narrow&#8221; configurations. For reference, a GMI-Narrow setup running at a typical desktop 2 GHz FCLK would be limited to 64 GB/s of read and 32 GB/s of write bandwidth.</p><h1>Performance: SPEC CPU2017</h1><p>Higher memory latency could lead to lower performance, especially in single threaded workloads. But the EPYC 9575F does surprisingly well in SPEC CPU2017. The EPYC 9575F runs at a higher 5 GHz clock speed, and DRAM latency is only one of many factors that affect CPU performance.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Objy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff55646c2-c10e-48b8-938c-570cd934ee32_1267x711.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Objy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff55646c2-c10e-48b8-938c-570cd934ee32_1267x711.png 424w, https://substackcdn.com/image/fetch/$s_!Objy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff55646c2-c10e-48b8-938c-570cd934ee32_1267x711.png 848w, https://substackcdn.com/image/fetch/$s_!Objy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff55646c2-c10e-48b8-938c-570cd934ee32_1267x711.png 1272w, https://substackcdn.com/image/fetch/$s_!Objy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff55646c2-c10e-48b8-938c-570cd934ee32_1267x711.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Objy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff55646c2-c10e-48b8-938c-570cd934ee32_1267x711.png" width="1267" height="711" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f55646c2-c10e-48b8-938c-570cd934ee32_1267x711.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:711,&quot;width&quot;:1267,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:33179,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/177853996?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff55646c2-c10e-48b8-938c-570cd934ee32_1267x711.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Objy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff55646c2-c10e-48b8-938c-570cd934ee32_1267x711.png 424w, https://substackcdn.com/image/fetch/$s_!Objy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff55646c2-c10e-48b8-938c-570cd934ee32_1267x711.png 848w, https://substackcdn.com/image/fetch/$s_!Objy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff55646c2-c10e-48b8-938c-570cd934ee32_1267x711.png 1272w, https://substackcdn.com/image/fetch/$s_!Objy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff55646c2-c10e-48b8-938c-570cd934ee32_1267x711.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Individual workloads show a more complex picture. The EPYC 9575F does best when workloads don&#8217;t miss cache. Then, its high 5 GHz clock speed can shine. 548.exchange2 is an example. On the other hand, workloads that hit DRAM a lot suffer in NPS0 mode. 502.gcc, 505.mcf, and 520.omnetpp see the EPYC 9575F&#8217;s higher clock speed count for nothing, and the higher clocked chip underperforms compared to 4.4 GHz setups with lower DRAM latency.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!u1n2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc2b2f7-5687-4e3b-95bc-e7da41f6bd07_1239x1016.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!u1n2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc2b2f7-5687-4e3b-95bc-e7da41f6bd07_1239x1016.png 424w, https://substackcdn.com/image/fetch/$s_!u1n2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc2b2f7-5687-4e3b-95bc-e7da41f6bd07_1239x1016.png 848w, https://substackcdn.com/image/fetch/$s_!u1n2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc2b2f7-5687-4e3b-95bc-e7da41f6bd07_1239x1016.png 1272w, https://substackcdn.com/image/fetch/$s_!u1n2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc2b2f7-5687-4e3b-95bc-e7da41f6bd07_1239x1016.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!u1n2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc2b2f7-5687-4e3b-95bc-e7da41f6bd07_1239x1016.png" width="1239" height="1016" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6dc2b2f7-5687-4e3b-95bc-e7da41f6bd07_1239x1016.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1016,&quot;width&quot;:1239,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42683,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/177853996?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc2b2f7-5687-4e3b-95bc-e7da41f6bd07_1239x1016.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!u1n2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc2b2f7-5687-4e3b-95bc-e7da41f6bd07_1239x1016.png 424w, https://substackcdn.com/image/fetch/$s_!u1n2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc2b2f7-5687-4e3b-95bc-e7da41f6bd07_1239x1016.png 848w, https://substackcdn.com/image/fetch/$s_!u1n2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc2b2f7-5687-4e3b-95bc-e7da41f6bd07_1239x1016.png 1272w, https://substackcdn.com/image/fetch/$s_!u1n2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc2b2f7-5687-4e3b-95bc-e7da41f6bd07_1239x1016.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>SPEC CPU2017&#8217;s floating point suite also shows diverse behavior. 549.fotonik3d and 554.roms suffer in NPS0 mode as the EPYC 9575F struggles to keep itself fed. 538.imagick plays nicely to the EPYC 9575F&#8217;s advantages. In that test, high cache hitrates let the 9575F&#8217;s higher core throughput shine through.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XLuq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc130cb58-e88b-4fe3-bb45-f2336050b886_1246x1199.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XLuq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc130cb58-e88b-4fe3-bb45-f2336050b886_1246x1199.png 424w, https://substackcdn.com/image/fetch/$s_!XLuq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc130cb58-e88b-4fe3-bb45-f2336050b886_1246x1199.png 848w, https://substackcdn.com/image/fetch/$s_!XLuq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc130cb58-e88b-4fe3-bb45-f2336050b886_1246x1199.png 1272w, https://substackcdn.com/image/fetch/$s_!XLuq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc130cb58-e88b-4fe3-bb45-f2336050b886_1246x1199.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XLuq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc130cb58-e88b-4fe3-bb45-f2336050b886_1246x1199.png" width="1246" height="1199" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c130cb58-e88b-4fe3-bb45-f2336050b886_1246x1199.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1199,&quot;width&quot;:1246,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:52555,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/177853996?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc130cb58-e88b-4fe3-bb45-f2336050b886_1246x1199.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XLuq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc130cb58-e88b-4fe3-bb45-f2336050b886_1246x1199.png 424w, https://substackcdn.com/image/fetch/$s_!XLuq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc130cb58-e88b-4fe3-bb45-f2336050b886_1246x1199.png 848w, https://substackcdn.com/image/fetch/$s_!XLuq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc130cb58-e88b-4fe3-bb45-f2336050b886_1246x1199.png 1272w, https://substackcdn.com/image/fetch/$s_!XLuq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc130cb58-e88b-4fe3-bb45-f2336050b886_1246x1199.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Final Words</h1><p>NPS0 mode performs surprisingly well in a single threaded SPEC CPU2017 run. Some sub-tests suffer from higher memory latency, but enough other tests benefit from the higher 5 GHz clock speed to make up the difference. It&#8217;s a lesson about the importance of clock speeds and good caching in a modern server CPU. Those two factors go together, because faster cores only provide a performance advantage if the memory subsystem can feed them. The EPYC 9575F&#8217;s good overall performance despite having over 220 ns of memory latency shows how good its caching setup is.</p><p>As for running in NPS0 mode, I don&#8217;t think it&#8217;s worthwhile in a modern system. The latency penalty is very high, and bandwidth gains are minor for NUMA-unaware code. I expect those latency penalties to get worse as server core and memory controller counts continue to increase. For workloads that need to scale across socket boundaries, optimizing for NUMA looks to be an unfortunate necessity.</p><p>Again, a massive thank you goes to <a href="https://verda.com/">Verda (formerly DataCrunch)</a> without which this article, and the upcoming B200 article, would not be possible!</p>]]></content:encoded></item><item><title><![CDATA[SC25: HACCing over 500 Petaflops on Frontier ]]></title><description><![CDATA[Hello you fine Internet folks,]]></description><link>https://chipsandcheese.com/p/sc25-haccing-over-500-petaflops-on</link><guid isPermaLink="false">https://chipsandcheese.com/p/sc25-haccing-over-500-petaflops-on</guid><dc:creator><![CDATA[George Cozma]]></dc:creator><pubDate>Sat, 22 Nov 2025 03:57:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AhAl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6871f57b-856f-4cc0-80b9-77e8d1dc98df_2061x1522.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello you fine Internet folks,</p><p>Here at Supercomputing the Gordon Bell Prize is announced every year. The Gordon Bell Prize is awarded every year to recognize outstanding achievement in high-performance computing applications. </p><p><a href="https://dl.acm.org/doi/10.1145/3712285.3771786">One of this year&#8217;s finalist</a> is the largest ever simulation of the Universe run on the Frontier Supercomputer at the Department of Energy&#8217;s Oak Ridge National Laboratory (ORNL). The simulation was run using the <a href="https://cpac.hep.anl.gov/projects/hacc/">Hardware/Hybrid Accelerated Cosmology Code</a> also known as HACC. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AhAl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6871f57b-856f-4cc0-80b9-77e8d1dc98df_2061x1522.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AhAl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6871f57b-856f-4cc0-80b9-77e8d1dc98df_2061x1522.png 424w, https://substackcdn.com/image/fetch/$s_!AhAl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6871f57b-856f-4cc0-80b9-77e8d1dc98df_2061x1522.png 848w, https://substackcdn.com/image/fetch/$s_!AhAl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6871f57b-856f-4cc0-80b9-77e8d1dc98df_2061x1522.png 1272w, https://substackcdn.com/image/fetch/$s_!AhAl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6871f57b-856f-4cc0-80b9-77e8d1dc98df_2061x1522.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AhAl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6871f57b-856f-4cc0-80b9-77e8d1dc98df_2061x1522.png" width="1456" height="1075" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6871f57b-856f-4cc0-80b9-77e8d1dc98df_2061x1522.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1075,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1069192,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179398527?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6871f57b-856f-4cc0-80b9-77e8d1dc98df_2061x1522.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AhAl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6871f57b-856f-4cc0-80b9-77e8d1dc98df_2061x1522.png 424w, https://substackcdn.com/image/fetch/$s_!AhAl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6871f57b-856f-4cc0-80b9-77e8d1dc98df_2061x1522.png 848w, https://substackcdn.com/image/fetch/$s_!AhAl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6871f57b-856f-4cc0-80b9-77e8d1dc98df_2061x1522.png 1272w, https://substackcdn.com/image/fetch/$s_!AhAl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6871f57b-856f-4cc0-80b9-77e8d1dc98df_2061x1522.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.olcf.ornl.gov/wp-content/uploads/2017/11/Frontiere.pdf">Presentation of HACC, what the math behind the code is, how it was implemented on GPUs, and the scaling of the code on GPUs.</a></figcaption></figure></div><p>This simulation of the observable universe tracked over 4 trillion particles across 15 billion light years of space. The prior state of the art observable universe simulations only went up to about 250 billion particles which is a fifteenth the number of particles of this new simulation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iCsL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8cbad7c-ac53-4587-86d2-64014b1919b3_780x486.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iCsL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8cbad7c-ac53-4587-86d2-64014b1919b3_780x486.png 424w, https://substackcdn.com/image/fetch/$s_!iCsL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8cbad7c-ac53-4587-86d2-64014b1919b3_780x486.png 848w, https://substackcdn.com/image/fetch/$s_!iCsL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8cbad7c-ac53-4587-86d2-64014b1919b3_780x486.png 1272w, https://substackcdn.com/image/fetch/$s_!iCsL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8cbad7c-ac53-4587-86d2-64014b1919b3_780x486.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iCsL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8cbad7c-ac53-4587-86d2-64014b1919b3_780x486.png" width="780" height="486" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a8cbad7c-ac53-4587-86d2-64014b1919b3_780x486.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:486,&quot;width&quot;:780,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:503707,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179398527?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8cbad7c-ac53-4587-86d2-64014b1919b3_780x486.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iCsL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8cbad7c-ac53-4587-86d2-64014b1919b3_780x486.png 424w, https://substackcdn.com/image/fetch/$s_!iCsL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8cbad7c-ac53-4587-86d2-64014b1919b3_780x486.png 848w, https://substackcdn.com/image/fetch/$s_!iCsL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8cbad7c-ac53-4587-86d2-64014b1919b3_780x486.png 1272w, https://substackcdn.com/image/fetch/$s_!iCsL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8cbad7c-ac53-4587-86d2-64014b1919b3_780x486.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">This HACC simulation shows the universe about 10 billion years after the Big Bang.</figcaption></figure></div><p>But, not only was this the largest universe simulation ever, the ORNL team managed to get over 500 Petaflops on nearly 9,000 nodes of Frontier&#8217;s 9,402 nodes. As a reminder, Frontier manages to get approximately 1,353 Petaflops on High Performance Linpack (HPL). This means that for this simulation the ORNL team managed to get about 37% of the Rmax HPL performance out of Frontier which is very impressive for a non-synthetic workload.</p><p>It is awesome to see the Department of Energy&#8217;s (DOE) supercomputers being used for amazing science like this! With the announcement of the Discovery Supercomputer that is due in 2028/2029, I can&#8217;t wait to see the science that comes out of that system when it is turned over to the scientific community!</p><p>If you like the content then consider heading over to the <a href="https://www.patreon.com/ChipsandCheese">Patreon</a> or <a href="https://www.paypal.com/donate/?hosted_button_id=4EMPH66SBGVSQ">PayPal</a> if you want to toss a few bucks to Chips and Cheese. Also consider joining the <a href="https://discord.gg/TwVnRhxgY2">Discord</a>.</p>]]></content:encoded></item><item><title><![CDATA[Qualcomm’s Snapdragon X2 Elite]]></title><description><![CDATA[Hello you fine Internet folks,]]></description><link>https://chipsandcheese.com/p/qualcomms-snapdragon-x2-elite</link><guid isPermaLink="false">https://chipsandcheese.com/p/qualcomms-snapdragon-x2-elite</guid><dc:creator><![CDATA[George Cozma]]></dc:creator><pubDate>Wed, 19 Nov 2025 14:10:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!1P59!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6f2d4a-b038-4c63-82b2-1bb2be6da068_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello you fine Internet folks,</p><p>Last week I was in San Diego at Qualcomm&#8217;s Headquarters where Qualcomm disclosed more information about their upcoming Snapdragon X2 Elite SOC.</p><p>Snapdragon X2 Elite is Qualcomm&#8217;s newest SOC for the Windows on ARM ecosystem that is designed to bring a new level of performance so let&#8217;s dive in.</p><h2>Oryon Gen 3 Prime CPU Cores and Cluster</h2><p>The Snapdragon X2 Elite (SDX2E) comes equipped with a total of 18 CPU cores with 12 &#8220;Prime&#8221; cores and 6 &#8220;Performance&#8221; cores.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1P59!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6f2d4a-b038-4c63-82b2-1bb2be6da068_1376x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1P59!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6f2d4a-b038-4c63-82b2-1bb2be6da068_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!1P59!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6f2d4a-b038-4c63-82b2-1bb2be6da068_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!1P59!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6f2d4a-b038-4c63-82b2-1bb2be6da068_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!1P59!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6f2d4a-b038-4c63-82b2-1bb2be6da068_1376x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1P59!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6f2d4a-b038-4c63-82b2-1bb2be6da068_1376x768.png" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd6f2d4a-b038-4c63-82b2-1bb2be6da068_1376x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:664007,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6f2d4a-b038-4c63-82b2-1bb2be6da068_1376x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1P59!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6f2d4a-b038-4c63-82b2-1bb2be6da068_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!1P59!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6f2d4a-b038-4c63-82b2-1bb2be6da068_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!1P59!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6f2d4a-b038-4c63-82b2-1bb2be6da068_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!1P59!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd6f2d4a-b038-4c63-82b2-1bb2be6da068_1376x768.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Starting with the Prime cores, these are the real heart of the SDX2E SoC with a total of up to 12 cores split across 2 clusters that can clock up to 5.0 GHz.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zrUY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68acb3f6-d189-4d53-9db6-23ac61d9ba1a_1374x772.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zrUY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68acb3f6-d189-4d53-9db6-23ac61d9ba1a_1374x772.png 424w, https://substackcdn.com/image/fetch/$s_!zrUY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68acb3f6-d189-4d53-9db6-23ac61d9ba1a_1374x772.png 848w, https://substackcdn.com/image/fetch/$s_!zrUY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68acb3f6-d189-4d53-9db6-23ac61d9ba1a_1374x772.png 1272w, https://substackcdn.com/image/fetch/$s_!zrUY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68acb3f6-d189-4d53-9db6-23ac61d9ba1a_1374x772.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zrUY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68acb3f6-d189-4d53-9db6-23ac61d9ba1a_1374x772.png" width="1374" height="772" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68acb3f6-d189-4d53-9db6-23ac61d9ba1a_1374x772.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:772,&quot;width&quot;:1374,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:788648,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68acb3f6-d189-4d53-9db6-23ac61d9ba1a_1374x772.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zrUY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68acb3f6-d189-4d53-9db6-23ac61d9ba1a_1374x772.png 424w, https://substackcdn.com/image/fetch/$s_!zrUY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68acb3f6-d189-4d53-9db6-23ac61d9ba1a_1374x772.png 848w, https://substackcdn.com/image/fetch/$s_!zrUY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68acb3f6-d189-4d53-9db6-23ac61d9ba1a_1374x772.png 1272w, https://substackcdn.com/image/fetch/$s_!zrUY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68acb3f6-d189-4d53-9db6-23ac61d9ba1a_1374x772.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Each of these clusters has 16MB, 16-way associative, shared L2 cache with 6 Prime cores attached along with a Qualcomm Matrix Engine per cluster.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!H3op!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F623dfb13-e69b-47f1-a620-0e060678be46_1364x769.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!H3op!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F623dfb13-e69b-47f1-a620-0e060678be46_1364x769.png 424w, https://substackcdn.com/image/fetch/$s_!H3op!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F623dfb13-e69b-47f1-a620-0e060678be46_1364x769.png 848w, https://substackcdn.com/image/fetch/$s_!H3op!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F623dfb13-e69b-47f1-a620-0e060678be46_1364x769.png 1272w, https://substackcdn.com/image/fetch/$s_!H3op!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F623dfb13-e69b-47f1-a620-0e060678be46_1364x769.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!H3op!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F623dfb13-e69b-47f1-a620-0e060678be46_1364x769.png" width="1364" height="769" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/623dfb13-e69b-47f1-a620-0e060678be46_1364x769.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:769,&quot;width&quot;:1364,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:753725,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F623dfb13-e69b-47f1-a620-0e060678be46_1364x769.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!H3op!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F623dfb13-e69b-47f1-a620-0e060678be46_1364x769.png 424w, https://substackcdn.com/image/fetch/$s_!H3op!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F623dfb13-e69b-47f1-a620-0e060678be46_1364x769.png 848w, https://substackcdn.com/image/fetch/$s_!H3op!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F623dfb13-e69b-47f1-a620-0e060678be46_1364x769.png 1272w, https://substackcdn.com/image/fetch/$s_!H3op!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F623dfb13-e69b-47f1-a620-0e060678be46_1364x769.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The L2 can serve up to 64B per cycle per core with a total fill bandwidth of up to 256B per cycle for the cluster. The L1 miss to L2 hit latency is now 21 cycles, up from the 17 cycles of the Snapdragon X Elite (SDXE). The reason for this increase is due to the increased size of the structure. The L2 runs at the same clocks as the cores and supports over 220 in-flight transactions with each core supporting over 50 requests to the L2 at a time.</p><p>Diving into the CPU cores and there is quite a bit familiar with this core at a high level.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Rsrl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7dfe97-f4e9-4a15-83ce-93516eeafba8_1363x760.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Rsrl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7dfe97-f4e9-4a15-83ce-93516eeafba8_1363x760.png 424w, https://substackcdn.com/image/fetch/$s_!Rsrl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7dfe97-f4e9-4a15-83ce-93516eeafba8_1363x760.png 848w, https://substackcdn.com/image/fetch/$s_!Rsrl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7dfe97-f4e9-4a15-83ce-93516eeafba8_1363x760.png 1272w, https://substackcdn.com/image/fetch/$s_!Rsrl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7dfe97-f4e9-4a15-83ce-93516eeafba8_1363x760.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Rsrl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7dfe97-f4e9-4a15-83ce-93516eeafba8_1363x760.png" width="1363" height="760" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2d7dfe97-f4e9-4a15-83ce-93516eeafba8_1363x760.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:760,&quot;width&quot;:1363,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:491140,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7dfe97-f4e9-4a15-83ce-93516eeafba8_1363x760.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Rsrl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7dfe97-f4e9-4a15-83ce-93516eeafba8_1363x760.png 424w, https://substackcdn.com/image/fetch/$s_!Rsrl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7dfe97-f4e9-4a15-83ce-93516eeafba8_1363x760.png 848w, https://substackcdn.com/image/fetch/$s_!Rsrl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7dfe97-f4e9-4a15-83ce-93516eeafba8_1363x760.png 1272w, https://substackcdn.com/image/fetch/$s_!Rsrl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7dfe97-f4e9-4a15-83ce-93516eeafba8_1363x760.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Starting with the L1 instruction cache, it is 192KB in size with 6-way associativity and is fully coherent. The Fetch Unit can fetch up to 16 4 Byte instructions per cycle for a total fetch bandwidth of 64 Bytes per cycle. The L1 iTLB is an 8-way associative, 256 entry structure that supports 4KB and larger page sizes.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QwTr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe769935b-9d9d-4c9c-ad60-c7cc6a46962f_1366x761.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QwTr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe769935b-9d9d-4c9c-ad60-c7cc6a46962f_1366x761.png 424w, https://substackcdn.com/image/fetch/$s_!QwTr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe769935b-9d9d-4c9c-ad60-c7cc6a46962f_1366x761.png 848w, https://substackcdn.com/image/fetch/$s_!QwTr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe769935b-9d9d-4c9c-ad60-c7cc6a46962f_1366x761.png 1272w, https://substackcdn.com/image/fetch/$s_!QwTr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe769935b-9d9d-4c9c-ad60-c7cc6a46962f_1366x761.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QwTr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe769935b-9d9d-4c9c-ad60-c7cc6a46962f_1366x761.png" width="1366" height="761" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e769935b-9d9d-4c9c-ad60-c7cc6a46962f_1366x761.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:761,&quot;width&quot;:1366,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:600752,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe769935b-9d9d-4c9c-ad60-c7cc6a46962f_1366x761.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QwTr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe769935b-9d9d-4c9c-ad60-c7cc6a46962f_1366x761.png 424w, https://substackcdn.com/image/fetch/$s_!QwTr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe769935b-9d9d-4c9c-ad60-c7cc6a46962f_1366x761.png 848w, https://substackcdn.com/image/fetch/$s_!QwTr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe769935b-9d9d-4c9c-ad60-c7cc6a46962f_1366x761.png 1272w, https://substackcdn.com/image/fetch/$s_!QwTr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe769935b-9d9d-4c9c-ad60-c7cc6a46962f_1366x761.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Moving to the Decode, Rename, and Retirement stages, Oryon Gen 3 has increased these stages up to 9 wide from the 8 wide that Oryon Gen 1 was meaning that Oryon Gen 3 can support up to 9 micro-ops retired per cycle. There are over 400 Vector and Integer registers in their respective physical register files which is similar to the number of entries in Oryon Gen 1. Similarly, the Reorder Buffer is similarly 650+ entries for Oryon Gen 3.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-gme!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7b1a56-6efd-42d8-b62e-7cff6215d3af_1380x767.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-gme!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7b1a56-6efd-42d8-b62e-7cff6215d3af_1380x767.png 424w, https://substackcdn.com/image/fetch/$s_!-gme!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7b1a56-6efd-42d8-b62e-7cff6215d3af_1380x767.png 848w, https://substackcdn.com/image/fetch/$s_!-gme!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7b1a56-6efd-42d8-b62e-7cff6215d3af_1380x767.png 1272w, https://substackcdn.com/image/fetch/$s_!-gme!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7b1a56-6efd-42d8-b62e-7cff6215d3af_1380x767.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-gme!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7b1a56-6efd-42d8-b62e-7cff6215d3af_1380x767.png" width="1380" height="767" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5b7b1a56-6efd-42d8-b62e-7cff6215d3af_1380x767.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:767,&quot;width&quot;:1380,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:519885,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7b1a56-6efd-42d8-b62e-7cff6215d3af_1380x767.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-gme!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7b1a56-6efd-42d8-b62e-7cff6215d3af_1380x767.png 424w, https://substackcdn.com/image/fetch/$s_!-gme!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7b1a56-6efd-42d8-b62e-7cff6215d3af_1380x767.png 848w, https://substackcdn.com/image/fetch/$s_!-gme!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7b1a56-6efd-42d8-b62e-7cff6215d3af_1380x767.png 1272w, https://substackcdn.com/image/fetch/$s_!-gme!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7b1a56-6efd-42d8-b62e-7cff6215d3af_1380x767.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Delving into the Integer side of the core, Oryon Gen 3 now has 4 Branch units which is a doubling of the number of Branch units found inside Oryon Gen 1. Otherwise, the integer side of Oryon Gen 3 is very similar to Oryon Gen 1 with 6 20-entry Reservation Stations for a total of 120 entries in the Integer scheduler, 6 Integer ALUs with 2 capable of Multiplies and one unit capable of handling Crypto and Division instructions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bj9-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0023e16-0b1b-4366-ac3b-50a5735aa10d_1369x758.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bj9-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0023e16-0b1b-4366-ac3b-50a5735aa10d_1369x758.png 424w, https://substackcdn.com/image/fetch/$s_!bj9-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0023e16-0b1b-4366-ac3b-50a5735aa10d_1369x758.png 848w, https://substackcdn.com/image/fetch/$s_!bj9-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0023e16-0b1b-4366-ac3b-50a5735aa10d_1369x758.png 1272w, https://substackcdn.com/image/fetch/$s_!bj9-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0023e16-0b1b-4366-ac3b-50a5735aa10d_1369x758.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bj9-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0023e16-0b1b-4366-ac3b-50a5735aa10d_1369x758.png" width="1369" height="758" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0023e16-0b1b-4366-ac3b-50a5735aa10d_1369x758.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:758,&quot;width&quot;:1369,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:529441,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0023e16-0b1b-4366-ac3b-50a5735aa10d_1369x758.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bj9-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0023e16-0b1b-4366-ac3b-50a5735aa10d_1369x758.png 424w, https://substackcdn.com/image/fetch/$s_!bj9-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0023e16-0b1b-4366-ac3b-50a5735aa10d_1369x758.png 848w, https://substackcdn.com/image/fetch/$s_!bj9-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0023e16-0b1b-4366-ac3b-50a5735aa10d_1369x758.png 1272w, https://substackcdn.com/image/fetch/$s_!bj9-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0023e16-0b1b-4366-ac3b-50a5735aa10d_1369x758.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Swapping to the Vector unit and Oryon Gen 3 adds SVE and SVE2 support to the core with the high-level layout similar to Oryon Gen 1 with over 400 128b Vector registers, 4 128b Vector ALUs all capable of FMAs, and 4 48-entry Reservation Stations for a total of 192 entries in the Vector scheduler.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zSLJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc3d26a3-0ab2-48e7-babc-9cad49d8151e_1366x763.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zSLJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc3d26a3-0ab2-48e7-babc-9cad49d8151e_1366x763.png 424w, https://substackcdn.com/image/fetch/$s_!zSLJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc3d26a3-0ab2-48e7-babc-9cad49d8151e_1366x763.png 848w, https://substackcdn.com/image/fetch/$s_!zSLJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc3d26a3-0ab2-48e7-babc-9cad49d8151e_1366x763.png 1272w, https://substackcdn.com/image/fetch/$s_!zSLJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc3d26a3-0ab2-48e7-babc-9cad49d8151e_1366x763.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zSLJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc3d26a3-0ab2-48e7-babc-9cad49d8151e_1366x763.png" width="1366" height="763" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dc3d26a3-0ab2-48e7-babc-9cad49d8151e_1366x763.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:763,&quot;width&quot;:1366,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:612288,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc3d26a3-0ab2-48e7-babc-9cad49d8151e_1366x763.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zSLJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc3d26a3-0ab2-48e7-babc-9cad49d8151e_1366x763.png 424w, https://substackcdn.com/image/fetch/$s_!zSLJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc3d26a3-0ab2-48e7-babc-9cad49d8151e_1366x763.png 848w, https://substackcdn.com/image/fetch/$s_!zSLJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc3d26a3-0ab2-48e7-babc-9cad49d8151e_1366x763.png 1272w, https://substackcdn.com/image/fetch/$s_!zSLJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc3d26a3-0ab2-48e7-babc-9cad49d8151e_1366x763.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Moving to the Load and Store system, Oryon Gen 3 has the same 4 Memory AGUs all of which are capable of loads and stores as Oryon Gen 1. These then feed a 192 entry Load Queue and 56 entry Store Queue which are the same size as the queues found on Oryon Gen 1. The L1 Data Cache is also the same fully coherent 96KB, 6-way. 64 Byte cache line, structure as what Oryon Gen 1 had.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SMu5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d488228-3977-4f04-960e-29017f586900_1372x769.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SMu5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d488228-3977-4f04-960e-29017f586900_1372x769.png 424w, https://substackcdn.com/image/fetch/$s_!SMu5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d488228-3977-4f04-960e-29017f586900_1372x769.png 848w, https://substackcdn.com/image/fetch/$s_!SMu5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d488228-3977-4f04-960e-29017f586900_1372x769.png 1272w, https://substackcdn.com/image/fetch/$s_!SMu5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d488228-3977-4f04-960e-29017f586900_1372x769.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SMu5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d488228-3977-4f04-960e-29017f586900_1372x769.png" width="1372" height="769" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1d488228-3977-4f04-960e-29017f586900_1372x769.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:769,&quot;width&quot;:1372,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:636216,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d488228-3977-4f04-960e-29017f586900_1372x769.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SMu5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d488228-3977-4f04-960e-29017f586900_1372x769.png 424w, https://substackcdn.com/image/fetch/$s_!SMu5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d488228-3977-4f04-960e-29017f586900_1372x769.png 848w, https://substackcdn.com/image/fetch/$s_!SMu5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d488228-3977-4f04-960e-29017f586900_1372x769.png 1272w, https://substackcdn.com/image/fetch/$s_!SMu5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d488228-3977-4f04-960e-29017f586900_1372x769.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Landing at the Memory Management Unit, and the TLBs of Oryon Gen 3 are again very similar to Oryon Gen 1 with one possible difference. Slide 11 says that the L1 dTLB is a 224 entry, 7-way, structure whereas Slide 12 says that the L1 dTLB is a 256 entry, 8-way, structure. If Slide 12 is correct then this is an increase from Oryon Gen 1&#8217;s 224 entry, 7-way, L1 dTLB. Oryon Gen 3&#8217;s 256 entry, 8-way, L1 iTLB and 8K entry, 8-way, shared L2 is unchanged from Oryon Gen 1. Note, the 2 cycle access for the L2 TLB is the SRAM access time not the total latency for a TLB lookup which Qualcomm wouldn&#8217;t disclose but is a similar range to the ~7 cycles you see on modern x86 cores.</p><h2>Qualcomm&#8217;s Matrix Engine</h2><p>In each of SDX2E&#8217;s 3 clusters lies a SME compatible Matrix Engine.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gay_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7a1f64-9138-4573-8cfb-5a501d825e18_1363x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gay_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7a1f64-9138-4573-8cfb-5a501d825e18_1363x768.png 424w, https://substackcdn.com/image/fetch/$s_!gay_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7a1f64-9138-4573-8cfb-5a501d825e18_1363x768.png 848w, https://substackcdn.com/image/fetch/$s_!gay_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7a1f64-9138-4573-8cfb-5a501d825e18_1363x768.png 1272w, https://substackcdn.com/image/fetch/$s_!gay_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7a1f64-9138-4573-8cfb-5a501d825e18_1363x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gay_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7a1f64-9138-4573-8cfb-5a501d825e18_1363x768.png" width="1363" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b7a1f64-9138-4573-8cfb-5a501d825e18_1363x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1363,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:671309,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7a1f64-9138-4573-8cfb-5a501d825e18_1363x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gay_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7a1f64-9138-4573-8cfb-5a501d825e18_1363x768.png 424w, https://substackcdn.com/image/fetch/$s_!gay_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7a1f64-9138-4573-8cfb-5a501d825e18_1363x768.png 848w, https://substackcdn.com/image/fetch/$s_!gay_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7a1f64-9138-4573-8cfb-5a501d825e18_1363x768.png 1272w, https://substackcdn.com/image/fetch/$s_!gay_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7a1f64-9138-4573-8cfb-5a501d825e18_1363x768.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This matrix unit uses a 64 bit x 64 bit MLA numeric element in a 8x8 or 4x8 grid. This means that this matrix unit is 4096 bit wide which can do up to 128 FP32/INT32, 256 FP16/BF16/INT16, or 512 INT8 operations per cycle. The matrix engine is on a separate clock domain to the Cores and L2 Cache for better power and thermal management of the SoC.</p><h2>Qualcomm Oryon Gen 3 Performance Core and Cluster</h2><p>Something that is new to the SDX2E that SDXE didn&#8217;t have is a 3<sup>rd</sup> cluster on board with what Qualcomm is calling their &#8220;Performance cores&#8221;.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7num!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ff091f-5330-41dd-9172-9d08c9da2fed_1364x760.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7num!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ff091f-5330-41dd-9172-9d08c9da2fed_1364x760.png 424w, https://substackcdn.com/image/fetch/$s_!7num!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ff091f-5330-41dd-9172-9d08c9da2fed_1364x760.png 848w, https://substackcdn.com/image/fetch/$s_!7num!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ff091f-5330-41dd-9172-9d08c9da2fed_1364x760.png 1272w, https://substackcdn.com/image/fetch/$s_!7num!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ff091f-5330-41dd-9172-9d08c9da2fed_1364x760.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7num!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ff091f-5330-41dd-9172-9d08c9da2fed_1364x760.png" width="1364" height="760" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f4ff091f-5330-41dd-9172-9d08c9da2fed_1364x760.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:760,&quot;width&quot;:1364,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:608004,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ff091f-5330-41dd-9172-9d08c9da2fed_1364x760.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7num!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ff091f-5330-41dd-9172-9d08c9da2fed_1364x760.png 424w, https://substackcdn.com/image/fetch/$s_!7num!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ff091f-5330-41dd-9172-9d08c9da2fed_1364x760.png 848w, https://substackcdn.com/image/fetch/$s_!7num!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ff091f-5330-41dd-9172-9d08c9da2fed_1364x760.png 1272w, https://substackcdn.com/image/fetch/$s_!7num!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ff091f-5330-41dd-9172-9d08c9da2fed_1364x760.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This cluster has the same number of cores along with the Matrix Engine as the Prime Clusters but instead of 16 MB of shared L2, the Performance Cluster has 12 MB of shared L2.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8QRE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b7f5794-8407-4220-a798-5f3aa3632eb6_1373x770.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8QRE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b7f5794-8407-4220-a798-5f3aa3632eb6_1373x770.png 424w, https://substackcdn.com/image/fetch/$s_!8QRE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b7f5794-8407-4220-a798-5f3aa3632eb6_1373x770.png 848w, https://substackcdn.com/image/fetch/$s_!8QRE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b7f5794-8407-4220-a798-5f3aa3632eb6_1373x770.png 1272w, https://substackcdn.com/image/fetch/$s_!8QRE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b7f5794-8407-4220-a798-5f3aa3632eb6_1373x770.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8QRE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b7f5794-8407-4220-a798-5f3aa3632eb6_1373x770.png" width="1373" height="770" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4b7f5794-8407-4220-a798-5f3aa3632eb6_1373x770.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:770,&quot;width&quot;:1373,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:629496,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b7f5794-8407-4220-a798-5f3aa3632eb6_1373x770.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8QRE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b7f5794-8407-4220-a798-5f3aa3632eb6_1373x770.png 424w, https://substackcdn.com/image/fetch/$s_!8QRE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b7f5794-8407-4220-a798-5f3aa3632eb6_1373x770.png 848w, https://substackcdn.com/image/fetch/$s_!8QRE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b7f5794-8407-4220-a798-5f3aa3632eb6_1373x770.png 1272w, https://substackcdn.com/image/fetch/$s_!8QRE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b7f5794-8407-4220-a798-5f3aa3632eb6_1373x770.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The Performance core is also different to the Prime cores. These cores are of a similar but distinct microarchitecture to the Prime cores. These cores are targeted at a lower power point and has been optimized for operation below 2 watts. As such, the Performance core isn&#8217;t as wide as the Prime core and has fewer execution pipes, smaller caches, and shallower Out-of-Order structures compared to the Prime core.</p><h2>Adreno X2 GPU</h2><p>SDX2E has a revamped GPU architecture that Qualcomm is calling the Adreno X2 microarchitecture.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WtdU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa26654-cf3b-4784-9798-04eaec03601a_1366x765.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WtdU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa26654-cf3b-4784-9798-04eaec03601a_1366x765.png 424w, https://substackcdn.com/image/fetch/$s_!WtdU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa26654-cf3b-4784-9798-04eaec03601a_1366x765.png 848w, https://substackcdn.com/image/fetch/$s_!WtdU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa26654-cf3b-4784-9798-04eaec03601a_1366x765.png 1272w, https://substackcdn.com/image/fetch/$s_!WtdU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa26654-cf3b-4784-9798-04eaec03601a_1366x765.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WtdU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa26654-cf3b-4784-9798-04eaec03601a_1366x765.png" width="1366" height="765" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9fa26654-cf3b-4784-9798-04eaec03601a_1366x765.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:765,&quot;width&quot;:1366,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:916903,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa26654-cf3b-4784-9798-04eaec03601a_1366x765.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WtdU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa26654-cf3b-4784-9798-04eaec03601a_1366x765.png 424w, https://substackcdn.com/image/fetch/$s_!WtdU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa26654-cf3b-4784-9798-04eaec03601a_1366x765.png 848w, https://substackcdn.com/image/fetch/$s_!WtdU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa26654-cf3b-4784-9798-04eaec03601a_1366x765.png 1272w, https://substackcdn.com/image/fetch/$s_!WtdU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa26654-cf3b-4784-9798-04eaec03601a_1366x765.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is Qualcomm&#8217;s largest GPU they have made to date with 2048 FP32 ALUs clocking up to 1.85GHz.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DMhw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ccd7ca-cc09-4db4-832c-2dda3f354aa5_1375x770.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DMhw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ccd7ca-cc09-4db4-832c-2dda3f354aa5_1375x770.png 424w, https://substackcdn.com/image/fetch/$s_!DMhw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ccd7ca-cc09-4db4-832c-2dda3f354aa5_1375x770.png 848w, https://substackcdn.com/image/fetch/$s_!DMhw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ccd7ca-cc09-4db4-832c-2dda3f354aa5_1375x770.png 1272w, https://substackcdn.com/image/fetch/$s_!DMhw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ccd7ca-cc09-4db4-832c-2dda3f354aa5_1375x770.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DMhw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ccd7ca-cc09-4db4-832c-2dda3f354aa5_1375x770.png" width="1375" height="770" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c8ccd7ca-cc09-4db4-832c-2dda3f354aa5_1375x770.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:770,&quot;width&quot;:1375,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:755403,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ccd7ca-cc09-4db4-832c-2dda3f354aa5_1375x770.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DMhw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ccd7ca-cc09-4db4-832c-2dda3f354aa5_1375x770.png 424w, https://substackcdn.com/image/fetch/$s_!DMhw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ccd7ca-cc09-4db4-832c-2dda3f354aa5_1375x770.png 848w, https://substackcdn.com/image/fetch/$s_!DMhw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ccd7ca-cc09-4db4-832c-2dda3f354aa5_1375x770.png 1272w, https://substackcdn.com/image/fetch/$s_!DMhw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ccd7ca-cc09-4db4-832c-2dda3f354aa5_1375x770.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The Adreno X2 is a &#8220;Slice-Based&#8221; architecture, roughly equivalent to a Shader Engine from AMD or a GPC from Nvidia, with 4 slices for the top-end X2-90. Each slice has one Front-End which capable of rasterizing up to 4 triangles per cycle.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!u4nX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d6121c-39cd-4c21-9124-fcf074627828_1377x771.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!u4nX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d6121c-39cd-4c21-9124-fcf074627828_1377x771.png 424w, https://substackcdn.com/image/fetch/$s_!u4nX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d6121c-39cd-4c21-9124-fcf074627828_1377x771.png 848w, https://substackcdn.com/image/fetch/$s_!u4nX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d6121c-39cd-4c21-9124-fcf074627828_1377x771.png 1272w, https://substackcdn.com/image/fetch/$s_!u4nX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d6121c-39cd-4c21-9124-fcf074627828_1377x771.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!u4nX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d6121c-39cd-4c21-9124-fcf074627828_1377x771.png" width="1377" height="771" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/75d6121c-39cd-4c21-9124-fcf074627828_1377x771.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:771,&quot;width&quot;:1377,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:638672,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d6121c-39cd-4c21-9124-fcf074627828_1377x771.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!u4nX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d6121c-39cd-4c21-9124-fcf074627828_1377x771.png 424w, https://substackcdn.com/image/fetch/$s_!u4nX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d6121c-39cd-4c21-9124-fcf074627828_1377x771.png 848w, https://substackcdn.com/image/fetch/$s_!u4nX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d6121c-39cd-4c21-9124-fcf074627828_1377x771.png 1272w, https://substackcdn.com/image/fetch/$s_!u4nX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d6121c-39cd-4c21-9124-fcf074627828_1377x771.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>After the Front-end, are the Shader Processors, which are roughly equivalent to AMD&#8217;s WGP or Nvidia&#8217;s SM, which has an instruction cache and 2 micro-Shader Processors (uSP), similar to AMD&#8217;s SIMD unit or Nvidia&#8217;s SMSP, which each have a 128KB register file feeding 128 ALUs which support FP32, FP16, and BF16. A change from the prior Adreno X1 architecture is the removal of Wave128, now the Adreno X2 only supports Wave64 and can dual issue Wave64 instructions in order to keep the 128 ALUs fed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_fKY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F739cb928-f78b-4cf1-b1b0-86bb8fb22992_1373x763.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_fKY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F739cb928-f78b-4cf1-b1b0-86bb8fb22992_1373x763.png 424w, https://substackcdn.com/image/fetch/$s_!_fKY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F739cb928-f78b-4cf1-b1b0-86bb8fb22992_1373x763.png 848w, https://substackcdn.com/image/fetch/$s_!_fKY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F739cb928-f78b-4cf1-b1b0-86bb8fb22992_1373x763.png 1272w, https://substackcdn.com/image/fetch/$s_!_fKY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F739cb928-f78b-4cf1-b1b0-86bb8fb22992_1373x763.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_fKY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F739cb928-f78b-4cf1-b1b0-86bb8fb22992_1373x763.png" width="1373" height="763" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/739cb928-f78b-4cf1-b1b0-86bb8fb22992_1373x763.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:763,&quot;width&quot;:1373,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:497892,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F739cb928-f78b-4cf1-b1b0-86bb8fb22992_1373x763.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_fKY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F739cb928-f78b-4cf1-b1b0-86bb8fb22992_1373x763.png 424w, https://substackcdn.com/image/fetch/$s_!_fKY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F739cb928-f78b-4cf1-b1b0-86bb8fb22992_1373x763.png 848w, https://substackcdn.com/image/fetch/$s_!_fKY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F739cb928-f78b-4cf1-b1b0-86bb8fb22992_1373x763.png 1272w, https://substackcdn.com/image/fetch/$s_!_fKY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F739cb928-f78b-4cf1-b1b0-86bb8fb22992_1373x763.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Each uSP has a Ray Tracing Unit which supports either 4 ray-triangles or 8 ray-box intersections per cycle.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l7Wz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f13951-181e-48b9-aaa6-949622e1d813_1367x764.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l7Wz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f13951-181e-48b9-aaa6-949622e1d813_1367x764.png 424w, https://substackcdn.com/image/fetch/$s_!l7Wz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f13951-181e-48b9-aaa6-949622e1d813_1367x764.png 848w, https://substackcdn.com/image/fetch/$s_!l7Wz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f13951-181e-48b9-aaa6-949622e1d813_1367x764.png 1272w, https://substackcdn.com/image/fetch/$s_!l7Wz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f13951-181e-48b9-aaa6-949622e1d813_1367x764.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!l7Wz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f13951-181e-48b9-aaa6-949622e1d813_1367x764.png" width="1367" height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04f13951-181e-48b9-aaa6-949622e1d813_1367x764.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1367,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:901359,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f13951-181e-48b9-aaa6-949622e1d813_1367x764.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l7Wz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f13951-181e-48b9-aaa6-949622e1d813_1367x764.png 424w, https://substackcdn.com/image/fetch/$s_!l7Wz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f13951-181e-48b9-aaa6-949622e1d813_1367x764.png 848w, https://substackcdn.com/image/fetch/$s_!l7Wz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f13951-181e-48b9-aaa6-949622e1d813_1367x764.png 1272w, https://substackcdn.com/image/fetch/$s_!l7Wz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f13951-181e-48b9-aaa6-949622e1d813_1367x764.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>From there is what is referred to as Adreno High Performance Memory (AHPM). There is 21 MB of AHPM in a X2-90 GPU, 5.25 MB per slice, which is both a scratchpad and a cache depending on what the driver configures it as. Up to 3 MB of each of the 5.25 MB slices can be configured as a cache with the remaining 2.25 MB of SRAM being a scratchpad.</p><p>AHPM is designed to allow for the GPU to do tiled rendering all within the AHPM before rendering out the frame to the display buffer. This reduces the amount of data movement that the GPU has to do which consequently improves the performance per watt of the Adreno X2 compared to the Adreno X1.</p><p>Moving back to the Slice level, each slice has a 128 KB cluster cache which is then backed by a unified 2 MB L2 cache. This L2 can then spill into the 8 MB System Level Cache (SLC) which then is backed by the up to 228 GB/s memory subsystem.</p><p>As for API support, Adreno X2 supports DX12.2, Shader Model 6.8, Vulkan 1.4, OpenCL 3.0, as well as SYCL support coming in the first half of 2026.</p><h2>Hexagon NPU</h2><p>Qualcomm has increased the performance of the Hexagon NPU from 45 TOPS of INT8 to 80 TOPS of INT8 with SDX2E.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!04wG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032b2ca5-b111-4799-8a2c-12b8e6b76daf_1367x754.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!04wG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032b2ca5-b111-4799-8a2c-12b8e6b76daf_1367x754.png 424w, https://substackcdn.com/image/fetch/$s_!04wG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032b2ca5-b111-4799-8a2c-12b8e6b76daf_1367x754.png 848w, https://substackcdn.com/image/fetch/$s_!04wG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032b2ca5-b111-4799-8a2c-12b8e6b76daf_1367x754.png 1272w, https://substackcdn.com/image/fetch/$s_!04wG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032b2ca5-b111-4799-8a2c-12b8e6b76daf_1367x754.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!04wG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032b2ca5-b111-4799-8a2c-12b8e6b76daf_1367x754.png" width="1367" height="754" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/032b2ca5-b111-4799-8a2c-12b8e6b76daf_1367x754.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:754,&quot;width&quot;:1367,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A computer chip on a red background\n\nDescription automatically generated&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A computer chip on a red background

Description automatically generated" title="A computer chip on a red background

Description automatically generated" srcset="https://substackcdn.com/image/fetch/$s_!04wG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032b2ca5-b111-4799-8a2c-12b8e6b76daf_1367x754.png 424w, https://substackcdn.com/image/fetch/$s_!04wG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032b2ca5-b111-4799-8a2c-12b8e6b76daf_1367x754.png 848w, https://substackcdn.com/image/fetch/$s_!04wG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032b2ca5-b111-4799-8a2c-12b8e6b76daf_1367x754.png 1272w, https://substackcdn.com/image/fetch/$s_!04wG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032b2ca5-b111-4799-8a2c-12b8e6b76daf_1367x754.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Qualcomm has also added FP8 and BF16 support to the Hexagon NPU 6 vector unit.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TAx6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268a5f8a-5b96-47af-8013-ef83506e0bd8_1600x895.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TAx6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268a5f8a-5b96-47af-8013-ef83506e0bd8_1600x895.png 424w, https://substackcdn.com/image/fetch/$s_!TAx6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268a5f8a-5b96-47af-8013-ef83506e0bd8_1600x895.png 848w, https://substackcdn.com/image/fetch/$s_!TAx6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268a5f8a-5b96-47af-8013-ef83506e0bd8_1600x895.png 1272w, https://substackcdn.com/image/fetch/$s_!TAx6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268a5f8a-5b96-47af-8013-ef83506e0bd8_1600x895.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TAx6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268a5f8a-5b96-47af-8013-ef83506e0bd8_1600x895.png" width="1456" height="814" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/268a5f8a-5b96-47af-8013-ef83506e0bd8_1600x895.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:814,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A screenshot of a computer\n\nDescription automatically generated&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A screenshot of a computer

Description automatically generated" title="A screenshot of a computer

Description automatically generated" srcset="https://substackcdn.com/image/fetch/$s_!TAx6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268a5f8a-5b96-47af-8013-ef83506e0bd8_1600x895.png 424w, https://substackcdn.com/image/fetch/$s_!TAx6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268a5f8a-5b96-47af-8013-ef83506e0bd8_1600x895.png 848w, https://substackcdn.com/image/fetch/$s_!TAx6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268a5f8a-5b96-47af-8013-ef83506e0bd8_1600x895.png 1272w, https://substackcdn.com/image/fetch/$s_!TAx6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F268a5f8a-5b96-47af-8013-ef83506e0bd8_1600x895.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In addition to the BF16 and FP8 support, the new matrix engine in NPU 6 has INT2 dequantization support.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yH5_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb88f96d-88d4-4be3-8368-a90b2c5bf200_1600x906.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yH5_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb88f96d-88d4-4be3-8368-a90b2c5bf200_1600x906.png 424w, https://substackcdn.com/image/fetch/$s_!yH5_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb88f96d-88d4-4be3-8368-a90b2c5bf200_1600x906.png 848w, https://substackcdn.com/image/fetch/$s_!yH5_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb88f96d-88d4-4be3-8368-a90b2c5bf200_1600x906.png 1272w, https://substackcdn.com/image/fetch/$s_!yH5_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb88f96d-88d4-4be3-8368-a90b2c5bf200_1600x906.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yH5_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb88f96d-88d4-4be3-8368-a90b2c5bf200_1600x906.png" width="1456" height="824" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb88f96d-88d4-4be3-8368-a90b2c5bf200_1600x906.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:824,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A screenshot of a computer program\n\nDescription automatically generated&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A screenshot of a computer program

Description automatically generated" title="A screenshot of a computer program

Description automatically generated" srcset="https://substackcdn.com/image/fetch/$s_!yH5_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb88f96d-88d4-4be3-8368-a90b2c5bf200_1600x906.png 424w, https://substackcdn.com/image/fetch/$s_!yH5_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb88f96d-88d4-4be3-8368-a90b2c5bf200_1600x906.png 848w, https://substackcdn.com/image/fetch/$s_!yH5_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb88f96d-88d4-4be3-8368-a90b2c5bf200_1600x906.png 1272w, https://substackcdn.com/image/fetch/$s_!yH5_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb88f96d-88d4-4be3-8368-a90b2c5bf200_1600x906.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>However, the largest change in NPU 6 is the addition of 64 bit Virtual Addressing to the DMA unit which means that NPU 6 can now access more than 4GB of memory.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vFF8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664a2241-6658-43b4-8e90-5090d92c2391_1600x895.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vFF8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664a2241-6658-43b4-8e90-5090d92c2391_1600x895.png 424w, https://substackcdn.com/image/fetch/$s_!vFF8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664a2241-6658-43b4-8e90-5090d92c2391_1600x895.png 848w, https://substackcdn.com/image/fetch/$s_!vFF8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664a2241-6658-43b4-8e90-5090d92c2391_1600x895.png 1272w, https://substackcdn.com/image/fetch/$s_!vFF8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664a2241-6658-43b4-8e90-5090d92c2391_1600x895.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vFF8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664a2241-6658-43b4-8e90-5090d92c2391_1600x895.png" width="1456" height="814" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/664a2241-6658-43b4-8e90-5090d92c2391_1600x895.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:814,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A screenshot of a computer program\n\nDescription automatically generated&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A screenshot of a computer program

Description automatically generated" title="A screenshot of a computer program

Description automatically generated" srcset="https://substackcdn.com/image/fetch/$s_!vFF8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664a2241-6658-43b4-8e90-5090d92c2391_1600x895.png 424w, https://substackcdn.com/image/fetch/$s_!vFF8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664a2241-6658-43b4-8e90-5090d92c2391_1600x895.png 848w, https://substackcdn.com/image/fetch/$s_!vFF8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664a2241-6658-43b4-8e90-5090d92c2391_1600x895.png 1272w, https://substackcdn.com/image/fetch/$s_!vFF8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F664a2241-6658-43b4-8e90-5090d92c2391_1600x895.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Power and Performance</h2><p>For testing the power of a system, Qualcomm has used what they call INPP or Idle Normalized Platform Power. What INPP is, is talking the total platform power during load and subtracting out the platform power at idle.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ujeo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2544088c-c51c-4a3e-9556-cad400d62b65_1357x764.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ujeo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2544088c-c51c-4a3e-9556-cad400d62b65_1357x764.png 424w, https://substackcdn.com/image/fetch/$s_!Ujeo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2544088c-c51c-4a3e-9556-cad400d62b65_1357x764.png 848w, https://substackcdn.com/image/fetch/$s_!Ujeo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2544088c-c51c-4a3e-9556-cad400d62b65_1357x764.png 1272w, https://substackcdn.com/image/fetch/$s_!Ujeo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2544088c-c51c-4a3e-9556-cad400d62b65_1357x764.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ujeo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2544088c-c51c-4a3e-9556-cad400d62b65_1357x764.png" width="1357" height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2544088c-c51c-4a3e-9556-cad400d62b65_1357x764.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1357,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:373120,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2544088c-c51c-4a3e-9556-cad400d62b65_1357x764.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ujeo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2544088c-c51c-4a3e-9556-cad400d62b65_1357x764.png 424w, https://substackcdn.com/image/fetch/$s_!Ujeo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2544088c-c51c-4a3e-9556-cad400d62b65_1357x764.png 848w, https://substackcdn.com/image/fetch/$s_!Ujeo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2544088c-c51c-4a3e-9556-cad400d62b65_1357x764.png 1272w, https://substackcdn.com/image/fetch/$s_!Ujeo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2544088c-c51c-4a3e-9556-cad400d62b65_1357x764.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>What INPP gets you is the SOC power plus the DRAM power plus the Power Conversion Losses; while this isn&#8217;t quite solely SOC power, INPP is about as close as you can get to pure SOC power in a laptop form factor where discrete power sensors aren&#8217;t very common.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nqtZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70a5ae59-ef96-41af-a388-2e4e32e2bb15_1373x770.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nqtZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70a5ae59-ef96-41af-a388-2e4e32e2bb15_1373x770.png 424w, https://substackcdn.com/image/fetch/$s_!nqtZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70a5ae59-ef96-41af-a388-2e4e32e2bb15_1373x770.png 848w, https://substackcdn.com/image/fetch/$s_!nqtZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70a5ae59-ef96-41af-a388-2e4e32e2bb15_1373x770.png 1272w, https://substackcdn.com/image/fetch/$s_!nqtZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70a5ae59-ef96-41af-a388-2e4e32e2bb15_1373x770.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nqtZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70a5ae59-ef96-41af-a388-2e4e32e2bb15_1373x770.png" width="1373" height="770" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/70a5ae59-ef96-41af-a388-2e4e32e2bb15_1373x770.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:770,&quot;width&quot;:1373,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:160013,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70a5ae59-ef96-41af-a388-2e4e32e2bb15_1373x770.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nqtZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70a5ae59-ef96-41af-a388-2e4e32e2bb15_1373x770.png 424w, https://substackcdn.com/image/fetch/$s_!nqtZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70a5ae59-ef96-41af-a388-2e4e32e2bb15_1373x770.png 848w, https://substackcdn.com/image/fetch/$s_!nqtZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70a5ae59-ef96-41af-a388-2e4e32e2bb15_1373x770.png 1272w, https://substackcdn.com/image/fetch/$s_!nqtZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70a5ae59-ef96-41af-a388-2e4e32e2bb15_1373x770.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Different workloads have different power characteristics. For example, while GB 6 Multi-threaded doesn&#8217;t pull a ton of power overall, it is a very bursty workload that can spike to over 150 watts; whereas a memory bandwidth test pulls over 105 watts in a sustained fashion.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6qB4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253233dd-d694-44a7-93b0-f453ea044849_1366x760.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6qB4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253233dd-d694-44a7-93b0-f453ea044849_1366x760.png 424w, https://substackcdn.com/image/fetch/$s_!6qB4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253233dd-d694-44a7-93b0-f453ea044849_1366x760.png 848w, https://substackcdn.com/image/fetch/$s_!6qB4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253233dd-d694-44a7-93b0-f453ea044849_1366x760.png 1272w, https://substackcdn.com/image/fetch/$s_!6qB4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253233dd-d694-44a7-93b0-f453ea044849_1366x760.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6qB4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253233dd-d694-44a7-93b0-f453ea044849_1366x760.png" width="1366" height="760" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/253233dd-d694-44a7-93b0-f453ea044849_1366x760.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:760,&quot;width&quot;:1366,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:146646,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253233dd-d694-44a7-93b0-f453ea044849_1366x760.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6qB4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253233dd-d694-44a7-93b0-f453ea044849_1366x760.png 424w, https://substackcdn.com/image/fetch/$s_!6qB4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253233dd-d694-44a7-93b0-f453ea044849_1366x760.png 848w, https://substackcdn.com/image/fetch/$s_!6qB4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253233dd-d694-44a7-93b0-f453ea044849_1366x760.png 1272w, https://substackcdn.com/image/fetch/$s_!6qB4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F253233dd-d694-44a7-93b0-f453ea044849_1366x760.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Looking at the performance versus power graph in Cinebench R24 MT, the SDX2E Extreme with 18 cores (12 Prime and 6 Performance cores) scores just over 1950 points in Cinebench R24 at about 105 watts INPP with the standard SDX2E with 12 cores (6 Prime and 6 Performance cores) scoring just over 1100 points at approximately 50 watts INPP.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FfVJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2977e3bd-a3b1-4d8f-9172-e2337a0b4478_1363x775.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FfVJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2977e3bd-a3b1-4d8f-9172-e2337a0b4478_1363x775.png 424w, https://substackcdn.com/image/fetch/$s_!FfVJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2977e3bd-a3b1-4d8f-9172-e2337a0b4478_1363x775.png 848w, https://substackcdn.com/image/fetch/$s_!FfVJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2977e3bd-a3b1-4d8f-9172-e2337a0b4478_1363x775.png 1272w, https://substackcdn.com/image/fetch/$s_!FfVJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2977e3bd-a3b1-4d8f-9172-e2337a0b4478_1363x775.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FfVJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2977e3bd-a3b1-4d8f-9172-e2337a0b4478_1363x775.png" width="1363" height="775" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2977e3bd-a3b1-4d8f-9172-e2337a0b4478_1363x775.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:775,&quot;width&quot;:1363,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:765606,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2977e3bd-a3b1-4d8f-9172-e2337a0b4478_1363x775.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FfVJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2977e3bd-a3b1-4d8f-9172-e2337a0b4478_1363x775.png 424w, https://substackcdn.com/image/fetch/$s_!FfVJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2977e3bd-a3b1-4d8f-9172-e2337a0b4478_1363x775.png 848w, https://substackcdn.com/image/fetch/$s_!FfVJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2977e3bd-a3b1-4d8f-9172-e2337a0b4478_1363x775.png 1272w, https://substackcdn.com/image/fetch/$s_!FfVJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2977e3bd-a3b1-4d8f-9172-e2337a0b4478_1363x775.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Qualcomm has also implemented a clock boosting scheme similar to Intel&#8217;s Turbo Boost where depending on the number of cores active in a cluster, the cluster will clock up or down accordingly.</p><p><strong>Editor&#8217;s Note (11/30/2025): </strong>Qualcomm got in touch with Chips and Cheese to clarify that their boosting algorithm is a cluster-level algorithm where the boosting behavior is independent for the 3 clusters; where as Intel&#8217;s Turbo Boost mechanism is a SoC-level algorithm that affects all cores on the package.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!A_8s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83bf50bf-8979-4f3f-8d3d-5ff720745468_1358x759.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!A_8s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83bf50bf-8979-4f3f-8d3d-5ff720745468_1358x759.png 424w, https://substackcdn.com/image/fetch/$s_!A_8s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83bf50bf-8979-4f3f-8d3d-5ff720745468_1358x759.png 848w, https://substackcdn.com/image/fetch/$s_!A_8s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83bf50bf-8979-4f3f-8d3d-5ff720745468_1358x759.png 1272w, https://substackcdn.com/image/fetch/$s_!A_8s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83bf50bf-8979-4f3f-8d3d-5ff720745468_1358x759.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!A_8s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83bf50bf-8979-4f3f-8d3d-5ff720745468_1358x759.png" width="1358" height="759" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/83bf50bf-8979-4f3f-8d3d-5ff720745468_1358x759.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:759,&quot;width&quot;:1358,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:312425,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/179173087?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83bf50bf-8979-4f3f-8d3d-5ff720745468_1358x759.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!A_8s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83bf50bf-8979-4f3f-8d3d-5ff720745468_1358x759.png 424w, https://substackcdn.com/image/fetch/$s_!A_8s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83bf50bf-8979-4f3f-8d3d-5ff720745468_1358x759.png 848w, https://substackcdn.com/image/fetch/$s_!A_8s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83bf50bf-8979-4f3f-8d3d-5ff720745468_1358x759.png 1272w, https://substackcdn.com/image/fetch/$s_!A_8s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83bf50bf-8979-4f3f-8d3d-5ff720745468_1358x759.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Qualcomm also highlighted the performance of the SDX2E when the laptop is on battery performance compared to the laptop on wall power.</p><h2>Conclusion</h2><p>Qualcomm has made significant advances with the SDX2E with regards to the SOC, GPU, and NPU. SDX2E is planned to hit shelves in the first half of 2026 and we can&#8217;t wait to get a system to test.</p><p>If you like the content then consider heading over to the <a href="https://www.patreon.com/ChipsandCheese">Patreon</a> or <a href="https://www.paypal.com/donate/?hosted_button_id=4EMPH66SBGVSQ">PayPal</a> if you want to toss a few bucks to Chips and Cheese. Also consider joining the <a href="https://discord.gg/TwVnRhxgY2">Discord</a>.</p>]]></content:encoded></item><item><title><![CDATA[Strix Halo’s Memory Subsystem: Tackling iGPU Challenges]]></title><description><![CDATA[Editor&#8217;s Note (11/2/2025): Due to an error in moving the article over from Google Docs to Substack, the &#8220;Balancing CPU and GPU Bandwidth Demands&#8221; section was missing some Cyberpunk 2077 data.]]></description><link>https://chipsandcheese.com/p/strix-halos-memory-subsystem-tackling</link><guid isPermaLink="false">https://chipsandcheese.com/p/strix-halos-memory-subsystem-tackling</guid><dc:creator><![CDATA[Chester Lam]]></dc:creator><pubDate>Fri, 31 Oct 2025 21:50:39 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Uwb0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8f4aa5e-9bdd-48c9-a117-7d5e4f94e963_1600x1065.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Editor&#8217;s Note (11/2/2025): </strong>Due to an error in moving the article over from Google Docs to Substack, the &#8220;Balancing CPU and GPU Bandwidth Demands&#8221; section was missing some Cyberpunk 2077 data. Apologizes for the mistake!</p><p>AMD&#8217;s Strix Halo aspires to deliver high CPU and GPU performance within a mobile device. Doing so presents the memory subsystem with a complicated set of demands. CPU applications are often latency sensitive with low bandwidth demands. GPU workloads are often latency tolerant and bandwidth hungry. Then, multitasking requires high memory capacity. Mobile devices need low power draw. Finally, the whole package has to fit within a price tag acceptable to consumers. Investigating how AMD dealt with those challenges should make for a good time.</p><h1>Credit</h1><p>ASUS has kindly sampled the ROG Flow Z13, which implements Strix Halo in a tablet form factor with 32 GB of LPDDR5X. They&#8217;ve made deep dives like this possible, and we greatly appreciate their support.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Uwb0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8f4aa5e-9bdd-48c9-a117-7d5e4f94e963_1600x1065.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Uwb0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8f4aa5e-9bdd-48c9-a117-7d5e4f94e963_1600x1065.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Uwb0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8f4aa5e-9bdd-48c9-a117-7d5e4f94e963_1600x1065.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Uwb0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8f4aa5e-9bdd-48c9-a117-7d5e4f94e963_1600x1065.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Uwb0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8f4aa5e-9bdd-48c9-a117-7d5e4f94e963_1600x1065.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Uwb0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8f4aa5e-9bdd-48c9-a117-7d5e4f94e963_1600x1065.jpeg" width="1456" height="969" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b8f4aa5e-9bdd-48c9-a117-7d5e4f94e963_1600x1065.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:969,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Uwb0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8f4aa5e-9bdd-48c9-a117-7d5e4f94e963_1600x1065.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Uwb0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8f4aa5e-9bdd-48c9-a117-7d5e4f94e963_1600x1065.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Uwb0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8f4aa5e-9bdd-48c9-a117-7d5e4f94e963_1600x1065.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Uwb0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8f4aa5e-9bdd-48c9-a117-7d5e4f94e963_1600x1065.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>RX 7600 results were provided by Azralee from the Chips and Cheese Discord.</p><h1>GPU</h1><p>Strix Halo&#8217;s GPU uses a similar cache setup to AMD&#8217;s older and smaller mobile chips. As on Strix Point and Hawk Point (Zen 4 mobile), Strix Halo&#8217;s GPU is split into two Shader Arrays. Each Shader Array has 256 KB of L1 mid-level cache, and a 2 MB L2 services the entire GPU. Latencies to those GPU-private caches are in line with other RDNA3 and RDNA3.5 implementations. AMD likely kept L2 capacity at 2 MB because a 32 MB memory side cache (Infinity Cache, or MALL) takes over as the GPU&#8217;s last level cache.The L2 only has to catch enough traffic to prevent the Infinity Cache from getting overwhelmed. The resulting cache setup is similar to the one in the RX 7600, a lower midrange RDNA3 discrete card.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!j9gt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69e76eac-5688-427c-8814-095e1b0e5229_1600x759.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!j9gt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69e76eac-5688-427c-8814-095e1b0e5229_1600x759.png 424w, https://substackcdn.com/image/fetch/$s_!j9gt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69e76eac-5688-427c-8814-095e1b0e5229_1600x759.png 848w, https://substackcdn.com/image/fetch/$s_!j9gt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69e76eac-5688-427c-8814-095e1b0e5229_1600x759.png 1272w, https://substackcdn.com/image/fetch/$s_!j9gt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69e76eac-5688-427c-8814-095e1b0e5229_1600x759.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!j9gt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69e76eac-5688-427c-8814-095e1b0e5229_1600x759.png" width="1456" height="691" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/69e76eac-5688-427c-8814-095e1b0e5229_1600x759.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:691,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!j9gt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69e76eac-5688-427c-8814-095e1b0e5229_1600x759.png 424w, https://substackcdn.com/image/fetch/$s_!j9gt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69e76eac-5688-427c-8814-095e1b0e5229_1600x759.png 848w, https://substackcdn.com/image/fetch/$s_!j9gt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69e76eac-5688-427c-8814-095e1b0e5229_1600x759.png 1272w, https://substackcdn.com/image/fetch/$s_!j9gt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69e76eac-5688-427c-8814-095e1b0e5229_1600x759.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The Infinity Cache on Strix Halo has slightly higher latency compared to implementations in AMD&#8217;s discrete cards. DRAM latency from the GPU is higher as well. Compared to AMD&#8217;s other mobile CPUs with iGPUs though, the 32 MB Infinity Cache offers a large cache capacity increase.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k4OO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd84bd84e-9e8f-433e-87e9-25b6a35a5447_757x227.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k4OO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd84bd84e-9e8f-433e-87e9-25b6a35a5447_757x227.png 424w, https://substackcdn.com/image/fetch/$s_!k4OO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd84bd84e-9e8f-433e-87e9-25b6a35a5447_757x227.png 848w, https://substackcdn.com/image/fetch/$s_!k4OO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd84bd84e-9e8f-433e-87e9-25b6a35a5447_757x227.png 1272w, https://substackcdn.com/image/fetch/$s_!k4OO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd84bd84e-9e8f-433e-87e9-25b6a35a5447_757x227.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k4OO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd84bd84e-9e8f-433e-87e9-25b6a35a5447_757x227.png" width="757" height="227" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d84bd84e-9e8f-433e-87e9-25b6a35a5447_757x227.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:227,&quot;width&quot;:757,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k4OO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd84bd84e-9e8f-433e-87e9-25b6a35a5447_757x227.png 424w, https://substackcdn.com/image/fetch/$s_!k4OO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd84bd84e-9e8f-433e-87e9-25b6a35a5447_757x227.png 848w, https://substackcdn.com/image/fetch/$s_!k4OO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd84bd84e-9e8f-433e-87e9-25b6a35a5447_757x227.png 1272w, https://substackcdn.com/image/fetch/$s_!k4OO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd84bd84e-9e8f-433e-87e9-25b6a35a5447_757x227.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Nemes&#8217;s Vulkan bandwidth test achieves just under 1 TB/s from Infinity Cache. The figures align well with performance counter data. Taken together with the chip&#8217;s 2 GHz FCLK, bandwidth test results suggest the GPU has a 512B/cycle path to the interconnect. If so, each of the GPU&#8217;s eight Infinity Fabric endpoints has a 64B/cycle link.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zex1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d28eb41-c911-43e2-a130-0062677d4282_1600x812.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zex1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d28eb41-c911-43e2-a130-0062677d4282_1600x812.png 424w, https://substackcdn.com/image/fetch/$s_!zex1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d28eb41-c911-43e2-a130-0062677d4282_1600x812.png 848w, https://substackcdn.com/image/fetch/$s_!zex1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d28eb41-c911-43e2-a130-0062677d4282_1600x812.png 1272w, https://substackcdn.com/image/fetch/$s_!zex1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d28eb41-c911-43e2-a130-0062677d4282_1600x812.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zex1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d28eb41-c911-43e2-a130-0062677d4282_1600x812.png" width="1456" height="739" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6d28eb41-c911-43e2-a130-0062677d4282_1600x812.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:739,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zex1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d28eb41-c911-43e2-a130-0062677d4282_1600x812.png 424w, https://substackcdn.com/image/fetch/$s_!zex1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d28eb41-c911-43e2-a130-0062677d4282_1600x812.png 848w, https://substackcdn.com/image/fetch/$s_!zex1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d28eb41-c911-43e2-a130-0062677d4282_1600x812.png 1272w, https://substackcdn.com/image/fetch/$s_!zex1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d28eb41-c911-43e2-a130-0062677d4282_1600x812.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As a memory side cache, Infinity Cache can theoretically handle any access to physical addresses backed by DRAM. In an earlier interview with Cheese (George), AMD indicated that Infinity Cache was focused on the GPU, and that its behavior could change with firmware releases. Some of that change has happened already. When I first started testing Strix Halo just after Hot Chips 2025, results from my OpenCL microbenchmarks reflected Infinity Cache&#8217;s presence. I used that OpenCL code to figure out Data Fabric performance events. But PMU data collected from games suggested Infinity Cache wasn&#8217;t used once a game went into the background. Hardware doesn&#8217;t know whether a process is running in the foreground or background. That&#8217;s something the operating system knows, and that info would have to be communicated to hardware via drivers. Therefore, Infinity Cache policy can change on the fly from software control.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mk7F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80d46a9-6c53-4edd-bfe4-134d0fdd4209_1180x400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mk7F!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80d46a9-6c53-4edd-bfe4-134d0fdd4209_1180x400.png 424w, https://substackcdn.com/image/fetch/$s_!mk7F!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80d46a9-6c53-4edd-bfe4-134d0fdd4209_1180x400.png 848w, https://substackcdn.com/image/fetch/$s_!mk7F!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80d46a9-6c53-4edd-bfe4-134d0fdd4209_1180x400.png 1272w, https://substackcdn.com/image/fetch/$s_!mk7F!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80d46a9-6c53-4edd-bfe4-134d0fdd4209_1180x400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mk7F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80d46a9-6c53-4edd-bfe4-134d0fdd4209_1180x400.png" width="1180" height="400" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a80d46a9-6c53-4edd-bfe4-134d0fdd4209_1180x400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:1180,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mk7F!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80d46a9-6c53-4edd-bfe4-134d0fdd4209_1180x400.png 424w, https://substackcdn.com/image/fetch/$s_!mk7F!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80d46a9-6c53-4edd-bfe4-134d0fdd4209_1180x400.png 848w, https://substackcdn.com/image/fetch/$s_!mk7F!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80d46a9-6c53-4edd-bfe4-134d0fdd4209_1180x400.png 1272w, https://substackcdn.com/image/fetch/$s_!mk7F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa80d46a9-6c53-4edd-bfe4-134d0fdd4209_1180x400.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">From early data collected on 9/1/2025</figcaption></figure></div><p>At that time, Nemes&#8217;s Vulkan-based code didn&#8217;t reflect Infinity Cache&#8217;s presence. PMU data showed a match between CS and UMC traffic, indicating the microbenchmark wasn&#8217;t taking advantage of Infinity Cache rather than the cache struggling with the access pattern. I was in the middle of investigating what Infinity Cache did or didn&#8217;t apply to when Windows updated. Then, foreground/background status no longer had any effect. Nemes&#8217;s Vulkan code was also able to observe the Infinity Cache.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xt-3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9256a18c-10ba-484e-8b27-3f641c508761_1600x813.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xt-3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9256a18c-10ba-484e-8b27-3f641c508761_1600x813.png 424w, https://substackcdn.com/image/fetch/$s_!xt-3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9256a18c-10ba-484e-8b27-3f641c508761_1600x813.png 848w, https://substackcdn.com/image/fetch/$s_!xt-3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9256a18c-10ba-484e-8b27-3f641c508761_1600x813.png 1272w, https://substackcdn.com/image/fetch/$s_!xt-3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9256a18c-10ba-484e-8b27-3f641c508761_1600x813.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xt-3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9256a18c-10ba-484e-8b27-3f641c508761_1600x813.png" width="1456" height="740" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9256a18c-10ba-484e-8b27-3f641c508761_1600x813.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:740,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xt-3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9256a18c-10ba-484e-8b27-3f641c508761_1600x813.png 424w, https://substackcdn.com/image/fetch/$s_!xt-3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9256a18c-10ba-484e-8b27-3f641c508761_1600x813.png 848w, https://substackcdn.com/image/fetch/$s_!xt-3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9256a18c-10ba-484e-8b27-3f641c508761_1600x813.png 1272w, https://substackcdn.com/image/fetch/$s_!xt-3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9256a18c-10ba-484e-8b27-3f641c508761_1600x813.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Early observations on Infinity Cache behavior aren&#8217;t relevant today, but they do show Infinity Cache&#8217;s behavior is influenced by factors beyond a memory request&#8217;s origination point. Not all GPU requests install into the cache, and AMD can change cache policy on the fly. AMD could tune behavior with future updates too.</p><p>One early observation from OpenCL remained consistent though. Infinity Cache isn&#8217;t used for a buffer created with the CL_MEM_ALLOC_HOST_PTR flag and managed with zero-copy map/unmap APIs. CL_MEM_ALLOC_HOST_PTR requests an allocation from host-visible memory. On systems with discrete GPUs, AMD tends to handle that by allocating memory from DRAM attached to the CPU.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LA-3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe8d7f0-eaa7-4a7e-b28b-7ecf73f3fba3_1600x759.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LA-3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe8d7f0-eaa7-4a7e-b28b-7ecf73f3fba3_1600x759.png 424w, https://substackcdn.com/image/fetch/$s_!LA-3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe8d7f0-eaa7-4a7e-b28b-7ecf73f3fba3_1600x759.png 848w, https://substackcdn.com/image/fetch/$s_!LA-3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe8d7f0-eaa7-4a7e-b28b-7ecf73f3fba3_1600x759.png 1272w, https://substackcdn.com/image/fetch/$s_!LA-3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe8d7f0-eaa7-4a7e-b28b-7ecf73f3fba3_1600x759.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LA-3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe8d7f0-eaa7-4a7e-b28b-7ecf73f3fba3_1600x759.png" width="1456" height="691" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bbe8d7f0-eaa7-4a7e-b28b-7ecf73f3fba3_1600x759.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:691,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LA-3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe8d7f0-eaa7-4a7e-b28b-7ecf73f3fba3_1600x759.png 424w, https://substackcdn.com/image/fetch/$s_!LA-3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe8d7f0-eaa7-4a7e-b28b-7ecf73f3fba3_1600x759.png 848w, https://substackcdn.com/image/fetch/$s_!LA-3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe8d7f0-eaa7-4a7e-b28b-7ecf73f3fba3_1600x759.png 1272w, https://substackcdn.com/image/fetch/$s_!LA-3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbe8d7f0-eaa7-4a7e-b28b-7ecf73f3fba3_1600x759.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Intuitively, that flag shouldn&#8217;t make a difference on integrated GPUs. I&#8217;m not sure why it affects Infinity Cache behavior. Perhaps Strix Halo splits address ranges for the CPU and GPU under the hood, and the CPU&#8217;s address ranges aren&#8217;t cacheable from the Infinity Cache&#8217;s perspective.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!r-Ge!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13473e9c-efe3-4c4c-ad40-3700c167121f_1600x786.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r-Ge!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13473e9c-efe3-4c4c-ad40-3700c167121f_1600x786.png 424w, https://substackcdn.com/image/fetch/$s_!r-Ge!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13473e9c-efe3-4c4c-ad40-3700c167121f_1600x786.png 848w, https://substackcdn.com/image/fetch/$s_!r-Ge!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13473e9c-efe3-4c4c-ad40-3700c167121f_1600x786.png 1272w, https://substackcdn.com/image/fetch/$s_!r-Ge!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13473e9c-efe3-4c4c-ad40-3700c167121f_1600x786.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r-Ge!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13473e9c-efe3-4c4c-ad40-3700c167121f_1600x786.png" width="1456" height="715" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/13473e9c-efe3-4c4c-ad40-3700c167121f_1600x786.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:715,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!r-Ge!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13473e9c-efe3-4c4c-ad40-3700c167121f_1600x786.png 424w, https://substackcdn.com/image/fetch/$s_!r-Ge!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13473e9c-efe3-4c4c-ad40-3700c167121f_1600x786.png 848w, https://substackcdn.com/image/fetch/$s_!r-Ge!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13473e9c-efe3-4c4c-ad40-3700c167121f_1600x786.png 1272w, https://substackcdn.com/image/fetch/$s_!r-Ge!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13473e9c-efe3-4c4c-ad40-3700c167121f_1600x786.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>AMD&#8217;s discrete Radeon RX 9070 shows similar behavior, with Infinity Cache not being used for host-side memory. Latency to host memory goes up to nearly a microsecond on RX 9070, while it remains unchanged on Strix Halo. Integrated GPUs have an advantage with zero-copy compute code, and it shows.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JonC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F699972d4-d810-4dfd-a876-3a177e52453c_746x148.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JonC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F699972d4-d810-4dfd-a876-3a177e52453c_746x148.png 424w, https://substackcdn.com/image/fetch/$s_!JonC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F699972d4-d810-4dfd-a876-3a177e52453c_746x148.png 848w, https://substackcdn.com/image/fetch/$s_!JonC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F699972d4-d810-4dfd-a876-3a177e52453c_746x148.png 1272w, https://substackcdn.com/image/fetch/$s_!JonC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F699972d4-d810-4dfd-a876-3a177e52453c_746x148.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JonC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F699972d4-d810-4dfd-a876-3a177e52453c_746x148.png" width="746" height="148" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/699972d4-d810-4dfd-a876-3a177e52453c_746x148.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:148,&quot;width&quot;:746,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JonC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F699972d4-d810-4dfd-a876-3a177e52453c_746x148.png 424w, https://substackcdn.com/image/fetch/$s_!JonC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F699972d4-d810-4dfd-a876-3a177e52453c_746x148.png 848w, https://substackcdn.com/image/fetch/$s_!JonC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F699972d4-d810-4dfd-a876-3a177e52453c_746x148.png 1272w, https://substackcdn.com/image/fetch/$s_!JonC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F699972d4-d810-4dfd-a876-3a177e52453c_746x148.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>To further check zero-copy behavior, I have a test that allocates a 256 MB buffer using OpenCL&#8217;s Shared Virtual Memory APIs and only modifies a single 32-bit value. Strix Halo supports fine-grained buffer sharing like other recent AMD GPUs, meaning applications can use results generated from the GPU without calling map/unmap functions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Slh8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd285ac0e-d56c-4adb-ad7e-dc43281f2c10_1471x794.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Slh8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd285ac0e-d56c-4adb-ad7e-dc43281f2c10_1471x794.png 424w, https://substackcdn.com/image/fetch/$s_!Slh8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd285ac0e-d56c-4adb-ad7e-dc43281f2c10_1471x794.png 848w, https://substackcdn.com/image/fetch/$s_!Slh8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd285ac0e-d56c-4adb-ad7e-dc43281f2c10_1471x794.png 1272w, https://substackcdn.com/image/fetch/$s_!Slh8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd285ac0e-d56c-4adb-ad7e-dc43281f2c10_1471x794.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Slh8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd285ac0e-d56c-4adb-ad7e-dc43281f2c10_1471x794.png" width="1456" height="786" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d285ac0e-d56c-4adb-ad7e-dc43281f2c10_1471x794.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:786,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Slh8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd285ac0e-d56c-4adb-ad7e-dc43281f2c10_1471x794.png 424w, https://substackcdn.com/image/fetch/$s_!Slh8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd285ac0e-d56c-4adb-ad7e-dc43281f2c10_1471x794.png 848w, https://substackcdn.com/image/fetch/$s_!Slh8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd285ac0e-d56c-4adb-ad7e-dc43281f2c10_1471x794.png 1272w, https://substackcdn.com/image/fetch/$s_!Slh8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd285ac0e-d56c-4adb-ad7e-dc43281f2c10_1471x794.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Strix Halo shows low latencies in line with zero-copy behavior. It&#8217;s worth noting that not all integrated GPUs can avoid a copy under the hood.</p><p>Copy APIs like clEnqueueReadBuffer and clEnqueueWriteBuffer are still relevant, because they&#8217;re the traditional way to work with discrete GPUs. Those APIs often use the copy queue and DMA engines, which handle data movement without involving general purpose compute units. Strix Halo can achieve high copy bandwidth in the CPU to GPU direction, but not the other way around.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!goLS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f91989e-707d-432c-a025-044160346258_1108x583.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!goLS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f91989e-707d-432c-a025-044160346258_1108x583.png 424w, https://substackcdn.com/image/fetch/$s_!goLS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f91989e-707d-432c-a025-044160346258_1108x583.png 848w, https://substackcdn.com/image/fetch/$s_!goLS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f91989e-707d-432c-a025-044160346258_1108x583.png 1272w, https://substackcdn.com/image/fetch/$s_!goLS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f91989e-707d-432c-a025-044160346258_1108x583.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!goLS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f91989e-707d-432c-a025-044160346258_1108x583.png" width="1108" height="583" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1f91989e-707d-432c-a025-044160346258_1108x583.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:583,&quot;width&quot;:1108,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!goLS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f91989e-707d-432c-a025-044160346258_1108x583.png 424w, https://substackcdn.com/image/fetch/$s_!goLS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f91989e-707d-432c-a025-044160346258_1108x583.png 848w, https://substackcdn.com/image/fetch/$s_!goLS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f91989e-707d-432c-a025-044160346258_1108x583.png 1272w, https://substackcdn.com/image/fetch/$s_!goLS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f91989e-707d-432c-a025-044160346258_1108x583.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Performance counter data suggests copies to the GPU don&#8217;t go through the Infinity Cache. During a copy, the shared memory controllers should observe both a read from CPU-side memory and a write to GPU-side memory. But there&#8217;s nowhere near 100% overhead compared to software measurements.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KuDp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3409a0aa-e598-4435-b185-fb40d8dd58ef_1116x580.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KuDp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3409a0aa-e598-4435-b185-fb40d8dd58ef_1116x580.png 424w, https://substackcdn.com/image/fetch/$s_!KuDp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3409a0aa-e598-4435-b185-fb40d8dd58ef_1116x580.png 848w, https://substackcdn.com/image/fetch/$s_!KuDp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3409a0aa-e598-4435-b185-fb40d8dd58ef_1116x580.png 1272w, https://substackcdn.com/image/fetch/$s_!KuDp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3409a0aa-e598-4435-b185-fb40d8dd58ef_1116x580.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KuDp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3409a0aa-e598-4435-b185-fb40d8dd58ef_1116x580.png" width="1116" height="580" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3409a0aa-e598-4435-b185-fb40d8dd58ef_1116x580.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:580,&quot;width&quot;:1116,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KuDp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3409a0aa-e598-4435-b185-fb40d8dd58ef_1116x580.png 424w, https://substackcdn.com/image/fetch/$s_!KuDp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3409a0aa-e598-4435-b185-fb40d8dd58ef_1116x580.png 848w, https://substackcdn.com/image/fetch/$s_!KuDp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3409a0aa-e598-4435-b185-fb40d8dd58ef_1116x580.png 1272w, https://substackcdn.com/image/fetch/$s_!KuDp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3409a0aa-e598-4435-b185-fb40d8dd58ef_1116x580.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Bandwidth is lower in the other direction, but curiously CS-level bandwidth is similar. The memory controllers see less bandwidth, indicating some requests were handled on-chip, likely by Infinity Cache. Curiously, there&#8217;s way more than 100% overhead when comparing PMU data to software-visible copy bandwidth.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9yZC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c371c8-c0cf-4e48-910d-6f6b006d2b99_1118x583.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9yZC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c371c8-c0cf-4e48-910d-6f6b006d2b99_1118x583.png 424w, https://substackcdn.com/image/fetch/$s_!9yZC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c371c8-c0cf-4e48-910d-6f6b006d2b99_1118x583.png 848w, https://substackcdn.com/image/fetch/$s_!9yZC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c371c8-c0cf-4e48-910d-6f6b006d2b99_1118x583.png 1272w, https://substackcdn.com/image/fetch/$s_!9yZC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c371c8-c0cf-4e48-910d-6f6b006d2b99_1118x583.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9yZC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c371c8-c0cf-4e48-910d-6f6b006d2b99_1118x583.png" width="1118" height="583" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21c371c8-c0cf-4e48-910d-6f6b006d2b99_1118x583.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:583,&quot;width&quot;:1118,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9yZC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c371c8-c0cf-4e48-910d-6f6b006d2b99_1118x583.png 424w, https://substackcdn.com/image/fetch/$s_!9yZC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c371c8-c0cf-4e48-910d-6f6b006d2b99_1118x583.png 848w, https://substackcdn.com/image/fetch/$s_!9yZC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c371c8-c0cf-4e48-910d-6f6b006d2b99_1118x583.png 1272w, https://substackcdn.com/image/fetch/$s_!9yZC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c371c8-c0cf-4e48-910d-6f6b006d2b99_1118x583.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>CPU</h1><p>Strix Halo&#8217;s CPU side superficially resembles AMD&#8217;s flagship desktop parts, with 16 Zen 5 cores split across two Core Complex Dies (CCDs). However, these CCDs use TSMC&#8217;s InFO_oS for connectivity to the IO die rather than on-PCB traces. The CCD has 32B/cycle of bandwidth to the system in both the read and write directions.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4HRw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4baf47ea-53a4-4ab8-b015-c2f7fd4632cd_600x137.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4HRw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4baf47ea-53a4-4ab8-b015-c2f7fd4632cd_600x137.png 424w, https://substackcdn.com/image/fetch/$s_!4HRw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4baf47ea-53a4-4ab8-b015-c2f7fd4632cd_600x137.png 848w, https://substackcdn.com/image/fetch/$s_!4HRw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4baf47ea-53a4-4ab8-b015-c2f7fd4632cd_600x137.png 1272w, https://substackcdn.com/image/fetch/$s_!4HRw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4baf47ea-53a4-4ab8-b015-c2f7fd4632cd_600x137.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4HRw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4baf47ea-53a4-4ab8-b015-c2f7fd4632cd_600x137.png" width="600" height="137" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4baf47ea-53a4-4ab8-b015-c2f7fd4632cd_600x137.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:137,&quot;width&quot;:600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4HRw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4baf47ea-53a4-4ab8-b015-c2f7fd4632cd_600x137.png 424w, https://substackcdn.com/image/fetch/$s_!4HRw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4baf47ea-53a4-4ab8-b015-c2f7fd4632cd_600x137.png 848w, https://substackcdn.com/image/fetch/$s_!4HRw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4baf47ea-53a4-4ab8-b015-c2f7fd4632cd_600x137.png 1272w, https://substackcdn.com/image/fetch/$s_!4HRw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4baf47ea-53a4-4ab8-b015-c2f7fd4632cd_600x137.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Therefore, Strix Halo&#8217;s CCDs have more bandwidth at the die boundary than their desktop counterparts, but only in the write direction. It&#8217;s an advantage that&#8217;s likely to have minimal impact because reads often outnumber writes by a large margin.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lMbE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5574f149-b579-4437-89f9-8ebf8012b210_1309x602.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lMbE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5574f149-b579-4437-89f9-8ebf8012b210_1309x602.png 424w, https://substackcdn.com/image/fetch/$s_!lMbE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5574f149-b579-4437-89f9-8ebf8012b210_1309x602.png 848w, https://substackcdn.com/image/fetch/$s_!lMbE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5574f149-b579-4437-89f9-8ebf8012b210_1309x602.png 1272w, https://substackcdn.com/image/fetch/$s_!lMbE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5574f149-b579-4437-89f9-8ebf8012b210_1309x602.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lMbE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5574f149-b579-4437-89f9-8ebf8012b210_1309x602.png" width="1309" height="602" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5574f149-b579-4437-89f9-8ebf8012b210_1309x602.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:602,&quot;width&quot;:1309,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lMbE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5574f149-b579-4437-89f9-8ebf8012b210_1309x602.png 424w, https://substackcdn.com/image/fetch/$s_!lMbE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5574f149-b579-4437-89f9-8ebf8012b210_1309x602.png 848w, https://substackcdn.com/image/fetch/$s_!lMbE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5574f149-b579-4437-89f9-8ebf8012b210_1309x602.png 1272w, https://substackcdn.com/image/fetch/$s_!lMbE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5574f149-b579-4437-89f9-8ebf8012b210_1309x602.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Other CPU chiplet designs have more bandwidth at die boundaries, including the Compute Tile on Intel&#8217;s Meteor Lake and AMD&#8217;s own &#8220;GMI-Wide&#8221; configuration. GMI-Wide uses two links between the CCD and IO die to maximize cross-die bandwidth in lower core count server chips. Even though GMI-Wide doesn&#8217;t use advanced packaging, it has significantly more cross-die bandwidth than Strix Halo.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xJDB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7078095f-9ff2-4516-b9ca-0beda352a02d_1321x705.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xJDB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7078095f-9ff2-4516-b9ca-0beda352a02d_1321x705.png 424w, https://substackcdn.com/image/fetch/$s_!xJDB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7078095f-9ff2-4516-b9ca-0beda352a02d_1321x705.png 848w, https://substackcdn.com/image/fetch/$s_!xJDB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7078095f-9ff2-4516-b9ca-0beda352a02d_1321x705.png 1272w, https://substackcdn.com/image/fetch/$s_!xJDB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7078095f-9ff2-4516-b9ca-0beda352a02d_1321x705.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xJDB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7078095f-9ff2-4516-b9ca-0beda352a02d_1321x705.png" width="1321" height="705" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7078095f-9ff2-4516-b9ca-0beda352a02d_1321x705.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:705,&quot;width&quot;:1321,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xJDB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7078095f-9ff2-4516-b9ca-0beda352a02d_1321x705.png 424w, https://substackcdn.com/image/fetch/$s_!xJDB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7078095f-9ff2-4516-b9ca-0beda352a02d_1321x705.png 848w, https://substackcdn.com/image/fetch/$s_!xJDB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7078095f-9ff2-4516-b9ca-0beda352a02d_1321x705.png 1272w, https://substackcdn.com/image/fetch/$s_!xJDB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7078095f-9ff2-4516-b9ca-0beda352a02d_1321x705.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Add = adding a constant to an array, creating a read-modify-write pattern with an equal amount of reads and writes. NT Write = non-temporal writes. These bypass cache and in many CPUs trigger a special case that avoids read-for-ownership for cachelines that are entirely overwritten</figcaption></figure></div><p>In a loaded latency test with reads, a Strix Halo CCD can reach high bandwidth levels at lower latency than standard GMI-Narrow CCDs. Part of that is likely down to its high bandwidth LPDDR5X setup, which a single CCD can&#8217;t come close to saturating. But that advantage doesn&#8217;t come through until bandwidth loads pass 45-55 GB/s. Before that, LPDDR5X&#8217;s high baseline latency puts Strix Halo at a disadvantage. At very high bandwidth load, Intel Meteor Lake&#8217;s higher cross-die bandwidth keeps it ahead. AMD&#8217;s GMI-Wide setup shows what a bandwidth-focused cross-die link can do, providing excellent bandwidth at low latency.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wpBY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa000e417-00b9-4ed5-9da4-96164a1baceb_1600x735.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wpBY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa000e417-00b9-4ed5-9da4-96164a1baceb_1600x735.png 424w, https://substackcdn.com/image/fetch/$s_!wpBY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa000e417-00b9-4ed5-9da4-96164a1baceb_1600x735.png 848w, https://substackcdn.com/image/fetch/$s_!wpBY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa000e417-00b9-4ed5-9da4-96164a1baceb_1600x735.png 1272w, https://substackcdn.com/image/fetch/$s_!wpBY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa000e417-00b9-4ed5-9da4-96164a1baceb_1600x735.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wpBY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa000e417-00b9-4ed5-9da4-96164a1baceb_1600x735.png" width="1456" height="669" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a000e417-00b9-4ed5-9da4-96164a1baceb_1600x735.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:669,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wpBY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa000e417-00b9-4ed5-9da4-96164a1baceb_1600x735.png 424w, https://substackcdn.com/image/fetch/$s_!wpBY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa000e417-00b9-4ed5-9da4-96164a1baceb_1600x735.png 848w, https://substackcdn.com/image/fetch/$s_!wpBY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa000e417-00b9-4ed5-9da4-96164a1baceb_1600x735.png 1272w, https://substackcdn.com/image/fetch/$s_!wpBY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa000e417-00b9-4ed5-9da4-96164a1baceb_1600x735.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Bringing both CCDs into play gives Strix Halo a lead over Meteor Lake. I&#8217;m starting the test by placing bandwidth load on CCD1 while running the latency test on CCD0. That gives lower latency at bandwidth loads below 60 GB/s because contention at the CCD interface is taken out of the picture. Latency does increase as I spread bandwidth load across both dies, and rises beyond 200 ns as the test approaches die-to-die bandwidth limits. However, a read-only pattern is still limited by cross-die bandwidth and falls far short of the 256 GB/s that the LPDDR5X setup is theoretically capable of.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ckIK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e6e686c-39be-491e-997d-99a49a7f7aff_1457x758.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ckIK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e6e686c-39be-491e-997d-99a49a7f7aff_1457x758.png 424w, https://substackcdn.com/image/fetch/$s_!ckIK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e6e686c-39be-491e-997d-99a49a7f7aff_1457x758.png 848w, https://substackcdn.com/image/fetch/$s_!ckIK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e6e686c-39be-491e-997d-99a49a7f7aff_1457x758.png 1272w, https://substackcdn.com/image/fetch/$s_!ckIK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e6e686c-39be-491e-997d-99a49a7f7aff_1457x758.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ckIK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e6e686c-39be-491e-997d-99a49a7f7aff_1457x758.png" width="1456" height="757" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3e6e686c-39be-491e-997d-99a49a7f7aff_1457x758.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:757,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ckIK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e6e686c-39be-491e-997d-99a49a7f7aff_1457x758.png 424w, https://substackcdn.com/image/fetch/$s_!ckIK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e6e686c-39be-491e-997d-99a49a7f7aff_1457x758.png 848w, https://substackcdn.com/image/fetch/$s_!ckIK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e6e686c-39be-491e-997d-99a49a7f7aff_1457x758.png 1272w, https://substackcdn.com/image/fetch/$s_!ckIK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e6e686c-39be-491e-997d-99a49a7f7aff_1457x758.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Advanced packaging may provide latency benefits too. Regular AMD CCDs use SerDes (serializer-deserializer) blocks, which convert signals for transport over lower quality PCB traces. Zen 2&#8217;s Infinity Fabric On-Package (IFOP) SerDes for example uses 32 transmit and 40 receive lanes running at a very high clock. Forwarded clock signals per lane data bundle help tackle clock skew that comes up with high speed parallel transmission over wires of unequal lengths. CRC helps ensure data integrity.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JCVU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44778945-0073-42eb-8b1d-d3ff2dcace9f_860x709.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JCVU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44778945-0073-42eb-8b1d-d3ff2dcace9f_860x709.png 424w, https://substackcdn.com/image/fetch/$s_!JCVU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44778945-0073-42eb-8b1d-d3ff2dcace9f_860x709.png 848w, https://substackcdn.com/image/fetch/$s_!JCVU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44778945-0073-42eb-8b1d-d3ff2dcace9f_860x709.png 1272w, https://substackcdn.com/image/fetch/$s_!JCVU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44778945-0073-42eb-8b1d-d3ff2dcace9f_860x709.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JCVU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44778945-0073-42eb-8b1d-d3ff2dcace9f_860x709.png" width="860" height="709" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/44778945-0073-42eb-8b1d-d3ff2dcace9f_860x709.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:709,&quot;width&quot;:860,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JCVU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44778945-0073-42eb-8b1d-d3ff2dcace9f_860x709.png 424w, https://substackcdn.com/image/fetch/$s_!JCVU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44778945-0073-42eb-8b1d-d3ff2dcace9f_860x709.png 848w, https://substackcdn.com/image/fetch/$s_!JCVU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44778945-0073-42eb-8b1d-d3ff2dcace9f_860x709.png 1272w, https://substackcdn.com/image/fetch/$s_!JCVU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44778945-0073-42eb-8b1d-d3ff2dcace9f_860x709.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>All of that adds power and latency overhead. Strix Halo&#8217;s InFO_oS packaging doesn&#8217;t require SerDes. But any latency advantage is difficult to observe in practice. DRAM requests are the most common type of off-CCD traffic. High LPDDR5X latency masks any latency advantage when looking at DRAM requests, as shown above. Cache coherency traffic is another form of off-CCD traffic, and doesn&#8217;t involve DRAM. However, testing that with a &#8220;core to core latency&#8221; test that bounces cachelines between core pairs also doesn&#8217;t provide favorable results for Strix Halo.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P1G_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ce08f2-1ee4-4034-bec0-e0ca0d53c49c_1600x784.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P1G_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ce08f2-1ee4-4034-bec0-e0ca0d53c49c_1600x784.png 424w, https://substackcdn.com/image/fetch/$s_!P1G_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ce08f2-1ee4-4034-bec0-e0ca0d53c49c_1600x784.png 848w, https://substackcdn.com/image/fetch/$s_!P1G_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ce08f2-1ee4-4034-bec0-e0ca0d53c49c_1600x784.png 1272w, https://substackcdn.com/image/fetch/$s_!P1G_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ce08f2-1ee4-4034-bec0-e0ca0d53c49c_1600x784.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P1G_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ce08f2-1ee4-4034-bec0-e0ca0d53c49c_1600x784.png" width="1456" height="713" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/81ce08f2-1ee4-4034-bec0-e0ca0d53c49c_1600x784.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:713,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P1G_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ce08f2-1ee4-4034-bec0-e0ca0d53c49c_1600x784.png 424w, https://substackcdn.com/image/fetch/$s_!P1G_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ce08f2-1ee4-4034-bec0-e0ca0d53c49c_1600x784.png 848w, https://substackcdn.com/image/fetch/$s_!P1G_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ce08f2-1ee4-4034-bec0-e0ca0d53c49c_1600x784.png 1272w, https://substackcdn.com/image/fetch/$s_!P1G_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81ce08f2-1ee4-4034-bec0-e0ca0d53c49c_1600x784.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">A run that produced good cross-CCX latencies</figcaption></figure></div><p>AMD handles cross-CCX cache coherency at Coherent Stations (CS-es) that sit right in front of the memory controllers. Memory traffic is interleaved across memory channels and thus CS instances based on their physical address. I try hitting different physical addresses by testing with various cacheline offsets into a 4 KB page, which gives me different combinations of L3 slices and memory controller + CS pairs. Values within a single run reflect variation based on the tested core pair, while different runs display variation from different memory subsystem blocks owning the tested address.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7xPq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec243735-86d4-40db-8231-f4aef065f874_1600x784.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7xPq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec243735-86d4-40db-8231-f4aef065f874_1600x784.png 424w, https://substackcdn.com/image/fetch/$s_!7xPq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec243735-86d4-40db-8231-f4aef065f874_1600x784.png 848w, https://substackcdn.com/image/fetch/$s_!7xPq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec243735-86d4-40db-8231-f4aef065f874_1600x784.png 1272w, https://substackcdn.com/image/fetch/$s_!7xPq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec243735-86d4-40db-8231-f4aef065f874_1600x784.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7xPq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec243735-86d4-40db-8231-f4aef065f874_1600x784.png" width="1456" height="713" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec243735-86d4-40db-8231-f4aef065f874_1600x784.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:713,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7xPq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec243735-86d4-40db-8231-f4aef065f874_1600x784.png 424w, https://substackcdn.com/image/fetch/$s_!7xPq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec243735-86d4-40db-8231-f4aef065f874_1600x784.png 848w, https://substackcdn.com/image/fetch/$s_!7xPq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec243735-86d4-40db-8231-f4aef065f874_1600x784.png 1272w, https://substackcdn.com/image/fetch/$s_!7xPq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec243735-86d4-40db-8231-f4aef065f874_1600x784.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">A run on the worse end with respect to cross-CCX latencies, likely hitting a CS farther away from the CPU endpoints</figcaption></figure></div><p>Cross-CCX latencies on Strix Halo land in the 100-120 ns range depending on the location of the tested core pair, responsible L3 slice, and responsible CS. It&#8217;s significantly higher on typical desktop systems or prior mobile chips from AMD. For example, the Ryzen 9 9900X tends to have cross-CCX latencies in the 80-90 ns range, which is in line with prior Zen generations. It&#8217;s about 20 ns faster than Strix Halo.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_RZY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0308ce2-dd55-44d3-873e-c3c297da1337_1001x521.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_RZY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0308ce2-dd55-44d3-873e-c3c297da1337_1001x521.png 424w, https://substackcdn.com/image/fetch/$s_!_RZY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0308ce2-dd55-44d3-873e-c3c297da1337_1001x521.png 848w, https://substackcdn.com/image/fetch/$s_!_RZY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0308ce2-dd55-44d3-873e-c3c297da1337_1001x521.png 1272w, https://substackcdn.com/image/fetch/$s_!_RZY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0308ce2-dd55-44d3-873e-c3c297da1337_1001x521.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_RZY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0308ce2-dd55-44d3-873e-c3c297da1337_1001x521.png" width="1001" height="521" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e0308ce2-dd55-44d3-873e-c3c297da1337_1001x521.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:521,&quot;width&quot;:1001,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_RZY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0308ce2-dd55-44d3-873e-c3c297da1337_1001x521.png 424w, https://substackcdn.com/image/fetch/$s_!_RZY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0308ce2-dd55-44d3-873e-c3c297da1337_1001x521.png 848w, https://substackcdn.com/image/fetch/$s_!_RZY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0308ce2-dd55-44d3-873e-c3c297da1337_1001x521.png 1272w, https://substackcdn.com/image/fetch/$s_!_RZY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0308ce2-dd55-44d3-873e-c3c297da1337_1001x521.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Therefore, I don&#8217;t have a satisfactory answer about Strix Halo&#8217;s cross-die latency. Latency may indeed be lower at die boundaries. But everything past that boundary has higher latency compared to other client systems, making any advantage invisible to software.</p><h1>Balancing CPU and GPU Bandwidth Demands</h1><p>Sharing a memory controller across the CPU and GPU comes with advantages, like making zero-copy behavior more natural to pull off. But it comes with challenges too. CPU and GPU memory requests can contend with each other for DRAM access. Contention surfaces as higher latency. From Zen 4 onward, AMD&#8217;s L3 performance monitoring unit (PMU) can measure average latency in nanoseconds for requests external to the core cluster. PMU data isn&#8217;t directly comparable to software measurements, because it only accounts for latency after the point of a L3 miss. But it is consistent in slightly under-estimating software observed latency when running a simple latency microbenchmark. When gaming, I typically see low CPU bandwidth demands and <a href="https://chipsandcheese.com/p/running-gaming-workloads-through">correspondingly mild latency increases over the baseline</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SJbO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4b7653a-d997-410d-90a0-f46f1b0fdb39_1479x1021.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SJbO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4b7653a-d997-410d-90a0-f46f1b0fdb39_1479x1021.png 424w, https://substackcdn.com/image/fetch/$s_!SJbO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4b7653a-d997-410d-90a0-f46f1b0fdb39_1479x1021.png 848w, https://substackcdn.com/image/fetch/$s_!SJbO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4b7653a-d997-410d-90a0-f46f1b0fdb39_1479x1021.png 1272w, https://substackcdn.com/image/fetch/$s_!SJbO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4b7653a-d997-410d-90a0-f46f1b0fdb39_1479x1021.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SJbO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4b7653a-d997-410d-90a0-f46f1b0fdb39_1479x1021.png" width="1456" height="1005" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b4b7653a-d997-410d-90a0-f46f1b0fdb39_1479x1021.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1005,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SJbO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4b7653a-d997-410d-90a0-f46f1b0fdb39_1479x1021.png 424w, https://substackcdn.com/image/fetch/$s_!SJbO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4b7653a-d997-410d-90a0-f46f1b0fdb39_1479x1021.png 848w, https://substackcdn.com/image/fetch/$s_!SJbO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4b7653a-d997-410d-90a0-f46f1b0fdb39_1479x1021.png 1272w, https://substackcdn.com/image/fetch/$s_!SJbO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4b7653a-d997-410d-90a0-f46f1b0fdb39_1479x1021.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The same doesn&#8217;t hold true when gaming on Strix Halo&#8217;s integrated GPU. Latency rises far above the baseline of around 140 ns. I logged average latency over 1 second intervals, and many of those intervals saw latency figures around 200 ns across several games</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!52q5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b90f48-f990-4e29-808f-cc7fbe023549_1572x767.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!52q5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b90f48-f990-4e29-808f-cc7fbe023549_1572x767.png 424w, https://substackcdn.com/image/fetch/$s_!52q5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b90f48-f990-4e29-808f-cc7fbe023549_1572x767.png 848w, https://substackcdn.com/image/fetch/$s_!52q5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b90f48-f990-4e29-808f-cc7fbe023549_1572x767.png 1272w, https://substackcdn.com/image/fetch/$s_!52q5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b90f48-f990-4e29-808f-cc7fbe023549_1572x767.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!52q5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b90f48-f990-4e29-808f-cc7fbe023549_1572x767.png" width="1456" height="710" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c8b90f48-f990-4e29-808f-cc7fbe023549_1572x767.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:710,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!52q5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b90f48-f990-4e29-808f-cc7fbe023549_1572x767.png 424w, https://substackcdn.com/image/fetch/$s_!52q5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b90f48-f990-4e29-808f-cc7fbe023549_1572x767.png 848w, https://substackcdn.com/image/fetch/$s_!52q5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b90f48-f990-4e29-808f-cc7fbe023549_1572x767.png 1272w, https://substackcdn.com/image/fetch/$s_!52q5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b90f48-f990-4e29-808f-cc7fbe023549_1572x767.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I wrote a microbenchmark to investigate how CPU memory latency is impacted by GPU-sde bandwidth load. As with the CPU loaded latency test, I run a latency test thread on a CPU core. But instead of using a read-only pattern, I do a standard C=A+B computation across large arrays on the GPU. To control GPU bandwidth load, I can have each OpenCL kernel invocation do more math with A and B before writing the result to C. Results show increased latency at higher GPU bandwidth demands. Other recent iGPUs show similar behavior.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6N1M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7494f44-fe40-493d-8e44-686407e80804_1600x739.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6N1M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7494f44-fe40-493d-8e44-686407e80804_1600x739.png 424w, https://substackcdn.com/image/fetch/$s_!6N1M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7494f44-fe40-493d-8e44-686407e80804_1600x739.png 848w, https://substackcdn.com/image/fetch/$s_!6N1M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7494f44-fe40-493d-8e44-686407e80804_1600x739.png 1272w, https://substackcdn.com/image/fetch/$s_!6N1M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7494f44-fe40-493d-8e44-686407e80804_1600x739.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6N1M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7494f44-fe40-493d-8e44-686407e80804_1600x739.png" width="1456" height="672" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d7494f44-fe40-493d-8e44-686407e80804_1600x739.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:672,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6N1M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7494f44-fe40-493d-8e44-686407e80804_1600x739.png 424w, https://substackcdn.com/image/fetch/$s_!6N1M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7494f44-fe40-493d-8e44-686407e80804_1600x739.png 848w, https://substackcdn.com/image/fetch/$s_!6N1M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7494f44-fe40-493d-8e44-686407e80804_1600x739.png 1272w, https://substackcdn.com/image/fetch/$s_!6N1M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7494f44-fe40-493d-8e44-686407e80804_1600x739.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In-game CPU bandwidth demands are low, but not as low as a simple latency test. I tried running a couple of read bandwidth threads on top of the test above. Strix Halo seems to let its GPU squeeze out the CPU when under extreme bandwidth demands. Latency suffers, passing 300 ns at one point.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Euuc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf3465a5-0088-414f-921b-d520c714224a_1600x739.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Euuc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf3465a5-0088-414f-921b-d520c714224a_1600x739.png 424w, https://substackcdn.com/image/fetch/$s_!Euuc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf3465a5-0088-414f-921b-d520c714224a_1600x739.png 848w, https://substackcdn.com/image/fetch/$s_!Euuc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf3465a5-0088-414f-921b-d520c714224a_1600x739.png 1272w, https://substackcdn.com/image/fetch/$s_!Euuc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf3465a5-0088-414f-921b-d520c714224a_1600x739.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Euuc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf3465a5-0088-414f-921b-d520c714224a_1600x739.png" width="1456" height="672" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/df3465a5-0088-414f-921b-d520c714224a_1600x739.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:672,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Euuc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf3465a5-0088-414f-921b-d520c714224a_1600x739.png 424w, https://substackcdn.com/image/fetch/$s_!Euuc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf3465a5-0088-414f-921b-d520c714224a_1600x739.png 848w, https://substackcdn.com/image/fetch/$s_!Euuc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf3465a5-0088-414f-921b-d520c714224a_1600x739.png 1272w, https://substackcdn.com/image/fetch/$s_!Euuc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf3465a5-0088-414f-921b-d520c714224a_1600x739.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Plotting L3 and memory controller PMU data with 1 second intervals helps capture the relationship between latency and bandwidth usage in more complex workloads. The points don&#8217;t track well with microbenchmark data collected with a single CPU-side latency test thread. Perhaps there&#8217;s enough CPU-side bandwidth demand to cause contention at both the die-to-die interface and the memory controllers. Or maybe, CPU and GPU bandwidth spikes tend to line up within those 1 second intervals. Whatever the case, PMU data highlights how Strix Halo&#8217;s CPU cores need high cache hitrates more than their desktop counterparts.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Rkus!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F731b2da5-b17a-4b66-8c98-3724f64c5c9e_1076x591.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Rkus!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F731b2da5-b17a-4b66-8c98-3724f64c5c9e_1076x591.png 424w, https://substackcdn.com/image/fetch/$s_!Rkus!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F731b2da5-b17a-4b66-8c98-3724f64c5c9e_1076x591.png 848w, https://substackcdn.com/image/fetch/$s_!Rkus!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F731b2da5-b17a-4b66-8c98-3724f64c5c9e_1076x591.png 1272w, https://substackcdn.com/image/fetch/$s_!Rkus!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F731b2da5-b17a-4b66-8c98-3724f64c5c9e_1076x591.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Rkus!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F731b2da5-b17a-4b66-8c98-3724f64c5c9e_1076x591.png" width="1076" height="591" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/731b2da5-b17a-4b66-8c98-3724f64c5c9e_1076x591.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:591,&quot;width&quot;:1076,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Rkus!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F731b2da5-b17a-4b66-8c98-3724f64c5c9e_1076x591.png 424w, https://substackcdn.com/image/fetch/$s_!Rkus!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F731b2da5-b17a-4b66-8c98-3724f64c5c9e_1076x591.png 848w, https://substackcdn.com/image/fetch/$s_!Rkus!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F731b2da5-b17a-4b66-8c98-3724f64c5c9e_1076x591.png 1272w, https://substackcdn.com/image/fetch/$s_!Rkus!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F731b2da5-b17a-4b66-8c98-3724f64c5c9e_1076x591.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Cyberpunk 2077&#8217;s built-in benchmark is largely CPU bound when run at 1080P with medium settings and no upscaling. I used Intel&#8217;s Arc B580 on desktop systems, since it has vaguely similar compute power to Strix Halo&#8217;s iGPU. Results show a large gap between Strix Halo and AMD&#8217;s desktop platform, even though both use the same Zen 5 cores.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Zj6i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffdd6560-4a4d-4e45-a46a-51ed8f70fcff_1600x857.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Zj6i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffdd6560-4a4d-4e45-a46a-51ed8f70fcff_1600x857.png 424w, https://substackcdn.com/image/fetch/$s_!Zj6i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffdd6560-4a4d-4e45-a46a-51ed8f70fcff_1600x857.png 848w, https://substackcdn.com/image/fetch/$s_!Zj6i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffdd6560-4a4d-4e45-a46a-51ed8f70fcff_1600x857.png 1272w, https://substackcdn.com/image/fetch/$s_!Zj6i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffdd6560-4a4d-4e45-a46a-51ed8f70fcff_1600x857.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Zj6i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffdd6560-4a4d-4e45-a46a-51ed8f70fcff_1600x857.png" width="1456" height="780" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ffdd6560-4a4d-4e45-a46a-51ed8f70fcff_1600x857.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:780,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Zj6i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffdd6560-4a4d-4e45-a46a-51ed8f70fcff_1600x857.png 424w, https://substackcdn.com/image/fetch/$s_!Zj6i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffdd6560-4a4d-4e45-a46a-51ed8f70fcff_1600x857.png 848w, https://substackcdn.com/image/fetch/$s_!Zj6i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffdd6560-4a4d-4e45-a46a-51ed8f70fcff_1600x857.png 1272w, https://substackcdn.com/image/fetch/$s_!Zj6i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffdd6560-4a4d-4e45-a46a-51ed8f70fcff_1600x857.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Memory latency under load is largely not a problem with CPU-only workloads, even when considering heavily multithreaded ones. Total bandwidth demands are much lower and actually well within the capabilities of a 128-bit DDR5 setup. That explains why AMD was able to take on quad channel HEDT parts using a desktop dual channel platform back in the Zen 2 days. Good caching likely played a role, and Strix Halo continues to have 4 MB of last level cache per core. PMU data from Cinebench, code compilation, and AV1 video encoding loosely align with microbenchmark results. Latency barely strays above the baseline. Y-Cruncher is an exception. It&#8217;s very bandwidth hungry and not cache friendly. Its bandwidth demands are several times higher, and often go beyond a dual channel DDR5-5600 setup&#8217;s capabilities. Strix Halo is a good choice for that type of workload. But in the client space, bandwidth hungry CPU applications tend to be exceptions.</p><h1>Final Words: A GPU with an Integrated CPU?</h1><p>Observations above suggest Strix Halo&#8217;s Infinity Fabric and DRAM setup focuses on feeding the GPU and as a result the CPU gets the short end of the stick. High Infinity Fabric endpoint count and a wide LPDDR5X bus provide high bandwidth at high latency. CPU workloads tend to be latency sensitive and contention can make that even worse.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6z0B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0421d907-b5ef-4b1b-8435-fef2ade2a74f_1600x736.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6z0B!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0421d907-b5ef-4b1b-8435-fef2ade2a74f_1600x736.png 424w, https://substackcdn.com/image/fetch/$s_!6z0B!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0421d907-b5ef-4b1b-8435-fef2ade2a74f_1600x736.png 848w, https://substackcdn.com/image/fetch/$s_!6z0B!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0421d907-b5ef-4b1b-8435-fef2ade2a74f_1600x736.png 1272w, https://substackcdn.com/image/fetch/$s_!6z0B!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0421d907-b5ef-4b1b-8435-fef2ade2a74f_1600x736.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6z0B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0421d907-b5ef-4b1b-8435-fef2ade2a74f_1600x736.png" width="1456" height="670" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0421d907-b5ef-4b1b-8435-fef2ade2a74f_1600x736.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:670,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6z0B!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0421d907-b5ef-4b1b-8435-fef2ade2a74f_1600x736.png 424w, https://substackcdn.com/image/fetch/$s_!6z0B!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0421d907-b5ef-4b1b-8435-fef2ade2a74f_1600x736.png 848w, https://substackcdn.com/image/fetch/$s_!6z0B!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0421d907-b5ef-4b1b-8435-fef2ade2a74f_1600x736.png 1272w, https://substackcdn.com/image/fetch/$s_!6z0B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0421d907-b5ef-4b1b-8435-fef2ade2a74f_1600x736.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Strix Halo shows AMD can move hundreds of gigabytes per second over Infinity Fabric within mobile power budgets. It&#8217;s impressive in that respect, but CPU-side latency is high</figcaption></figure></div><p>Other aspects of the memory subsystem de-prioritize the CPU as well. CPU accesses don&#8217;t fill into that cache, but<a href="https://chipsandcheese.com/p/amds-strix-halo-under-the-hood"> still do a lookup</a> likely to maintain cache coherency with the GPU. That cache lookup at the very least costs power and might add latency, even though it&#8217;ll almost never result in a hit. Lack of GMI-Wide style bandwidth is another example.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QzcR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4eb2142-5196-4b55-a64f-370c528d122e_1600x1273.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QzcR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4eb2142-5196-4b55-a64f-370c528d122e_1600x1273.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QzcR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4eb2142-5196-4b55-a64f-370c528d122e_1600x1273.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QzcR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4eb2142-5196-4b55-a64f-370c528d122e_1600x1273.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QzcR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4eb2142-5196-4b55-a64f-370c528d122e_1600x1273.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QzcR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4eb2142-5196-4b55-a64f-370c528d122e_1600x1273.jpeg" width="1456" height="1158" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e4eb2142-5196-4b55-a64f-370c528d122e_1600x1273.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1158,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QzcR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4eb2142-5196-4b55-a64f-370c528d122e_1600x1273.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QzcR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4eb2142-5196-4b55-a64f-370c528d122e_1600x1273.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QzcR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4eb2142-5196-4b55-a64f-370c528d122e_1600x1273.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QzcR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4eb2142-5196-4b55-a64f-370c528d122e_1600x1273.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">ASUS&#8217;s ROG Flow Z13 placed next to the original Surface Pro for comparison. A hypothetical larger iGPU would be difficult to accommodate in such a form factor, and would face stiff competition from discrete GPU setups</figcaption></figure></div><p>AMD&#8217;s decisions are understandable. Most client workloads have light bandwidth requirements. Strix Halo&#8217;s memory system design lets it perform well in portable gaming devices like the ROG Flow Z13. But it does make tradeoffs. And extrapolating from those tradeoffs suggests iGPU designs will face steeper challenges at higher performance tiers.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4OuG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c244480-b997-4180-9c1e-af3a2559f365_601x216.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4OuG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c244480-b997-4180-9c1e-af3a2559f365_601x216.png 424w, https://substackcdn.com/image/fetch/$s_!4OuG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c244480-b997-4180-9c1e-af3a2559f365_601x216.png 848w, https://substackcdn.com/image/fetch/$s_!4OuG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c244480-b997-4180-9c1e-af3a2559f365_601x216.png 1272w, https://substackcdn.com/image/fetch/$s_!4OuG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c244480-b997-4180-9c1e-af3a2559f365_601x216.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4OuG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c244480-b997-4180-9c1e-af3a2559f365_601x216.png" width="601" height="216" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6c244480-b997-4180-9c1e-af3a2559f365_601x216.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:216,&quot;width&quot;:601,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4OuG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c244480-b997-4180-9c1e-af3a2559f365_601x216.png 424w, https://substackcdn.com/image/fetch/$s_!4OuG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c244480-b997-4180-9c1e-af3a2559f365_601x216.png 848w, https://substackcdn.com/image/fetch/$s_!4OuG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c244480-b997-4180-9c1e-af3a2559f365_601x216.png 1272w, https://substackcdn.com/image/fetch/$s_!4OuG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c244480-b997-4180-9c1e-af3a2559f365_601x216.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>For its part, Strix Halo strikes a good balance. It enjoys iGPU advantages without being large enough for the disadvantages to hurt. I hope AMD continues to target Strix Halo&#8217;s market segment with updated designs, and look forward to seeing where they go next.</p><p>If you like the content then consider heading over to the <a href="https://www.patreon.com/ChipsandCheese">Patreon</a> or <a href="https://www.paypal.com/donate/?hosted_button_id=4EMPH66SBGVSQ">PayPal</a> if you want to toss a few bucks to Chips and Cheese. Also consider joining the <a href="https://discord.gg/TwVnRhxgY2">Discord</a>.</p>]]></content:encoded></item><item><title><![CDATA[Evaluating the Infinity Cache in AMD Strix Halo]]></title><description><![CDATA[Strix Halo is the codename for AMD&#8217;s highest end mobile chip, which is used in the Ryzen AI MAX series.]]></description><link>https://chipsandcheese.com/p/evaluating-the-infinity-cache-in</link><guid isPermaLink="false">https://chipsandcheese.com/p/evaluating-the-infinity-cache-in</guid><dc:creator><![CDATA[Chester Lam]]></dc:creator><pubDate>Wed, 22 Oct 2025 03:58:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Il1z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdb0948-6a82-4441-927a-4dcdce851029_847x714.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Strix Halo is the codename for AMD&#8217;s highest end mobile chip, which is used in the Ryzen AI MAX series. It combines a powerful CPU with 16 Zen 5 cores and a large GPU with 20 RDNA 3.5 Workgroup Processors (WGPs). The sizeable iGPU makes Strix Halo particularly interesting because GPUs have high bandwidth requirements. Strix Halo tackles that with a 256-bit LPDDR5X-8000 setup combined with 32 MB of memory side cache. The latter is often referred to as Infinity Cache, or MALL (Memory Attached Last Level). I&#8217;ll refer to it as Infinity Cache for brevity.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Il1z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdb0948-6a82-4441-927a-4dcdce851029_847x714.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Il1z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdb0948-6a82-4441-927a-4dcdce851029_847x714.png 424w, https://substackcdn.com/image/fetch/$s_!Il1z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdb0948-6a82-4441-927a-4dcdce851029_847x714.png 848w, https://substackcdn.com/image/fetch/$s_!Il1z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdb0948-6a82-4441-927a-4dcdce851029_847x714.png 1272w, https://substackcdn.com/image/fetch/$s_!Il1z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdb0948-6a82-4441-927a-4dcdce851029_847x714.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Il1z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdb0948-6a82-4441-927a-4dcdce851029_847x714.png" width="847" height="714" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7bdb0948-6a82-4441-927a-4dcdce851029_847x714.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:714,&quot;width&quot;:847,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Il1z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdb0948-6a82-4441-927a-4dcdce851029_847x714.png 424w, https://substackcdn.com/image/fetch/$s_!Il1z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdb0948-6a82-4441-927a-4dcdce851029_847x714.png 848w, https://substackcdn.com/image/fetch/$s_!Il1z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdb0948-6a82-4441-927a-4dcdce851029_847x714.png 1272w, https://substackcdn.com/image/fetch/$s_!Il1z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bdb0948-6a82-4441-927a-4dcdce851029_847x714.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Infinity Cache has been around since RDNA2 in AMD&#8217;s discrete consumer GPU lineup, where it helped AMD hit high performance targets with lower DRAM bandwidth requirements. However, Infinity Cache&#8217;s efficacy has so far been difficult for me to evaluate. AMD&#8217;s discrete GPUs have performance monitoring facilities accessible through AMD&#8217;s developer tools. But those tools stop providing information past L2. Strix Halo stands out because it has an Infinity Cache implementation, and all the accessible performance monitoring features typical of a recent AMD GPU. That includes programmable performance counters at Infinity Fabric and memory controllers. It&#8217;s an opportunity to finally get insight into how well AMD&#8217;s Infinity Cache does its job in various graphics workloads.</p><h1>Acknowledgements</h1><p>Special thanks goes out to ASUS for sampling their ROG Flow Z13. This device implements AMD&#8217;s Ryzen AI MAX+ 395 with 32 GB of LPDDR5X in a thin and light form factor. It superficially represents a convertible tablet from Microsoft&#8217;s Surface line, and is remarkably portable for a device with gaming credentials. Without ASUS&#8217;s help, this article wouldn&#8217;t have been possible.</p><h1>Infinity Fabric, Performance Monitoring, and Theory</h1><p>AMD&#8217;s Infinity Fabric aims to abstract away details of how data travels across the chip. It does so by providing endpoints with well defined interfaces to let blocks make or handle memory requests. Infinity Fabric also provides a set of programmable performance counters. AMD documents a single DATA_BW performance event that counts data beats at endpoints. DATA_BW targets an endpoint via its 8-bit instance ID, and can count either reads or writes. AMD never documented Infinity Fabric instance IDs for Strix Halo. So, I did some guessing by generating traffic at various blocks and observing bandwidth counts at all possible instance IDs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cH9c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7e404-6987-45a7-8816-916580073352_959x411.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cH9c!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7e404-6987-45a7-8816-916580073352_959x411.png 424w, https://substackcdn.com/image/fetch/$s_!cH9c!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7e404-6987-45a7-8816-916580073352_959x411.png 848w, https://substackcdn.com/image/fetch/$s_!cH9c!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7e404-6987-45a7-8816-916580073352_959x411.png 1272w, https://substackcdn.com/image/fetch/$s_!cH9c!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7e404-6987-45a7-8816-916580073352_959x411.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cH9c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7e404-6987-45a7-8816-916580073352_959x411.png" width="959" height="411" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b9c7e404-6987-45a7-8816-916580073352_959x411.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:411,&quot;width&quot;:959,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cH9c!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7e404-6987-45a7-8816-916580073352_959x411.png 424w, https://substackcdn.com/image/fetch/$s_!cH9c!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7e404-6987-45a7-8816-916580073352_959x411.png 848w, https://substackcdn.com/image/fetch/$s_!cH9c!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7e404-6987-45a7-8816-916580073352_959x411.png 1272w, https://substackcdn.com/image/fetch/$s_!cH9c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7e404-6987-45a7-8816-916580073352_959x411.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Instance IDs start from the Coherent Stations (CS-es), just like on server platforms. CS blocks sit in front of memory controllers and ensure cache coherency by probing another block if it might have a modified copy of the requested cacheline. If it doesn&#8217;t, which is most of the time, the CS will pass the request on to its attached Unified Memory Controller (UMC). Because CS blocks observe all requests to physical memory backed by DRAM, it&#8217;s a logical place to implement a memory side cache. That&#8217;s exactly what AMD does on chips with Infinity Cache. Cache hits let the CS avoid going to the UMC.</p><p>Strix Halo has 16 memory controllers and CS instances, each handling a 16-bit LPDDR5X channel. The GPU occupies the next eight instance IDs, suggesting it has a wide interface to Infinity Fabric. CPU core clusters come next. Each octa-core Zen 5 Core Complex (CCX) connects to Infinity Fabric via one endpoint. Miscellaneous blocks follow. These include the NPU, media engine, the display engine, and a mystery.</p><h2>Setting up Performance Monitoring</h2><p>Cache misses usually cause a request to the next level in the memory hierarchy, while cache hits do not. Therefore, my idea is to compare traffic levels at the CS and UMC levels. Traffic that shows up at the CS but not at the UMCs can be used as a proxy for Infinity Cache hits. There are a few problems with that approach though.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FxOO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07bf475d-4ec5-4a2b-b748-bb0094222fc8_1101x358.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FxOO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07bf475d-4ec5-4a2b-b748-bb0094222fc8_1101x358.png 424w, https://substackcdn.com/image/fetch/$s_!FxOO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07bf475d-4ec5-4a2b-b748-bb0094222fc8_1101x358.png 848w, https://substackcdn.com/image/fetch/$s_!FxOO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07bf475d-4ec5-4a2b-b748-bb0094222fc8_1101x358.png 1272w, https://substackcdn.com/image/fetch/$s_!FxOO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07bf475d-4ec5-4a2b-b748-bb0094222fc8_1101x358.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FxOO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07bf475d-4ec5-4a2b-b748-bb0094222fc8_1101x358.png" width="1101" height="358" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/07bf475d-4ec5-4a2b-b748-bb0094222fc8_1101x358.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:358,&quot;width&quot;:1101,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FxOO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07bf475d-4ec5-4a2b-b748-bb0094222fc8_1101x358.png 424w, https://substackcdn.com/image/fetch/$s_!FxOO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07bf475d-4ec5-4a2b-b748-bb0094222fc8_1101x358.png 848w, https://substackcdn.com/image/fetch/$s_!FxOO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07bf475d-4ec5-4a2b-b748-bb0094222fc8_1101x358.png 1272w, https://substackcdn.com/image/fetch/$s_!FxOO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07bf475d-4ec5-4a2b-b748-bb0094222fc8_1101x358.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>First, Strix Halo only provides eight Infinity Fabric performance counters. I have to use two counters per endpoint to count both read and write data beats, so I can only monitor four CS-es simultaneously. Memory traffic is interleaved across channels, so taking bandwidth observed at four CS-es and multiplying by four should give a reasonably accurate estimate of overall bandwidth. But interleaving isn&#8217;t perfectly even in practice, so I expect a few percentage points of error. The UMC situation is easier. Each UMC has its own set of four performance counters, letting me monitor all of them at the same time. I&#8217;m using one counter per UMC to count Column Address Strobe (CAS) commands. Other available UMC events allow monitoring memory controller frequency, bus utilization, and ACTIVATE or PRECHARGE commands. But that stuff goes beyond what I want to focus on here, and collecting more data points would make for annoyingly large spreadsheets.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HXhV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2caf6cd2-64db-48f1-bd75-d9fbf4cd3632_800x216.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HXhV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2caf6cd2-64db-48f1-bd75-d9fbf4cd3632_800x216.png 424w, https://substackcdn.com/image/fetch/$s_!HXhV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2caf6cd2-64db-48f1-bd75-d9fbf4cd3632_800x216.png 848w, https://substackcdn.com/image/fetch/$s_!HXhV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2caf6cd2-64db-48f1-bd75-d9fbf4cd3632_800x216.png 1272w, https://substackcdn.com/image/fetch/$s_!HXhV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2caf6cd2-64db-48f1-bd75-d9fbf4cd3632_800x216.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HXhV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2caf6cd2-64db-48f1-bd75-d9fbf4cd3632_800x216.png" width="800" height="216" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2caf6cd2-64db-48f1-bd75-d9fbf4cd3632_800x216.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:216,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HXhV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2caf6cd2-64db-48f1-bd75-d9fbf4cd3632_800x216.png 424w, https://substackcdn.com/image/fetch/$s_!HXhV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2caf6cd2-64db-48f1-bd75-d9fbf4cd3632_800x216.png 848w, https://substackcdn.com/image/fetch/$s_!HXhV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2caf6cd2-64db-48f1-bd75-d9fbf4cd3632_800x216.png 1272w, https://substackcdn.com/image/fetch/$s_!HXhV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2caf6cd2-64db-48f1-bd75-d9fbf4cd3632_800x216.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Cross-CCX traffic presents a second source of error as mentioned above. If a CS sends a probe that hits a modified line, it won&#8217;t send a request to the UMC. The request is still being satisfied out of cache, just not the Infinity Cache that I&#8217;m interested in. I expect this to be rare because cross-CCX traffic in general is usually low compared to requests satisfied from DRAM. It&#8217;ll be especially rare in the graphics workloads I&#8217;m targeting here because Strix Halo&#8217;s second CCD tends to be parked in Windows.</p><p>CPU-side traffic in general presents a more significant issue. Strix Halo&#8217;s Infinity Cache narrowly targets the GPU, and <a href="https://chipsandcheese.com/p/amds-strix-halo-under-the-hood">only GPU-side memory requests cause cache fills</a>. CPU memory accesses will be counted as misses in a hitrate calculation. That may be technically correct, but misses the point of Infinity Cache. Most memory traffic comes from the GPU, but there&#8217;s enough CPU-side traffic to create a bit more of an error bar than I&#8217;d like. Therefore, I want to focus on whether Infinity Cache is handling enough traffic to avoid hitting DRAM bandwidth bottlenecks.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aNVH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7b48d2a-ec15-44ef-bfba-cd5edcffd261_1412x451.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aNVH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7b48d2a-ec15-44ef-bfba-cd5edcffd261_1412x451.png 424w, https://substackcdn.com/image/fetch/$s_!aNVH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7b48d2a-ec15-44ef-bfba-cd5edcffd261_1412x451.png 848w, https://substackcdn.com/image/fetch/$s_!aNVH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7b48d2a-ec15-44ef-bfba-cd5edcffd261_1412x451.png 1272w, https://substackcdn.com/image/fetch/$s_!aNVH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7b48d2a-ec15-44ef-bfba-cd5edcffd261_1412x451.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aNVH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7b48d2a-ec15-44ef-bfba-cd5edcffd261_1412x451.png" width="1412" height="451" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b7b48d2a-ec15-44ef-bfba-cd5edcffd261_1412x451.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:451,&quot;width&quot;:1412,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aNVH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7b48d2a-ec15-44ef-bfba-cd5edcffd261_1412x451.png 424w, https://substackcdn.com/image/fetch/$s_!aNVH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7b48d2a-ec15-44ef-bfba-cd5edcffd261_1412x451.png 848w, https://substackcdn.com/image/fetch/$s_!aNVH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7b48d2a-ec15-44ef-bfba-cd5edcffd261_1412x451.png 1272w, https://substackcdn.com/image/fetch/$s_!aNVH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7b48d2a-ec15-44ef-bfba-cd5edcffd261_1412x451.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Traffic captured at the CCD endpoints and two of the GPU&#8217;s (multiplied by 4)</figcaption></figure></div><p>A final limitation is that I&#8217;m sampling performance counters with a tool I wrote years ago. I wrote it so I could look at hardware performance stats just like how I like looking at Task Manager or other monitoring utilities to see how my system is doing. Because I&#8217;m updating a graphical interface, I sample data every second and thus have lower resolution compared to some vendor tools. Those can often sample at millisecond granularity or better. That opens the window to underestimating bandwidth demands if a bandwidth spike only occurs over a small fraction of a second. In theory I could write another tool. But Chips and Cheese is an unpaid hobby project that has to be balanced with a separate full time job as well as other free time interests. Quickly rigging an existing project to do my bidding makes more efficient use of time, because there&#8217;s never enough time to do everything I want.</p><h1>Results</h1><p>I logged data over some arbitrarily selected graphics workloads, then selected the 1-second interval with the highest DRAM bandwidth usage. That should give an idea of whether Strix Halo comes close to reaching DRAM bandwidth limits. Overall, the 32 MB cache seems to do its job. Strix Halo is able to stay well clear of the 256 GB/s theoretical bandwidth limit that its LPDDR5X-8000 setup can deliver.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!C1NN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e44e20-a4f7-46ad-9eff-3849449cf6d6_1254x765.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C1NN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e44e20-a4f7-46ad-9eff-3849449cf6d6_1254x765.png 424w, https://substackcdn.com/image/fetch/$s_!C1NN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e44e20-a4f7-46ad-9eff-3849449cf6d6_1254x765.png 848w, https://substackcdn.com/image/fetch/$s_!C1NN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e44e20-a4f7-46ad-9eff-3849449cf6d6_1254x765.png 1272w, https://substackcdn.com/image/fetch/$s_!C1NN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e44e20-a4f7-46ad-9eff-3849449cf6d6_1254x765.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C1NN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e44e20-a4f7-46ad-9eff-3849449cf6d6_1254x765.png" width="1254" height="765" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49e44e20-a4f7-46ad-9eff-3849449cf6d6_1254x765.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:765,&quot;width&quot;:1254,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!C1NN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e44e20-a4f7-46ad-9eff-3849449cf6d6_1254x765.png 424w, https://substackcdn.com/image/fetch/$s_!C1NN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e44e20-a4f7-46ad-9eff-3849449cf6d6_1254x765.png 848w, https://substackcdn.com/image/fetch/$s_!C1NN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e44e20-a4f7-46ad-9eff-3849449cf6d6_1254x765.png 1272w, https://substackcdn.com/image/fetch/$s_!C1NN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49e44e20-a4f7-46ad-9eff-3849449cf6d6_1254x765.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Approximate CS-side bandwidth demands do indicate several workloads can push uncomfortably close to 256 GB/s. A workload doesn&#8217;t necessarily have to get right up to bandwidth limits for memory-related performance issues to start cropping up. Under high bandwidth demand, requests can start to pile up in various queues and end-to-end memory latency can increase. GHPC and Ungine Valley fall into that category. 3DMark Time Spy Extreme would definitely be DRAM bandwidth bound without the memory side cache.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1IAO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c604135-f6ac-4107-bb9b-0ec6275210ee_1258x759.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1IAO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c604135-f6ac-4107-bb9b-0ec6275210ee_1258x759.png 424w, https://substackcdn.com/image/fetch/$s_!1IAO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c604135-f6ac-4107-bb9b-0ec6275210ee_1258x759.png 848w, https://substackcdn.com/image/fetch/$s_!1IAO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c604135-f6ac-4107-bb9b-0ec6275210ee_1258x759.png 1272w, https://substackcdn.com/image/fetch/$s_!1IAO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c604135-f6ac-4107-bb9b-0ec6275210ee_1258x759.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1IAO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c604135-f6ac-4107-bb9b-0ec6275210ee_1258x759.png" width="1258" height="759" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c604135-f6ac-4107-bb9b-0ec6275210ee_1258x759.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:759,&quot;width&quot;:1258,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1IAO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c604135-f6ac-4107-bb9b-0ec6275210ee_1258x759.png 424w, https://substackcdn.com/image/fetch/$s_!1IAO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c604135-f6ac-4107-bb9b-0ec6275210ee_1258x759.png 848w, https://substackcdn.com/image/fetch/$s_!1IAO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c604135-f6ac-4107-bb9b-0ec6275210ee_1258x759.png 1272w, https://substackcdn.com/image/fetch/$s_!1IAO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c604135-f6ac-4107-bb9b-0ec6275210ee_1258x759.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Picking an interval with maximum bandwidth demands at the CS gives an idea of how much bandwidth Strix Halo&#8217;s GPU can demand. Strix Halo has the same 2 MB of graphics L2 cache that AMD&#8217;s older 7000 series &#8220;Phoenix&#8221; mobile chip had, despite more than doubling GPU compute throughput. Unsurprisingly, the GPU can draw a lot of bandwidth across Infinity Fabric. 3DMark Time Spy again stands out. If AMD wanted to simply scale up DRAM bandwidth without spending die area on cache, they would need well over 335 GB/s from DRAM.</p><p>Curiously, Digital Foundry notes that Strix Halo has very similar graphics performance to the PS5. The PS5 uses a 256-bit, 14 GT/s GDDR6 setup that&#8217;s good for 448 GB/s of theoretical bandwidth, and doesn&#8217;t have a memory side cache like Strix Halo. 448 GB/s looks just about adequate for satisfying Time Spy Extreme&#8217;s bandwidth needs, if just barely. If Strix Halo didn&#8217;t need to work in power constrained mobile devices, and didn&#8217;t need high memory capacity for multitasking, perhaps AMD could have considered a GDDR6 setup. To bring things back around to Infinity Cache, it seems to do very well at that interval above. It captures approximately 73% of memory traffic going through Infinity Fabric, which is good even from a hitrate point of view.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yi5W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5319a316-0b50-475c-8e70-29c8fa1fe6c5_1520x761.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yi5W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5319a316-0b50-475c-8e70-29c8fa1fe6c5_1520x761.png 424w, https://substackcdn.com/image/fetch/$s_!yi5W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5319a316-0b50-475c-8e70-29c8fa1fe6c5_1520x761.png 848w, https://substackcdn.com/image/fetch/$s_!yi5W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5319a316-0b50-475c-8e70-29c8fa1fe6c5_1520x761.png 1272w, https://substackcdn.com/image/fetch/$s_!yi5W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5319a316-0b50-475c-8e70-29c8fa1fe6c5_1520x761.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yi5W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5319a316-0b50-475c-8e70-29c8fa1fe6c5_1520x761.png" width="1456" height="729" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5319a316-0b50-475c-8e70-29c8fa1fe6c5_1520x761.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:729,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yi5W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5319a316-0b50-475c-8e70-29c8fa1fe6c5_1520x761.png 424w, https://substackcdn.com/image/fetch/$s_!yi5W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5319a316-0b50-475c-8e70-29c8fa1fe6c5_1520x761.png 848w, https://substackcdn.com/image/fetch/$s_!yi5W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5319a316-0b50-475c-8e70-29c8fa1fe6c5_1520x761.png 1272w, https://substackcdn.com/image/fetch/$s_!yi5W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5319a316-0b50-475c-8e70-29c8fa1fe6c5_1520x761.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The two bar graphs above already hint at how bandwidth demands and cache hitrates vary across workloads. Those figures vary within a workload as well, though plotting all of the logged data would be excessive. Variation both across and within workloads makes it extremely difficult to summarize cache efficacy with a single hitrate figure. Plotting the percentage of traffic at the CS not observed at the UMCs as a proxy for hitrate further emphasizes that point.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wEdj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487baf64-b146-4177-b647-e92b33c1cef3_1515x766.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wEdj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487baf64-b146-4177-b647-e92b33c1cef3_1515x766.png 424w, https://substackcdn.com/image/fetch/$s_!wEdj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487baf64-b146-4177-b647-e92b33c1cef3_1515x766.png 848w, https://substackcdn.com/image/fetch/$s_!wEdj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487baf64-b146-4177-b647-e92b33c1cef3_1515x766.png 1272w, https://substackcdn.com/image/fetch/$s_!wEdj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487baf64-b146-4177-b647-e92b33c1cef3_1515x766.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wEdj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487baf64-b146-4177-b647-e92b33c1cef3_1515x766.png" width="1456" height="736" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/487baf64-b146-4177-b647-e92b33c1cef3_1515x766.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:736,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wEdj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487baf64-b146-4177-b647-e92b33c1cef3_1515x766.png 424w, https://substackcdn.com/image/fetch/$s_!wEdj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487baf64-b146-4177-b647-e92b33c1cef3_1515x766.png 848w, https://substackcdn.com/image/fetch/$s_!wEdj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487baf64-b146-4177-b647-e92b33c1cef3_1515x766.png 1272w, https://substackcdn.com/image/fetch/$s_!wEdj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487baf64-b146-4177-b647-e92b33c1cef3_1515x766.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Resolution settings can impact cache hitrate as well. While I don&#8217;t have a direct look at hitrate figures, plotting with the data I do have suggests increasing resolution in the Ungine Valley benchmark tends to depress hitrate.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wWlm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3206e377-786a-4394-8aec-eeaf272ca797_1591x799.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wWlm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3206e377-786a-4394-8aec-eeaf272ca797_1591x799.png 424w, https://substackcdn.com/image/fetch/$s_!wWlm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3206e377-786a-4394-8aec-eeaf272ca797_1591x799.png 848w, https://substackcdn.com/image/fetch/$s_!wWlm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3206e377-786a-4394-8aec-eeaf272ca797_1591x799.png 1272w, https://substackcdn.com/image/fetch/$s_!wWlm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3206e377-786a-4394-8aec-eeaf272ca797_1591x799.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wWlm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3206e377-786a-4394-8aec-eeaf272ca797_1591x799.png" width="1456" height="731" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3206e377-786a-4394-8aec-eeaf272ca797_1591x799.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:731,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wWlm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3206e377-786a-4394-8aec-eeaf272ca797_1591x799.png 424w, https://substackcdn.com/image/fetch/$s_!wWlm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3206e377-786a-4394-8aec-eeaf272ca797_1591x799.png 848w, https://substackcdn.com/image/fetch/$s_!wWlm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3206e377-786a-4394-8aec-eeaf272ca797_1591x799.png 1272w, https://substackcdn.com/image/fetch/$s_!wWlm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3206e377-786a-4394-8aec-eeaf272ca797_1591x799.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>AMD presented a slide at Hot Chips 2021 with three lines for different resolutions across a set of different cache capacities. I obviously can&#8217;t test different cache capacities. But AMD&#8217;s slide does beg the question of how well a 32 MB cache can do at a wider range of resolutions, and whether bandwidth demands remain under control at resolutions higher than what AMD may have optimized the platform for.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1cea!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64463e94-ee70-402d-8866-247c71736042_1278x719.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1cea!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64463e94-ee70-402d-8866-247c71736042_1278x719.png 424w, https://substackcdn.com/image/fetch/$s_!1cea!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64463e94-ee70-402d-8866-247c71736042_1278x719.png 848w, https://substackcdn.com/image/fetch/$s_!1cea!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64463e94-ee70-402d-8866-247c71736042_1278x719.png 1272w, https://substackcdn.com/image/fetch/$s_!1cea!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64463e94-ee70-402d-8866-247c71736042_1278x719.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1cea!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64463e94-ee70-402d-8866-247c71736042_1278x719.png" width="1278" height="719" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/64463e94-ee70-402d-8866-247c71736042_1278x719.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:719,&quot;width&quot;:1278,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1cea!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64463e94-ee70-402d-8866-247c71736042_1278x719.png 424w, https://substackcdn.com/image/fetch/$s_!1cea!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64463e94-ee70-402d-8866-247c71736042_1278x719.png 848w, https://substackcdn.com/image/fetch/$s_!1cea!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64463e94-ee70-402d-8866-247c71736042_1278x719.png 1272w, https://substackcdn.com/image/fetch/$s_!1cea!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64463e94-ee70-402d-8866-247c71736042_1278x719.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Ungine Superposition and 3DMark workloads can both target a variety of resolutions without regard to monitor resolution. Logging data throughout benchmark runs at different resolutions shows the 32 MB cache providing reasonable &#8220;bandwidth amplification&#8221; at reasonable resolutions. At unreasonable resolutions, the cache is still able to do something. But it&#8217;s not as effective as it was at lower resolutions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3xES!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc7e963-5b57-41e0-8581-3b0aa67a1c97_1354x790.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3xES!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc7e963-5b57-41e0-8581-3b0aa67a1c97_1354x790.png 424w, https://substackcdn.com/image/fetch/$s_!3xES!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc7e963-5b57-41e0-8581-3b0aa67a1c97_1354x790.png 848w, https://substackcdn.com/image/fetch/$s_!3xES!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc7e963-5b57-41e0-8581-3b0aa67a1c97_1354x790.png 1272w, https://substackcdn.com/image/fetch/$s_!3xES!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc7e963-5b57-41e0-8581-3b0aa67a1c97_1354x790.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3xES!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc7e963-5b57-41e0-8581-3b0aa67a1c97_1354x790.png" width="1354" height="790" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/afc7e963-5b57-41e0-8581-3b0aa67a1c97_1354x790.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:790,&quot;width&quot;:1354,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3xES!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc7e963-5b57-41e0-8581-3b0aa67a1c97_1354x790.png 424w, https://substackcdn.com/image/fetch/$s_!3xES!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc7e963-5b57-41e0-8581-3b0aa67a1c97_1354x790.png 848w, https://substackcdn.com/image/fetch/$s_!3xES!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc7e963-5b57-41e0-8581-3b0aa67a1c97_1354x790.png 1272w, https://substackcdn.com/image/fetch/$s_!3xES!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc7e963-5b57-41e0-8581-3b0aa67a1c97_1354x790.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Plotting data over time shows spikes and dips in Infinity Cache efficacy occur at remarkably consistent times, even at different resolutions. 8K is an exception with the Superposition benchmark. The 8K line looks stretched out, possibly because Strix Halo&#8217;s GPU was only able to average a bit over 10 FPS, and time got slowed a bit.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5GSJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb1a153-8e7f-481e-8060-9aae6e95a99f_1600x715.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5GSJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb1a153-8e7f-481e-8060-9aae6e95a99f_1600x715.png 424w, https://substackcdn.com/image/fetch/$s_!5GSJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb1a153-8e7f-481e-8060-9aae6e95a99f_1600x715.png 848w, https://substackcdn.com/image/fetch/$s_!5GSJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb1a153-8e7f-481e-8060-9aae6e95a99f_1600x715.png 1272w, https://substackcdn.com/image/fetch/$s_!5GSJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb1a153-8e7f-481e-8060-9aae6e95a99f_1600x715.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5GSJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb1a153-8e7f-481e-8060-9aae6e95a99f_1600x715.png" width="1456" height="651" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0cb1a153-8e7f-481e-8060-9aae6e95a99f_1600x715.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:651,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5GSJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb1a153-8e7f-481e-8060-9aae6e95a99f_1600x715.png 424w, https://substackcdn.com/image/fetch/$s_!5GSJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb1a153-8e7f-481e-8060-9aae6e95a99f_1600x715.png 848w, https://substackcdn.com/image/fetch/$s_!5GSJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb1a153-8e7f-481e-8060-9aae6e95a99f_1600x715.png 1272w, https://substackcdn.com/image/fetch/$s_!5GSJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cb1a153-8e7f-481e-8060-9aae6e95a99f_1600x715.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If Strix Halo&#8217;s iGPU were capable of delivering over 30 FPS in Superposition, AMD would definitely need a larger cache, more DRAM bandwidth, or a combination of both. Taking a simplistic view and tripling maximum observed 8K bandwidth demands would give a figure just north of 525 GB/s. But Strix Halo&#8217;s iGPU clearly wasn&#8217;t built with such a scenario in mind, and AMD&#8217;s selected combination of cache capacity and DRAM bandwidth works well enough at all tested resolutions. While high resolutions create the most DRAM traffic, 1080P shows the heaviest bandwidth demands at the Infinity Fabric level.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q1FQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a35935a-a75a-4e3a-9148-1f5411cbff0e_1360x753.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q1FQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a35935a-a75a-4e3a-9148-1f5411cbff0e_1360x753.png 424w, https://substackcdn.com/image/fetch/$s_!Q1FQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a35935a-a75a-4e3a-9148-1f5411cbff0e_1360x753.png 848w, https://substackcdn.com/image/fetch/$s_!Q1FQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a35935a-a75a-4e3a-9148-1f5411cbff0e_1360x753.png 1272w, https://substackcdn.com/image/fetch/$s_!Q1FQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a35935a-a75a-4e3a-9148-1f5411cbff0e_1360x753.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q1FQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a35935a-a75a-4e3a-9148-1f5411cbff0e_1360x753.png" width="1360" height="753" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a35935a-a75a-4e3a-9148-1f5411cbff0e_1360x753.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:753,&quot;width&quot;:1360,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q1FQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a35935a-a75a-4e3a-9148-1f5411cbff0e_1360x753.png 424w, https://substackcdn.com/image/fetch/$s_!Q1FQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a35935a-a75a-4e3a-9148-1f5411cbff0e_1360x753.png 848w, https://substackcdn.com/image/fetch/$s_!Q1FQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a35935a-a75a-4e3a-9148-1f5411cbff0e_1360x753.png 1272w, https://substackcdn.com/image/fetch/$s_!Q1FQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a35935a-a75a-4e3a-9148-1f5411cbff0e_1360x753.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>3DMark Wild Life Extreme is a better example because it&#8217;s a lightweight benchmark designed to run on mobile devices. Strix Halo&#8217;s iGPU can average above 30 FPS even when rendering at 8K. Again, DRAM bandwidth demands increase and Infinity Cache becomes less effective as resolution goes up. But Infinity Cache still does enough to keep the chip well clear of DRAM bandwidth limits. Thus the cache does its job, and the bandwidth situation remains under control across a wide range of resolutions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ThiM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d2dd26-ddd4-4246-bfe4-7862813778f1_1406x750.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ThiM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d2dd26-ddd4-4246-bfe4-7862813778f1_1406x750.png 424w, https://substackcdn.com/image/fetch/$s_!ThiM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d2dd26-ddd4-4246-bfe4-7862813778f1_1406x750.png 848w, https://substackcdn.com/image/fetch/$s_!ThiM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d2dd26-ddd4-4246-bfe4-7862813778f1_1406x750.png 1272w, https://substackcdn.com/image/fetch/$s_!ThiM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d2dd26-ddd4-4246-bfe4-7862813778f1_1406x750.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ThiM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d2dd26-ddd4-4246-bfe4-7862813778f1_1406x750.png" width="1406" height="750" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7d2dd26-ddd4-4246-bfe4-7862813778f1_1406x750.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:750,&quot;width&quot;:1406,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ThiM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d2dd26-ddd4-4246-bfe4-7862813778f1_1406x750.png 424w, https://substackcdn.com/image/fetch/$s_!ThiM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d2dd26-ddd4-4246-bfe4-7862813778f1_1406x750.png 848w, https://substackcdn.com/image/fetch/$s_!ThiM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d2dd26-ddd4-4246-bfe4-7862813778f1_1406x750.png 1272w, https://substackcdn.com/image/fetch/$s_!ThiM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7d2dd26-ddd4-4246-bfe4-7862813778f1_1406x750.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Bandwidth demands are more important than hitrates. Curiously though, Wild Life Extreme experiences increasingly severe hitrate dips at higher resolutions around the 16-19 and 45-52 second intervals. Those dips barely show at 1440P or below. Perhaps a 32 MB cache&#8217;s efficacy also shows more variation as resolution increases.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yKG7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43df6977-f633-47de-83eb-5c57ef243782_1600x791.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yKG7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43df6977-f633-47de-83eb-5c57ef243782_1600x791.png 424w, https://substackcdn.com/image/fetch/$s_!yKG7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43df6977-f633-47de-83eb-5c57ef243782_1600x791.png 848w, https://substackcdn.com/image/fetch/$s_!yKG7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43df6977-f633-47de-83eb-5c57ef243782_1600x791.png 1272w, https://substackcdn.com/image/fetch/$s_!yKG7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43df6977-f633-47de-83eb-5c57ef243782_1600x791.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yKG7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43df6977-f633-47de-83eb-5c57ef243782_1600x791.png" width="1456" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/43df6977-f633-47de-83eb-5c57ef243782_1600x791.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yKG7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43df6977-f633-47de-83eb-5c57ef243782_1600x791.png 424w, https://substackcdn.com/image/fetch/$s_!yKG7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43df6977-f633-47de-83eb-5c57ef243782_1600x791.png 848w, https://substackcdn.com/image/fetch/$s_!yKG7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43df6977-f633-47de-83eb-5c57ef243782_1600x791.png 1272w, https://substackcdn.com/image/fetch/$s_!yKG7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43df6977-f633-47de-83eb-5c57ef243782_1600x791.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>A Video?</h1><p>A few errors here - I said Core Coherent Master read/write beats were 32B and 64B/cycle. It&#8217;s not per cycle, it&#8217;s per data beat. And I meant to say reads outnumber writes at the end.</p><div id="youtube2-JACYz0YS_bc" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;JACYz0YS_bc&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/JACYz0YS_bc?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h1>Final Words</h1><p>Chip designers have to tune their designs to perform well across a wide variety of workloads. Cache sizes are one parameter on the list. AMD chose to combine a 32 MB cache with 256 GB/s of DRAM bandwidth, and it seems to do well enough across the workloads I tested. Monitoring at the CS-es and UMCs also supports AMD&#8217;s data showing that higher resolutions tend to depress hitrate. Those results explain why larger GPUs tend to have larger caches and higher DRAM bandwidth.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Jk45!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81c05631-9eea-47b7-b316-8146e03b3e0f_877x320.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Jk45!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81c05631-9eea-47b7-b316-8146e03b3e0f_877x320.png 424w, https://substackcdn.com/image/fetch/$s_!Jk45!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81c05631-9eea-47b7-b316-8146e03b3e0f_877x320.png 848w, https://substackcdn.com/image/fetch/$s_!Jk45!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81c05631-9eea-47b7-b316-8146e03b3e0f_877x320.png 1272w, https://substackcdn.com/image/fetch/$s_!Jk45!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81c05631-9eea-47b7-b316-8146e03b3e0f_877x320.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Jk45!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81c05631-9eea-47b7-b316-8146e03b3e0f_877x320.png" width="877" height="320" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/81c05631-9eea-47b7-b316-8146e03b3e0f_877x320.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:320,&quot;width&quot;:877,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Jk45!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81c05631-9eea-47b7-b316-8146e03b3e0f_877x320.png 424w, https://substackcdn.com/image/fetch/$s_!Jk45!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81c05631-9eea-47b7-b316-8146e03b3e0f_877x320.png 848w, https://substackcdn.com/image/fetch/$s_!Jk45!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81c05631-9eea-47b7-b316-8146e03b3e0f_877x320.png 1272w, https://substackcdn.com/image/fetch/$s_!Jk45!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81c05631-9eea-47b7-b316-8146e03b3e0f_877x320.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>At a higher level, GPU bandwidth demands have been a persistent challenge for large iGPUs and have driven a diverse set of solutions. Over a decade ago, Intel created &#8220;halo&#8221; iGPU variants and paired a 128 MB eDRAM cache while sticking with a typical 128-bit client memory bus. AMD&#8217;s console chips use large and fast DRAM buses. The PS5 is one example. Strix Halo does a bit of both. It combines modest cache capacity with a more DRAM bandwidth than typical client chips, but doesn&#8217;t rely as much on DRAM bandwidth as console chips.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FxEQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32a3cead-075c-4b4e-b4e4-f7c89bbba388_1600x851.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FxEQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32a3cead-075c-4b4e-b4e4-f7c89bbba388_1600x851.png 424w, https://substackcdn.com/image/fetch/$s_!FxEQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32a3cead-075c-4b4e-b4e4-f7c89bbba388_1600x851.png 848w, https://substackcdn.com/image/fetch/$s_!FxEQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32a3cead-075c-4b4e-b4e4-f7c89bbba388_1600x851.png 1272w, https://substackcdn.com/image/fetch/$s_!FxEQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32a3cead-075c-4b4e-b4e4-f7c89bbba388_1600x851.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FxEQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32a3cead-075c-4b4e-b4e4-f7c89bbba388_1600x851.png" width="1456" height="774" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/32a3cead-075c-4b4e-b4e4-f7c89bbba388_1600x851.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:774,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FxEQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32a3cead-075c-4b4e-b4e4-f7c89bbba388_1600x851.png 424w, https://substackcdn.com/image/fetch/$s_!FxEQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32a3cead-075c-4b4e-b4e4-f7c89bbba388_1600x851.png 848w, https://substackcdn.com/image/fetch/$s_!FxEQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32a3cead-075c-4b4e-b4e4-f7c89bbba388_1600x851.png 1272w, https://substackcdn.com/image/fetch/$s_!FxEQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32a3cead-075c-4b4e-b4e4-f7c89bbba388_1600x851.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Those varied approaches to the bandwidth problem are fascinating. Watching how Infinity Cache behaves in various graphics workloads has been fascinating as well. But everything would be so much more fun if AMD&#8217;s tools provided direct data on Infinity Cache hitrates. That applies to both integrated and discrete GPUs. Infinity Cache is a well established part of AMD&#8217;s strategy. It has been around for several generations, and now has a presence in AMD&#8217;s mobile lineup too. I suspect developers would love to see Infinity Cache hitrates in addition to the data on GPU-private caches that AMD&#8217;s current tools show.</p><p>If you like the content then consider heading over to the <a href="https://www.patreon.com/ChipsandCheese">Patreon</a> or <a href="https://www.paypal.com/donate/?hosted_button_id=4EMPH66SBGVSQ">PayPal</a> if you want to toss a few bucks to Chips and Cheese. Also consider joining the <a href="https://discord.gg/TwVnRhxgY2">Discord</a>.</p>]]></content:encoded></item><item><title><![CDATA[AMD’s Chiplet APU: An Overview of Strix Halo]]></title><description><![CDATA[Hello you fine Internet folks!]]></description><link>https://chipsandcheese.com/p/amds-chiplet-apu-an-overview-of-strix</link><guid isPermaLink="false">https://chipsandcheese.com/p/amds-chiplet-apu-an-overview-of-strix</guid><dc:creator><![CDATA[George Cozma]]></dc:creator><pubDate>Sat, 18 Oct 2025 03:52:18 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Tqer!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F533709c7-ba55-444e-ae8f-045da6aef137_1200x1600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello you fine Internet folks!</p><p>Today we are looking at AMD&#8217;s largest client APU to date, Strix Halo. This is an APU designed to be a true all-in-one mobile processor, able to handle high end CPU and GPU workloads without compromise. Offering a TDP range of 55W to 120W, the chip targets a far higher power envelope compared to standard Strix Point, but eschews the need for dedicated graphics.</p><p>To get y&#8217;all all caught up on the history and specifications of this APU, AMD first announced Strix Halo at CES 2025 earlier this year to much fanfare. Strix Halo is AMD&#8217;s first chiplet APU in the consumer market with AMD using Strix Halo as a bit of a show piece for what both CPU and GPU performance can look like with a sufficiently large APU.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-bvE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9060b1d-2fda-47a7-8bb8-2fd447f602d0_393x330.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-bvE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9060b1d-2fda-47a7-8bb8-2fd447f602d0_393x330.png 424w, https://substackcdn.com/image/fetch/$s_!-bvE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9060b1d-2fda-47a7-8bb8-2fd447f602d0_393x330.png 848w, https://substackcdn.com/image/fetch/$s_!-bvE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9060b1d-2fda-47a7-8bb8-2fd447f602d0_393x330.png 1272w, https://substackcdn.com/image/fetch/$s_!-bvE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9060b1d-2fda-47a7-8bb8-2fd447f602d0_393x330.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-bvE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9060b1d-2fda-47a7-8bb8-2fd447f602d0_393x330.png" width="393" height="330" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a9060b1d-2fda-47a7-8bb8-2fd447f602d0_393x330.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:330,&quot;width&quot;:393,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-bvE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9060b1d-2fda-47a7-8bb8-2fd447f602d0_393x330.png 424w, https://substackcdn.com/image/fetch/$s_!-bvE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9060b1d-2fda-47a7-8bb8-2fd447f602d0_393x330.png 848w, https://substackcdn.com/image/fetch/$s_!-bvE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9060b1d-2fda-47a7-8bb8-2fd447f602d0_393x330.png 1272w, https://substackcdn.com/image/fetch/$s_!-bvE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9060b1d-2fda-47a7-8bb8-2fd447f602d0_393x330.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>AMD&#8217;s Strix Halo can be equipped with dual 8 core Zen 5 CCDs for a total of 16 cores that feature the same 512b FPU as the desktop parts. This is a change from the more mainstream and monolithic Strix Point APU which has &#8220;double-pumped&#8221; 256b FPUs similar to Zen 4 for use with AVX512 code. What is similar to the more mainstream Strix Point is the same 5.1GHz max boost clock which is a 600MHz deficit compared to the desktop flagship Zen 5 CPU, the Ryzen 9 9950X.</p><p>Moving to the 3rd die on a Strix Halo package, a RDNA 3.5 iGPU takes up the majority of the SoC die with 40 compute units, 32MB of Infinity Cache, and a boost clock of up to 2.9GHz placing raw compute capability somewhere between the RX 7600 XT and RX 7700.</p><p>To feed this chip, AMD has equipped Strix Halo with a 256b LPDDR5X-8000 memory bus, which provides up to 256GB/s shared between all of the components. This is slightly lower than the 288GB/s available to the RX 7600 XT but is much higher than any other APU we have tested.</p><h2>Acknowledgments</h2><p>A massive thank you to both <a href="https://rog.asus.com/laptops/rog-flow/rog-flow-z13-2025/">Asus</a> and <a href="https://www.hp.com/us-en/workstations/zbook-ultra.html">HP</a> for sending over a <a href="https://rog.asus.com/laptops/rog-flow/rog-flow-z13-2025/">ROG Flow Z13 (2025)</a> and a <a href="https://www.hp.com/us-en/workstations/zbook-ultra.html">ZBook Ultra G1a 14&#8221;</a> for testing which were both equipped with an <a href="https://www.amd.com/en/products/processors/laptop/ryzen/ai-300-series/amd-ryzen-ai-max-plus-395.html">AMD Ryzen AI Max+ 395</a>. All of the gaming tests were done on the <a href="https://rog.asus.com/laptops/rog-flow/rog-flow-z13-2025/">Flow Z13</a> due to that being a more gaming focused device and all of the microbenchmarking was done on the <a href="https://www.hp.com/us-en/workstations/zbook-ultra.html">ZBook Ultra G1a</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Tqer!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F533709c7-ba55-444e-ae8f-045da6aef137_1200x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Tqer!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F533709c7-ba55-444e-ae8f-045da6aef137_1200x1600.png 424w, https://substackcdn.com/image/fetch/$s_!Tqer!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F533709c7-ba55-444e-ae8f-045da6aef137_1200x1600.png 848w, https://substackcdn.com/image/fetch/$s_!Tqer!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F533709c7-ba55-444e-ae8f-045da6aef137_1200x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!Tqer!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F533709c7-ba55-444e-ae8f-045da6aef137_1200x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Tqer!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F533709c7-ba55-444e-ae8f-045da6aef137_1200x1600.png" width="1200" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/533709c7-ba55-444e-ae8f-045da6aef137_1200x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Tqer!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F533709c7-ba55-444e-ae8f-045da6aef137_1200x1600.png 424w, https://substackcdn.com/image/fetch/$s_!Tqer!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F533709c7-ba55-444e-ae8f-045da6aef137_1200x1600.png 848w, https://substackcdn.com/image/fetch/$s_!Tqer!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F533709c7-ba55-444e-ae8f-045da6aef137_1200x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!Tqer!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F533709c7-ba55-444e-ae8f-045da6aef137_1200x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.hp.com/us-en/workstations/zbook-ultra.html">HP ZBook Ultra G1a 14&#8221;</a> which has become my daily go-to laptop for the past few months. </figcaption></figure></div><h2>Memory Subsystem from the CPU&#8217;s Perspective</h2><p>Starting with the memory latency from Zen 5&#8217;s perspective, we see that the latency difference between Strix Point and Strix Halo is negligible with Strix Point at ~128ns of memory latency and Strix Halo at ~123ns of memory latency. However, as you can see the CPU does not have access to the 32MB of Infinity Cache on the IO die. This behavior was confirmed by Mahesh Subramony <a href="https://www.youtube.com/watch?v=yR4BwzgwQns">during our interview about Strix Halo at CES 2025</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Owcs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba5e1c7e-1a60-4c52-b8e5-021baaeafd81_1600x677.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Owcs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba5e1c7e-1a60-4c52-b8e5-021baaeafd81_1600x677.png 424w, https://substackcdn.com/image/fetch/$s_!Owcs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba5e1c7e-1a60-4c52-b8e5-021baaeafd81_1600x677.png 848w, https://substackcdn.com/image/fetch/$s_!Owcs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba5e1c7e-1a60-4c52-b8e5-021baaeafd81_1600x677.png 1272w, https://substackcdn.com/image/fetch/$s_!Owcs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba5e1c7e-1a60-4c52-b8e5-021baaeafd81_1600x677.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Owcs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba5e1c7e-1a60-4c52-b8e5-021baaeafd81_1600x677.png" width="1456" height="616" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ba5e1c7e-1a60-4c52-b8e5-021baaeafd81_1600x677.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:616,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Owcs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba5e1c7e-1a60-4c52-b8e5-021baaeafd81_1600x677.png 424w, https://substackcdn.com/image/fetch/$s_!Owcs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba5e1c7e-1a60-4c52-b8e5-021baaeafd81_1600x677.png 848w, https://substackcdn.com/image/fetch/$s_!Owcs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba5e1c7e-1a60-4c52-b8e5-021baaeafd81_1600x677.png 1272w, https://substackcdn.com/image/fetch/$s_!Owcs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba5e1c7e-1a60-4c52-b8e5-021baaeafd81_1600x677.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>While the 123ns DRAM latency seen here is quite good for a mobile part, desktop processors like our 9950X fare much better at 75-80ns.</p><p>Moving on to memory bandwidth, we see Strix Halo fall into a category of its own of the SoCs we have tested.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!25fP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1815f48c-8358-42de-b3eb-afbfef1924be_1118x686.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!25fP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1815f48c-8358-42de-b3eb-afbfef1924be_1118x686.png 424w, https://substackcdn.com/image/fetch/$s_!25fP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1815f48c-8358-42de-b3eb-afbfef1924be_1118x686.png 848w, https://substackcdn.com/image/fetch/$s_!25fP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1815f48c-8358-42de-b3eb-afbfef1924be_1118x686.png 1272w, https://substackcdn.com/image/fetch/$s_!25fP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1815f48c-8358-42de-b3eb-afbfef1924be_1118x686.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!25fP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1815f48c-8358-42de-b3eb-afbfef1924be_1118x686.png" width="1118" height="686" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1815f48c-8358-42de-b3eb-afbfef1924be_1118x686.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:686,&quot;width&quot;:1118,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!25fP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1815f48c-8358-42de-b3eb-afbfef1924be_1118x686.png 424w, https://substackcdn.com/image/fetch/$s_!25fP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1815f48c-8358-42de-b3eb-afbfef1924be_1118x686.png 848w, https://substackcdn.com/image/fetch/$s_!25fP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1815f48c-8358-42de-b3eb-afbfef1924be_1118x686.png 1272w, https://substackcdn.com/image/fetch/$s_!25fP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1815f48c-8358-42de-b3eb-afbfef1924be_1118x686.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>When doing read-modify-add operations across both CCDs, the 16 Zen 5 cores can pull over 175GB/s of bandwidth from the memory with reads being no slouch at 124GB/s across both CCDs.</p><p>However, looking at the bandwidth of a single CCD and just like the desktop CPUs a single Strix-Halo CCD only has a 32 byte per cycle read link to the IO die. And just like the desktop chips, the chip to chip link runs at ~2000MHz, which caps out the single CCD read at 64GB/s. Unlike the desktop chips, the write link is 32 bytes per cycle and we are seeing about 43GB/s for the write bandwidth. That brings the total theoretical single CCD bandwidth to 128GB/s and the observed bandwidth is just over 103GB/s.</p><h2>CPU&#8217;s Performance</h2><p>The performance of Strix Halo&#8217;s CPU packs quite a bit more of a punch than Strix Point&#8217;s CPU.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Z6Dv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69ab58b0-4d2e-4427-a4f7-8bd2ade7f177_702x466.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Z6Dv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69ab58b0-4d2e-4427-a4f7-8bd2ade7f177_702x466.png 424w, https://substackcdn.com/image/fetch/$s_!Z6Dv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69ab58b0-4d2e-4427-a4f7-8bd2ade7f177_702x466.png 848w, https://substackcdn.com/image/fetch/$s_!Z6Dv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69ab58b0-4d2e-4427-a4f7-8bd2ade7f177_702x466.png 1272w, https://substackcdn.com/image/fetch/$s_!Z6Dv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69ab58b0-4d2e-4427-a4f7-8bd2ade7f177_702x466.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Z6Dv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69ab58b0-4d2e-4427-a4f7-8bd2ade7f177_702x466.png" width="702" height="466" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/69ab58b0-4d2e-4427-a4f7-8bd2ade7f177_702x466.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:466,&quot;width&quot;:702,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Z6Dv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69ab58b0-4d2e-4427-a4f7-8bd2ade7f177_702x466.png 424w, https://substackcdn.com/image/fetch/$s_!Z6Dv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69ab58b0-4d2e-4427-a4f7-8bd2ade7f177_702x466.png 848w, https://substackcdn.com/image/fetch/$s_!Z6Dv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69ab58b0-4d2e-4427-a4f7-8bd2ade7f177_702x466.png 1272w, https://substackcdn.com/image/fetch/$s_!Z6Dv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69ab58b0-4d2e-4427-a4f7-8bd2ade7f177_702x466.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Strix Halo&#8217;s CPU can match a last generation desktop flagship CPU, the 7950X, in Integer performance despite a 11.7% clock speed delta. And nearly matches AMD current desktop flagship CPU, the 9950X, in Floating Point performance again with a 11.7% clock speed deficit.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9TkE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F832bf5e9-8026-44f7-97f8-ac5b97836c35_1197x1103.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9TkE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F832bf5e9-8026-44f7-97f8-ac5b97836c35_1197x1103.png 424w, https://substackcdn.com/image/fetch/$s_!9TkE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F832bf5e9-8026-44f7-97f8-ac5b97836c35_1197x1103.png 848w, https://substackcdn.com/image/fetch/$s_!9TkE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F832bf5e9-8026-44f7-97f8-ac5b97836c35_1197x1103.png 1272w, https://substackcdn.com/image/fetch/$s_!9TkE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F832bf5e9-8026-44f7-97f8-ac5b97836c35_1197x1103.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9TkE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F832bf5e9-8026-44f7-97f8-ac5b97836c35_1197x1103.png" width="1197" height="1103" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/832bf5e9-8026-44f7-97f8-ac5b97836c35_1197x1103.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1103,&quot;width&quot;:1197,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9TkE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F832bf5e9-8026-44f7-97f8-ac5b97836c35_1197x1103.png 424w, https://substackcdn.com/image/fetch/$s_!9TkE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F832bf5e9-8026-44f7-97f8-ac5b97836c35_1197x1103.png 848w, https://substackcdn.com/image/fetch/$s_!9TkE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F832bf5e9-8026-44f7-97f8-ac5b97836c35_1197x1103.png 1272w, https://substackcdn.com/image/fetch/$s_!9TkE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F832bf5e9-8026-44f7-97f8-ac5b97836c35_1197x1103.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Looking at the SPEC CPU 2017 Integer subtests and while Strix Halo can&#8217;t quite match the desktop 9950X, likely due to the higher memory latency of Strix Halo&#8217;s LPDDR5X bus, it does get close in a number of subtests.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HOnd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F198f15be-ca4e-478d-bb92-a3e1e854bb84_1444x1308.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HOnd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F198f15be-ca4e-478d-bb92-a3e1e854bb84_1444x1308.png 424w, https://substackcdn.com/image/fetch/$s_!HOnd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F198f15be-ca4e-478d-bb92-a3e1e854bb84_1444x1308.png 848w, https://substackcdn.com/image/fetch/$s_!HOnd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F198f15be-ca4e-478d-bb92-a3e1e854bb84_1444x1308.png 1272w, https://substackcdn.com/image/fetch/$s_!HOnd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F198f15be-ca4e-478d-bb92-a3e1e854bb84_1444x1308.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HOnd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F198f15be-ca4e-478d-bb92-a3e1e854bb84_1444x1308.png" width="1444" height="1308" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/198f15be-ca4e-478d-bb92-a3e1e854bb84_1444x1308.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1308,&quot;width&quot;:1444,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HOnd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F198f15be-ca4e-478d-bb92-a3e1e854bb84_1444x1308.png 424w, https://substackcdn.com/image/fetch/$s_!HOnd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F198f15be-ca4e-478d-bb92-a3e1e854bb84_1444x1308.png 848w, https://substackcdn.com/image/fetch/$s_!HOnd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F198f15be-ca4e-478d-bb92-a3e1e854bb84_1444x1308.png 1272w, https://substackcdn.com/image/fetch/$s_!HOnd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F198f15be-ca4e-478d-bb92-a3e1e854bb84_1444x1308.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Moving to the FP subtests and the story is similar to the Integer subtests but Strix Halo can get even closer to the 9950X and even beat it in the fotonik3d subtest.</p><h2>Memory from the GPU&#8217;s Perspective</h2><p>Moving to the GPU side of things and this is where Strix Halo really shines. The laptop we used as a comparison to Strix Halo was the HP Omen Transcend 14 2025 with a 5070M equipped which maxed out at about 75 Watts for the GPU.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yuGY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5dd2993-0c6f-4e5b-bd1b-907abeacc74c_1063x626.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yuGY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5dd2993-0c6f-4e5b-bd1b-907abeacc74c_1063x626.png 424w, https://substackcdn.com/image/fetch/$s_!yuGY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5dd2993-0c6f-4e5b-bd1b-907abeacc74c_1063x626.png 848w, https://substackcdn.com/image/fetch/$s_!yuGY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5dd2993-0c6f-4e5b-bd1b-907abeacc74c_1063x626.png 1272w, https://substackcdn.com/image/fetch/$s_!yuGY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5dd2993-0c6f-4e5b-bd1b-907abeacc74c_1063x626.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yuGY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5dd2993-0c6f-4e5b-bd1b-907abeacc74c_1063x626.png" width="1063" height="626" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a5dd2993-0c6f-4e5b-bd1b-907abeacc74c_1063x626.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:626,&quot;width&quot;:1063,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yuGY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5dd2993-0c6f-4e5b-bd1b-907abeacc74c_1063x626.png 424w, https://substackcdn.com/image/fetch/$s_!yuGY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5dd2993-0c6f-4e5b-bd1b-907abeacc74c_1063x626.png 848w, https://substackcdn.com/image/fetch/$s_!yuGY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5dd2993-0c6f-4e5b-bd1b-907abeacc74c_1063x626.png 1272w, https://substackcdn.com/image/fetch/$s_!yuGY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5dd2993-0c6f-4e5b-bd1b-907abeacc74c_1063x626.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Strix Halo has over double the memory bandwidth of any of the other mobile SoCs that we have tested. However, the RTX 5070 Mobile does have about 50% more memory bandwidth compared to Strix Halo.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GL83!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0605e91d-2326-4e93-a6c5-d2ee5d093fee_1600x898.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GL83!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0605e91d-2326-4e93-a6c5-d2ee5d093fee_1600x898.png 424w, https://substackcdn.com/image/fetch/$s_!GL83!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0605e91d-2326-4e93-a6c5-d2ee5d093fee_1600x898.png 848w, https://substackcdn.com/image/fetch/$s_!GL83!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0605e91d-2326-4e93-a6c5-d2ee5d093fee_1600x898.png 1272w, https://substackcdn.com/image/fetch/$s_!GL83!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0605e91d-2326-4e93-a6c5-d2ee5d093fee_1600x898.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GL83!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0605e91d-2326-4e93-a6c5-d2ee5d093fee_1600x898.png" width="1456" height="817" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0605e91d-2326-4e93-a6c5-d2ee5d093fee_1600x898.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:817,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GL83!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0605e91d-2326-4e93-a6c5-d2ee5d093fee_1600x898.png 424w, https://substackcdn.com/image/fetch/$s_!GL83!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0605e91d-2326-4e93-a6c5-d2ee5d093fee_1600x898.png 848w, https://substackcdn.com/image/fetch/$s_!GL83!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0605e91d-2326-4e93-a6c5-d2ee5d093fee_1600x898.png 1272w, https://substackcdn.com/image/fetch/$s_!GL83!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0605e91d-2326-4e93-a6c5-d2ee5d093fee_1600x898.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Looking at the caches of Strix Halo, the Infinity Cache, AKA MALL, is able to deliver over 40% higher bandwidth compared to the 5070M&#8217;s L2 while having 33% more capacity. Plus Strix Halo has a 4MB L2 which is capable of providing 2.5TB/s of bandwidth to the GPU.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LHJo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f51b0-7857-42d4-8980-eefdef474d8a_1600x897.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LHJo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f51b0-7857-42d4-8980-eefdef474d8a_1600x897.png 424w, https://substackcdn.com/image/fetch/$s_!LHJo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f51b0-7857-42d4-8980-eefdef474d8a_1600x897.png 848w, https://substackcdn.com/image/fetch/$s_!LHJo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f51b0-7857-42d4-8980-eefdef474d8a_1600x897.png 1272w, https://substackcdn.com/image/fetch/$s_!LHJo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f51b0-7857-42d4-8980-eefdef474d8a_1600x897.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LHJo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f51b0-7857-42d4-8980-eefdef474d8a_1600x897.png" width="1456" height="816" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/925f51b0-7857-42d4-8980-eefdef474d8a_1600x897.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:816,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LHJo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f51b0-7857-42d4-8980-eefdef474d8a_1600x897.png 424w, https://substackcdn.com/image/fetch/$s_!LHJo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f51b0-7857-42d4-8980-eefdef474d8a_1600x897.png 848w, https://substackcdn.com/image/fetch/$s_!LHJo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f51b0-7857-42d4-8980-eefdef474d8a_1600x897.png 1272w, https://substackcdn.com/image/fetch/$s_!LHJo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f51b0-7857-42d4-8980-eefdef474d8a_1600x897.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Moving to the latency, the more complex cache layout of Strix Halo does give it a latency advantage after the 128KB with Strix Halo&#8217;s L2 being significantly lower latency than the 5070M&#8217;s L2 and the larger 32MB MALL that Strix Halo has a similar latency to the 5070M&#8217;s L2. And Strix Halo&#8217;s memory latency is about 35% lower than the 5070M&#8217;s memory latency.</p><h2>The GPU&#8217;s Compute Throughput</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_JSa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1657e5b-6b0a-4c0e-bb38-2c76bd9eb782_1600x1395.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_JSa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1657e5b-6b0a-4c0e-bb38-2c76bd9eb782_1600x1395.png 424w, https://substackcdn.com/image/fetch/$s_!_JSa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1657e5b-6b0a-4c0e-bb38-2c76bd9eb782_1600x1395.png 848w, https://substackcdn.com/image/fetch/$s_!_JSa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1657e5b-6b0a-4c0e-bb38-2c76bd9eb782_1600x1395.png 1272w, https://substackcdn.com/image/fetch/$s_!_JSa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1657e5b-6b0a-4c0e-bb38-2c76bd9eb782_1600x1395.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_JSa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1657e5b-6b0a-4c0e-bb38-2c76bd9eb782_1600x1395.png" width="1456" height="1269" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c1657e5b-6b0a-4c0e-bb38-2c76bd9eb782_1600x1395.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1269,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_JSa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1657e5b-6b0a-4c0e-bb38-2c76bd9eb782_1600x1395.png 424w, https://substackcdn.com/image/fetch/$s_!_JSa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1657e5b-6b0a-4c0e-bb38-2c76bd9eb782_1600x1395.png 848w, https://substackcdn.com/image/fetch/$s_!_JSa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1657e5b-6b0a-4c0e-bb38-2c76bd9eb782_1600x1395.png 1272w, https://substackcdn.com/image/fetch/$s_!_JSa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1657e5b-6b0a-4c0e-bb38-2c76bd9eb782_1600x1395.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Looking at the floating point throughput, we see that Strix Halo unsurprisingly has about 2.5x the throughput of Strix Point considering it has about two and a half times the number of Compute Units. Strix Halo oftentimes can match or even pull ahead of the 5070 Mobile in terms of throughput. I will note that the FP16 results for the 5070 Mobile are half of what I would expect; the FP16:FP32 ratio for the 5070 Mobile should be 1:1 so I am not positive about what is going on there.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!S0IM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9847d87a-7616-4060-affc-34b4358d07cb_1217x1017.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!S0IM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9847d87a-7616-4060-affc-34b4358d07cb_1217x1017.png 424w, https://substackcdn.com/image/fetch/$s_!S0IM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9847d87a-7616-4060-affc-34b4358d07cb_1217x1017.png 848w, https://substackcdn.com/image/fetch/$s_!S0IM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9847d87a-7616-4060-affc-34b4358d07cb_1217x1017.png 1272w, https://substackcdn.com/image/fetch/$s_!S0IM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9847d87a-7616-4060-affc-34b4358d07cb_1217x1017.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!S0IM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9847d87a-7616-4060-affc-34b4358d07cb_1217x1017.png" width="1217" height="1017" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9847d87a-7616-4060-affc-34b4358d07cb_1217x1017.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1017,&quot;width&quot;:1217,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:82845,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/176468031?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9847d87a-7616-4060-affc-34b4358d07cb_1217x1017.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!S0IM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9847d87a-7616-4060-affc-34b4358d07cb_1217x1017.png 424w, https://substackcdn.com/image/fetch/$s_!S0IM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9847d87a-7616-4060-affc-34b4358d07cb_1217x1017.png 848w, https://substackcdn.com/image/fetch/$s_!S0IM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9847d87a-7616-4060-affc-34b4358d07cb_1217x1017.png 1272w, https://substackcdn.com/image/fetch/$s_!S0IM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9847d87a-7616-4060-affc-34b4358d07cb_1217x1017.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Moving to the integer throughput and we see the 5070 Mobile soundly pulling ahead of the Radeon 8060S.</p><h2>GPU Performance</h2><p>Looking at the GPU performance, we see Strix Halo once again shine, with a staggering level of performance available for an iGPU, courtesy of the large CU count paired with relatively high memory bandwidth. Our comparison suite includes several recent iGPU&#8217;s from Intel/AMD, along with the newest generation RTX 5070 Mobile @ 75W to act as a reference for mid to high-range laptop dedicated graphics, and the antiquated GTX 1050 as a reference for budget dedicated graphics.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5BSx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc604818-3f17-4001-9812-c25b9ac6acd2_1209x737.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5BSx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc604818-3f17-4001-9812-c25b9ac6acd2_1209x737.png 424w, https://substackcdn.com/image/fetch/$s_!5BSx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc604818-3f17-4001-9812-c25b9ac6acd2_1209x737.png 848w, https://substackcdn.com/image/fetch/$s_!5BSx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc604818-3f17-4001-9812-c25b9ac6acd2_1209x737.png 1272w, https://substackcdn.com/image/fetch/$s_!5BSx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc604818-3f17-4001-9812-c25b9ac6acd2_1209x737.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5BSx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc604818-3f17-4001-9812-c25b9ac6acd2_1209x737.png" width="1209" height="737" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc604818-3f17-4001-9812-c25b9ac6acd2_1209x737.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:737,&quot;width&quot;:1209,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5BSx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc604818-3f17-4001-9812-c25b9ac6acd2_1209x737.png 424w, https://substackcdn.com/image/fetch/$s_!5BSx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc604818-3f17-4001-9812-c25b9ac6acd2_1209x737.png 848w, https://substackcdn.com/image/fetch/$s_!5BSx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc604818-3f17-4001-9812-c25b9ac6acd2_1209x737.png 1272w, https://substackcdn.com/image/fetch/$s_!5BSx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc604818-3f17-4001-9812-c25b9ac6acd2_1209x737.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Looking at Fluid X3D for a compute-heavy workload, we can see the Radeon 8060S absolutely obliterates the other iGPU&#8217;s from Intel/AMD, putting itself firmly in a class above. The 5070 is no slouch though, and still holds a substantial 64.1% lead largely due to the higher memory bandwidth afforded to the 5070M.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!A1AB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4c7364d-3abd-48c1-b000-8b8ba1df9728_1090x626.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!A1AB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4c7364d-3abd-48c1-b000-8b8ba1df9728_1090x626.png 424w, https://substackcdn.com/image/fetch/$s_!A1AB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4c7364d-3abd-48c1-b000-8b8ba1df9728_1090x626.png 848w, https://substackcdn.com/image/fetch/$s_!A1AB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4c7364d-3abd-48c1-b000-8b8ba1df9728_1090x626.png 1272w, https://substackcdn.com/image/fetch/$s_!A1AB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4c7364d-3abd-48c1-b000-8b8ba1df9728_1090x626.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!A1AB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4c7364d-3abd-48c1-b000-8b8ba1df9728_1090x626.png" width="1090" height="626" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f4c7364d-3abd-48c1-b000-8b8ba1df9728_1090x626.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:626,&quot;width&quot;:1090,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42437,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/176468031?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4c7364d-3abd-48c1-b000-8b8ba1df9728_1090x626.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!A1AB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4c7364d-3abd-48c1-b000-8b8ba1df9728_1090x626.png 424w, https://substackcdn.com/image/fetch/$s_!A1AB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4c7364d-3abd-48c1-b000-8b8ba1df9728_1090x626.png 848w, https://substackcdn.com/image/fetch/$s_!A1AB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4c7364d-3abd-48c1-b000-8b8ba1df9728_1090x626.png 1272w, https://substackcdn.com/image/fetch/$s_!A1AB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4c7364d-3abd-48c1-b000-8b8ba1df9728_1090x626.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Switching to gaming workloads with Cyberpunk 2077, we start with a benchmark conducted while on battery power. The gap with other iGPU&#8217;s is still wide, but now the 5070M is limited to 55W and exhibits 7.5% worse performance at 1080p low settings when compared to the Radeon 8060S.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!G3FB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55f044e0-25a4-4a2f-9976-73b738ac08dc_1090x626.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!G3FB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55f044e0-25a4-4a2f-9976-73b738ac08dc_1090x626.png 424w, https://substackcdn.com/image/fetch/$s_!G3FB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55f044e0-25a4-4a2f-9976-73b738ac08dc_1090x626.png 848w, https://substackcdn.com/image/fetch/$s_!G3FB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55f044e0-25a4-4a2f-9976-73b738ac08dc_1090x626.png 1272w, https://substackcdn.com/image/fetch/$s_!G3FB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55f044e0-25a4-4a2f-9976-73b738ac08dc_1090x626.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!G3FB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55f044e0-25a4-4a2f-9976-73b738ac08dc_1090x626.png" width="1090" height="626" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/55f044e0-25a4-4a2f-9976-73b738ac08dc_1090x626.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:626,&quot;width&quot;:1090,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:38953,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/176468031?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55f044e0-25a4-4a2f-9976-73b738ac08dc_1090x626.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!G3FB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55f044e0-25a4-4a2f-9976-73b738ac08dc_1090x626.png 424w, https://substackcdn.com/image/fetch/$s_!G3FB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55f044e0-25a4-4a2f-9976-73b738ac08dc_1090x626.png 848w, https://substackcdn.com/image/fetch/$s_!G3FB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55f044e0-25a4-4a2f-9976-73b738ac08dc_1090x626.png 1272w, https://substackcdn.com/image/fetch/$s_!G3FB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55f044e0-25a4-4a2f-9976-73b738ac08dc_1090x626.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Finally, moving to wall power and allowing both the Radeon 8060S and 5070M to access the full power limit in CP2077, we can see that the 8060S still pulls ahead at 1080p low by 2.5%, while at 1440p medium we see a reversal, with the 5070M commanding an 8.3% lead. Overall the two provide a comparable experience in Cyberpunk 2077, with changes in settings or power limits adjusting the lead between the two. This is a seriously impressive turnaround for an iGPU working against dedicated graphics, and demonstrates the versatility of the chip in workloads like gaming, where iGPU&#8217;s have traditionally struggled.</p><h2>Conclusion</h2><p>Strix Halo follows in the footsteps of many other companies in the goal of designing an SoC for desktop and laptop usage that is truly all encompassing. The CPU and GPU performance is truly a class above standard low power laptop chips, and is even able to compete with larger systems boasting dedicated graphics. CPU performance is especially impressive with a comparable showing to the desktop Zen 5 CPUs. GPU performance is comparable to mid range dedicated graphics, while still offering the efficiency and integration of an iGPU. High end dedicated graphics still have a place above Strix Halo, but the versatility of this design for smaller form factor devices is class leading.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SrML!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3e47456-dd28-42b7-90b6-4d2415364c5e_1600x1008.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SrML!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3e47456-dd28-42b7-90b6-4d2415364c5e_1600x1008.png 424w, https://substackcdn.com/image/fetch/$s_!SrML!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3e47456-dd28-42b7-90b6-4d2415364c5e_1600x1008.png 848w, https://substackcdn.com/image/fetch/$s_!SrML!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3e47456-dd28-42b7-90b6-4d2415364c5e_1600x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!SrML!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3e47456-dd28-42b7-90b6-4d2415364c5e_1600x1008.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SrML!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3e47456-dd28-42b7-90b6-4d2415364c5e_1600x1008.png" width="1456" height="917" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3e47456-dd28-42b7-90b6-4d2415364c5e_1600x1008.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:917,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SrML!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3e47456-dd28-42b7-90b6-4d2415364c5e_1600x1008.png 424w, https://substackcdn.com/image/fetch/$s_!SrML!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3e47456-dd28-42b7-90b6-4d2415364c5e_1600x1008.png 848w, https://substackcdn.com/image/fetch/$s_!SrML!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3e47456-dd28-42b7-90b6-4d2415364c5e_1600x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!SrML!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3e47456-dd28-42b7-90b6-4d2415364c5e_1600x1008.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://rog.asus.com/laptops/rog-flow/rog-flow-z13-2025/">Asus ROG Flow Z13 2025</a> which is a very similar form factor to the Microsoft Surface but has a Strix Halo inside of it instead of a lower power Intel chip. Image credit goes to Chester.</figcaption></figure></div><p>However, this is not to say that Strix Halo is perfect. I was hoping to have a section dedicated to the ML performance of Strix Halo in this article, however AMD only just released preview support for Strix Halo in the ROCm 7.0.2 release which came out about a week ago from time of publication. As a result of the long delay between the launch of Strix Halo and the release of ROCm 7.0.2, the ML performance will have to wait until a future article.</p><p>However, putting aside ROCm, Strix Halo is a very, very cool piece of technology and I would love to see successors to Strix Halo with newer CPU and GPU IP and possibly even larger memory buses similar to Apple&#8217;s Max and Ultra series of SoCs with 512b and 1024b memory respectively. AMD has a formula for building bigger APUs with Strix Halo, which opens the door to a lot of interesting hardware possibilities in the future.</p><p>If you like the content then consider heading over to the <a href="https://www.patreon.com/ChipsandCheese">Patreon</a> or <a href="https://www.paypal.com/donate/?hosted_button_id=4EMPH66SBGVSQ">PayPal</a> if you want to toss a few bucks to Chips and Cheese. Also consider joining the <a href="https://discord.gg/TwVnRhxgY2">Discord</a>.</p>]]></content:encoded></item><item><title><![CDATA[Panther Lake’s Reveal at ITT 2025]]></title><description><![CDATA[Hello you fine Internet folks,]]></description><link>https://chipsandcheese.com/p/panther-lakes-reveal-at-itt-2025</link><guid isPermaLink="false">https://chipsandcheese.com/p/panther-lakes-reveal-at-itt-2025</guid><dc:creator><![CDATA[George Cozma]]></dc:creator><pubDate>Tue, 14 Oct 2025 17:01:17 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/BHssAMUZzew" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello you fine Internet folks,</p><p>At this year&#8217;s ITT, Intel announced their upcoming client SoC, Panther Lake (PTL) along with all of the improvements that they have made to this latest SoC. With a new node, improvements to the CPU cores, a brand new GPU, and more, PTL is an evolution of Intel&#8217;s prior SoC designs. So without further ado, let&#8217;s dive in.<br><br>Hope y&#8217;all enjoy!</p><div id="youtube2-BHssAMUZzew" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;BHssAMUZzew&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/BHssAMUZzew?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2>SoC Layout</h2><p>Starting with the SoC layout of Panther Lake, there are 3 active tiles on a Panther Lake CPU: Compute tile, GPU tile, and the Platform Controller tile.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZdNw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd847f560-b347-43f5-806b-dde08a9c9af1_625x350.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZdNw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd847f560-b347-43f5-806b-dde08a9c9af1_625x350.png 424w, https://substackcdn.com/image/fetch/$s_!ZdNw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd847f560-b347-43f5-806b-dde08a9c9af1_625x350.png 848w, https://substackcdn.com/image/fetch/$s_!ZdNw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd847f560-b347-43f5-806b-dde08a9c9af1_625x350.png 1272w, https://substackcdn.com/image/fetch/$s_!ZdNw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd847f560-b347-43f5-806b-dde08a9c9af1_625x350.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZdNw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd847f560-b347-43f5-806b-dde08a9c9af1_625x350.png" width="625" height="350" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d847f560-b347-43f5-806b-dde08a9c9af1_625x350.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:350,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZdNw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd847f560-b347-43f5-806b-dde08a9c9af1_625x350.png 424w, https://substackcdn.com/image/fetch/$s_!ZdNw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd847f560-b347-43f5-806b-dde08a9c9af1_625x350.png 848w, https://substackcdn.com/image/fetch/$s_!ZdNw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd847f560-b347-43f5-806b-dde08a9c9af1_625x350.png 1272w, https://substackcdn.com/image/fetch/$s_!ZdNw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd847f560-b347-43f5-806b-dde08a9c9af1_625x350.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Starting with the Compute tile, the Compute tile has the main CPU complex, the Low Power CPU complex, the ISP, the NPU, the Media engines, and the 8MB Memory Side Cache alongside the 128b memory bus. The GPU tile only has the GPU compute on board and nothing else. The Platform Controller tile houses all of the I/O IP such as the PCIe, Thunderbolt, USB, etc.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cQlb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d8f9430-5f56-4e93-8522-0b43c5290b73_625x351.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cQlb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d8f9430-5f56-4e93-8522-0b43c5290b73_625x351.png 424w, https://substackcdn.com/image/fetch/$s_!cQlb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d8f9430-5f56-4e93-8522-0b43c5290b73_625x351.png 848w, https://substackcdn.com/image/fetch/$s_!cQlb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d8f9430-5f56-4e93-8522-0b43c5290b73_625x351.png 1272w, https://substackcdn.com/image/fetch/$s_!cQlb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d8f9430-5f56-4e93-8522-0b43c5290b73_625x351.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cQlb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d8f9430-5f56-4e93-8522-0b43c5290b73_625x351.png" width="625" height="351" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d8f9430-5f56-4e93-8522-0b43c5290b73_625x351.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:351,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cQlb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d8f9430-5f56-4e93-8522-0b43c5290b73_625x351.png 424w, https://substackcdn.com/image/fetch/$s_!cQlb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d8f9430-5f56-4e93-8522-0b43c5290b73_625x351.png 848w, https://substackcdn.com/image/fetch/$s_!cQlb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d8f9430-5f56-4e93-8522-0b43c5290b73_625x351.png 1272w, https://substackcdn.com/image/fetch/$s_!cQlb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d8f9430-5f56-4e93-8522-0b43c5290b73_625x351.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Starting with the smallest Panther Lake configuration, we have 4 P-Cores and 4 LP E-Cores paired with 8MB Memory-side Cache on the Compute Tile which is fabricated on Intel&#8217;s newest node 18A. That Compute Tile is paired with a 4 core Xe3 iGPU fabricated on Intel&#8217;s 3 process which makes this the first Intel Arc product fabricated at Intel. The Platform Controller Tile has 12 PCIe lanes split between 4 Gen 5 lanes and 8 Gen 4 lanes along with 4 Thunderbolt 4 controllers, 2 USB 3.2 &amp; 8 USB 2.0 ports, and Wi-Fi 7 and Bluetooth 6.0.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UVJO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14282810-7504-41fe-b096-ccbb712de44c_625x351.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UVJO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14282810-7504-41fe-b096-ccbb712de44c_625x351.png 424w, https://substackcdn.com/image/fetch/$s_!UVJO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14282810-7504-41fe-b096-ccbb712de44c_625x351.png 848w, https://substackcdn.com/image/fetch/$s_!UVJO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14282810-7504-41fe-b096-ccbb712de44c_625x351.png 1272w, https://substackcdn.com/image/fetch/$s_!UVJO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14282810-7504-41fe-b096-ccbb712de44c_625x351.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UVJO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14282810-7504-41fe-b096-ccbb712de44c_625x351.png" width="625" height="351" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/14282810-7504-41fe-b096-ccbb712de44c_625x351.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:351,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UVJO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14282810-7504-41fe-b096-ccbb712de44c_625x351.png 424w, https://substackcdn.com/image/fetch/$s_!UVJO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14282810-7504-41fe-b096-ccbb712de44c_625x351.png 848w, https://substackcdn.com/image/fetch/$s_!UVJO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14282810-7504-41fe-b096-ccbb712de44c_625x351.png 1272w, https://substackcdn.com/image/fetch/$s_!UVJO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14282810-7504-41fe-b096-ccbb712de44c_625x351.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Moving to the middle configuration of Panther Lake, there has been a large change in both the Compute Tile and the Platform Controller Tile. The Compute Tile now has 4 P-Cores and 8 E-Cores which share a 18MB L3 cache with an extra 4 LP E-Cores, all of which can access the 8MB memory-side cache on board the Compute tile.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7hob!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90fa19c-127a-400d-8f8b-154a6c0c6695_625x350.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7hob!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90fa19c-127a-400d-8f8b-154a6c0c6695_625x350.png 424w, https://substackcdn.com/image/fetch/$s_!7hob!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90fa19c-127a-400d-8f8b-154a6c0c6695_625x350.png 848w, https://substackcdn.com/image/fetch/$s_!7hob!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90fa19c-127a-400d-8f8b-154a6c0c6695_625x350.png 1272w, https://substackcdn.com/image/fetch/$s_!7hob!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90fa19c-127a-400d-8f8b-154a6c0c6695_625x350.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7hob!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90fa19c-127a-400d-8f8b-154a6c0c6695_625x350.png" width="625" height="350" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b90fa19c-127a-400d-8f8b-154a6c0c6695_625x350.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:350,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7hob!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90fa19c-127a-400d-8f8b-154a6c0c6695_625x350.png 424w, https://substackcdn.com/image/fetch/$s_!7hob!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90fa19c-127a-400d-8f8b-154a6c0c6695_625x350.png 848w, https://substackcdn.com/image/fetch/$s_!7hob!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90fa19c-127a-400d-8f8b-154a6c0c6695_625x350.png 1272w, https://substackcdn.com/image/fetch/$s_!7hob!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90fa19c-127a-400d-8f8b-154a6c0c6695_625x350.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The Platform Controller tile now has an extra 8 Gen 5 PCIe lanes for a total of 20 PCIe lanes on the SoC. It was not made clear if all versions of Panther Lake use the same Platform Controller Tile fabricated on TSMC N6 or if there are 2 versions of this tile but it seems likely that all Panther Lake SoCs reuse the same physical die but with different configurations enabled.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JOoH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ebe868-2b7f-47fc-8f6a-96c873668d99_625x351.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JOoH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ebe868-2b7f-47fc-8f6a-96c873668d99_625x351.png 424w, https://substackcdn.com/image/fetch/$s_!JOoH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ebe868-2b7f-47fc-8f6a-96c873668d99_625x351.png 848w, https://substackcdn.com/image/fetch/$s_!JOoH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ebe868-2b7f-47fc-8f6a-96c873668d99_625x351.png 1272w, https://substackcdn.com/image/fetch/$s_!JOoH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ebe868-2b7f-47fc-8f6a-96c873668d99_625x351.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JOoH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ebe868-2b7f-47fc-8f6a-96c873668d99_625x351.png" width="625" height="351" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d2ebe868-2b7f-47fc-8f6a-96c873668d99_625x351.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:351,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JOoH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ebe868-2b7f-47fc-8f6a-96c873668d99_625x351.png 424w, https://substackcdn.com/image/fetch/$s_!JOoH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ebe868-2b7f-47fc-8f6a-96c873668d99_625x351.png 848w, https://substackcdn.com/image/fetch/$s_!JOoH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ebe868-2b7f-47fc-8f6a-96c873668d99_625x351.png 1272w, https://substackcdn.com/image/fetch/$s_!JOoH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ebe868-2b7f-47fc-8f6a-96c873668d99_625x351.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Moving to the largest configuration of Panther Lake which keeps the same 4 P-Core, 8 E-Core, and 4 LP E-Core configuration as the middle configuration and drops the I/O configuration to the same as the smallest Panther Lake, but in exchange gains a 12 Xe3 core GPU tile which is fabricated on TSMC&#8217;s N3E process. This is the largest iGPU that Intel has made for a client SoC.</p><h2>Xe3 Architecture</h2><p>For many people the most exciting part of Panther Lake is going to be the brand new Xe3 architecture found in Panther Lake.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5Xsz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f04e08c-788c-4e5e-9c71-b179c6c8c881_625x353.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5Xsz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f04e08c-788c-4e5e-9c71-b179c6c8c881_625x353.png 424w, https://substackcdn.com/image/fetch/$s_!5Xsz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f04e08c-788c-4e5e-9c71-b179c6c8c881_625x353.png 848w, https://substackcdn.com/image/fetch/$s_!5Xsz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f04e08c-788c-4e5e-9c71-b179c6c8c881_625x353.png 1272w, https://substackcdn.com/image/fetch/$s_!5Xsz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f04e08c-788c-4e5e-9c71-b179c6c8c881_625x353.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5Xsz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f04e08c-788c-4e5e-9c71-b179c6c8c881_625x353.png" width="625" height="353" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f04e08c-788c-4e5e-9c71-b179c6c8c881_625x353.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:353,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5Xsz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f04e08c-788c-4e5e-9c71-b179c6c8c881_625x353.png 424w, https://substackcdn.com/image/fetch/$s_!5Xsz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f04e08c-788c-4e5e-9c71-b179c6c8c881_625x353.png 848w, https://substackcdn.com/image/fetch/$s_!5Xsz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f04e08c-788c-4e5e-9c71-b179c6c8c881_625x353.png 1272w, https://substackcdn.com/image/fetch/$s_!5Xsz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f04e08c-788c-4e5e-9c71-b179c6c8c881_625x353.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>However, this new architecture will not be marketed as Intel Arc C-Series but instead will be marketed under Intel Arc B-Series just like the Battlemage series of dGPUs e.g. the B580 and B570. This is quite perplexing due to the new architecture yet the old branding which even Tom Petersen said was &#8220;confusing&#8221;.</p><p>Moving on from the strange marketing decisions, one of the big changes in Xe3 is making the render slice capable of handling up to 6 Xe Cores from the 4 Xe Cores that the Xe2 render slice could support.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T04R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec40abe5-fa3b-460d-8369-11f4577ed492_625x352.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T04R!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec40abe5-fa3b-460d-8369-11f4577ed492_625x352.png 424w, https://substackcdn.com/image/fetch/$s_!T04R!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec40abe5-fa3b-460d-8369-11f4577ed492_625x352.png 848w, https://substackcdn.com/image/fetch/$s_!T04R!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec40abe5-fa3b-460d-8369-11f4577ed492_625x352.png 1272w, https://substackcdn.com/image/fetch/$s_!T04R!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec40abe5-fa3b-460d-8369-11f4577ed492_625x352.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T04R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec40abe5-fa3b-460d-8369-11f4577ed492_625x352.png" width="625" height="352" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec40abe5-fa3b-460d-8369-11f4577ed492_625x352.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:352,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T04R!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec40abe5-fa3b-460d-8369-11f4577ed492_625x352.png 424w, https://substackcdn.com/image/fetch/$s_!T04R!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec40abe5-fa3b-460d-8369-11f4577ed492_625x352.png 848w, https://substackcdn.com/image/fetch/$s_!T04R!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec40abe5-fa3b-460d-8369-11f4577ed492_625x352.png 1272w, https://substackcdn.com/image/fetch/$s_!T04R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec40abe5-fa3b-460d-8369-11f4577ed492_625x352.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The ability of the render slice to accommodate up to 6 Xe Cores allows for the 12 Xe core configuration that is found in the largest version of Panther Lake.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!o3hj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4520d256-2ad2-4bc3-8535-6962a13612ea_625x356.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!o3hj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4520d256-2ad2-4bc3-8535-6962a13612ea_625x356.png 424w, https://substackcdn.com/image/fetch/$s_!o3hj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4520d256-2ad2-4bc3-8535-6962a13612ea_625x356.png 848w, https://substackcdn.com/image/fetch/$s_!o3hj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4520d256-2ad2-4bc3-8535-6962a13612ea_625x356.png 1272w, https://substackcdn.com/image/fetch/$s_!o3hj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4520d256-2ad2-4bc3-8535-6962a13612ea_625x356.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!o3hj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4520d256-2ad2-4bc3-8535-6962a13612ea_625x356.png" width="625" height="356" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4520d256-2ad2-4bc3-8535-6962a13612ea_625x356.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:356,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!o3hj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4520d256-2ad2-4bc3-8535-6962a13612ea_625x356.png 424w, https://substackcdn.com/image/fetch/$s_!o3hj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4520d256-2ad2-4bc3-8535-6962a13612ea_625x356.png 848w, https://substackcdn.com/image/fetch/$s_!o3hj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4520d256-2ad2-4bc3-8535-6962a13612ea_625x356.png 1272w, https://substackcdn.com/image/fetch/$s_!o3hj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4520d256-2ad2-4bc3-8535-6962a13612ea_625x356.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The 16MB L2 found in the 12Xe GPU reduces the traffic on the SoC fabric by up to 36% depending on the workload in question compared to a 8MB L2.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L0a-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2244305e-a942-4979-809d-7ab265c08c7b_625x351.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L0a-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2244305e-a942-4979-809d-7ab265c08c7b_625x351.png 424w, https://substackcdn.com/image/fetch/$s_!L0a-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2244305e-a942-4979-809d-7ab265c08c7b_625x351.png 848w, https://substackcdn.com/image/fetch/$s_!L0a-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2244305e-a942-4979-809d-7ab265c08c7b_625x351.png 1272w, https://substackcdn.com/image/fetch/$s_!L0a-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2244305e-a942-4979-809d-7ab265c08c7b_625x351.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L0a-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2244305e-a942-4979-809d-7ab265c08c7b_625x351.png" width="625" height="351" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2244305e-a942-4979-809d-7ab265c08c7b_625x351.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:351,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L0a-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2244305e-a942-4979-809d-7ab265c08c7b_625x351.png 424w, https://substackcdn.com/image/fetch/$s_!L0a-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2244305e-a942-4979-809d-7ab265c08c7b_625x351.png 848w, https://substackcdn.com/image/fetch/$s_!L0a-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2244305e-a942-4979-809d-7ab265c08c7b_625x351.png 1272w, https://substackcdn.com/image/fetch/$s_!L0a-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2244305e-a942-4979-809d-7ab265c08c7b_625x351.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Going into the Xe Core, the amount of compute throughput that Xe3 has not changed compared to Xe2 with the same 8 512 bit Vector Engines and 8 2048 bit Matrix Engines per Xe Core but the L1/SLM capacity has increased from 192KB to 256KB.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DiNc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122fec25-9c88-455b-8f57-199b2693137a_625x354.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DiNc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122fec25-9c88-455b-8f57-199b2693137a_625x354.png 424w, https://substackcdn.com/image/fetch/$s_!DiNc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122fec25-9c88-455b-8f57-199b2693137a_625x354.png 848w, https://substackcdn.com/image/fetch/$s_!DiNc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122fec25-9c88-455b-8f57-199b2693137a_625x354.png 1272w, https://substackcdn.com/image/fetch/$s_!DiNc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122fec25-9c88-455b-8f57-199b2693137a_625x354.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DiNc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122fec25-9c88-455b-8f57-199b2693137a_625x354.png" width="625" height="354" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/122fec25-9c88-455b-8f57-199b2693137a_625x354.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:354,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DiNc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122fec25-9c88-455b-8f57-199b2693137a_625x354.png 424w, https://substackcdn.com/image/fetch/$s_!DiNc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122fec25-9c88-455b-8f57-199b2693137a_625x354.png 848w, https://substackcdn.com/image/fetch/$s_!DiNc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122fec25-9c88-455b-8f57-199b2693137a_625x354.png 1272w, https://substackcdn.com/image/fetch/$s_!DiNc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122fec25-9c88-455b-8f57-199b2693137a_625x354.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>And while Intel hasn&#8217;t increased the compute throughput of Xe3, they have made some large improvements to the frontend capabilities of Xe3. A XVE can now track up to 10 threads up from the 8 threads in Xe2&#8217;s XVE along with the single largest improvement being what Intel is calling &#8220;Variable Register Allocation&#8221;.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Wesx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F245ba18e-d80b-4917-9de8-f3a74c211120_624x351.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Wesx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F245ba18e-d80b-4917-9de8-f3a74c211120_624x351.png 424w, https://substackcdn.com/image/fetch/$s_!Wesx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F245ba18e-d80b-4917-9de8-f3a74c211120_624x351.png 848w, https://substackcdn.com/image/fetch/$s_!Wesx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F245ba18e-d80b-4917-9de8-f3a74c211120_624x351.png 1272w, https://substackcdn.com/image/fetch/$s_!Wesx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F245ba18e-d80b-4917-9de8-f3a74c211120_624x351.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Wesx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F245ba18e-d80b-4917-9de8-f3a74c211120_624x351.png" width="624" height="351" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/245ba18e-d80b-4917-9de8-f3a74c211120_624x351.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:351,&quot;width&quot;:624,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Wesx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F245ba18e-d80b-4917-9de8-f3a74c211120_624x351.png 424w, https://substackcdn.com/image/fetch/$s_!Wesx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F245ba18e-d80b-4917-9de8-f3a74c211120_624x351.png 848w, https://substackcdn.com/image/fetch/$s_!Wesx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F245ba18e-d80b-4917-9de8-f3a74c211120_624x351.png 1272w, https://substackcdn.com/image/fetch/$s_!Wesx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F245ba18e-d80b-4917-9de8-f3a74c211120_624x351.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In Xe1 and prior Intel GPUs, a thread would be allocated 128 registers regardless of if that thread needed all 128 registers or not. In Xe2, Intel added the &#8220;Large GRF&#8221; mode which allows a thread to pick between 128 registers or 256 registers. Xe3 now allows a thread to allocate up to 256 registers in 32 register increments.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hdWR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9020c-e55f-43d3-afbb-3ed8d5e110d6_625x378.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hdWR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9020c-e55f-43d3-afbb-3ed8d5e110d6_625x378.png 424w, https://substackcdn.com/image/fetch/$s_!hdWR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9020c-e55f-43d3-afbb-3ed8d5e110d6_625x378.png 848w, https://substackcdn.com/image/fetch/$s_!hdWR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9020c-e55f-43d3-afbb-3ed8d5e110d6_625x378.png 1272w, https://substackcdn.com/image/fetch/$s_!hdWR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9020c-e55f-43d3-afbb-3ed8d5e110d6_625x378.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hdWR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9020c-e55f-43d3-afbb-3ed8d5e110d6_625x378.png" width="625" height="378" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d0f9020c-e55f-43d3-afbb-3ed8d5e110d6_625x378.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:378,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hdWR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9020c-e55f-43d3-afbb-3ed8d5e110d6_625x378.png 424w, https://substackcdn.com/image/fetch/$s_!hdWR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9020c-e55f-43d3-afbb-3ed8d5e110d6_625x378.png 848w, https://substackcdn.com/image/fetch/$s_!hdWR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9020c-e55f-43d3-afbb-3ed8d5e110d6_625x378.png 1272w, https://substackcdn.com/image/fetch/$s_!hdWR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9020c-e55f-43d3-afbb-3ed8d5e110d6_625x378.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>So while Xe3 kept the same 64KB per XVE register file as Xe2, the register file can now be better utilized thanks to more granularity in register allocation.</p><p>All of this, along with ray tracing units now capable of dynamic ray management, adds up to an approximate 50% performance uplift for the 12 Xe3 core iGPU in comparison to Lunar Lake&#8217;s iGPU in 3D Mark Solar Bay, Cyberpunk 2077, and Borderlands 3 albeit at higher power.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cIV0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d48e6f-edee-43a7-a645-a82649d9c723_625x351.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cIV0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d48e6f-edee-43a7-a645-a82649d9c723_625x351.png 424w, https://substackcdn.com/image/fetch/$s_!cIV0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d48e6f-edee-43a7-a645-a82649d9c723_625x351.png 848w, https://substackcdn.com/image/fetch/$s_!cIV0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d48e6f-edee-43a7-a645-a82649d9c723_625x351.png 1272w, https://substackcdn.com/image/fetch/$s_!cIV0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d48e6f-edee-43a7-a645-a82649d9c723_625x351.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cIV0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d48e6f-edee-43a7-a645-a82649d9c723_625x351.png" width="625" height="351" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a3d48e6f-edee-43a7-a645-a82649d9c723_625x351.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:351,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cIV0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d48e6f-edee-43a7-a645-a82649d9c723_625x351.png 424w, https://substackcdn.com/image/fetch/$s_!cIV0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d48e6f-edee-43a7-a645-a82649d9c723_625x351.png 848w, https://substackcdn.com/image/fetch/$s_!cIV0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d48e6f-edee-43a7-a645-a82649d9c723_625x351.png 1272w, https://substackcdn.com/image/fetch/$s_!cIV0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d48e6f-edee-43a7-a645-a82649d9c723_625x351.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Cougar Cove and Darkmont</h2><p>Going to the CPU cores found in Panther Lake, and they are much more evolution rather than revolution. If you&#8217;d like a deeper dive into the CPU cores found in Panther Cove, <a href="https://chipsandcheese.com/p/interviewing-intels-chief-architect">we did an interview with the Chief Architect of x86 Cores at Intel, Stephen Robinson, at ITT</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rDNA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cbd64a1-8dd3-4db1-9476-19927b8352c8_625x356.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rDNA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cbd64a1-8dd3-4db1-9476-19927b8352c8_625x356.png 424w, https://substackcdn.com/image/fetch/$s_!rDNA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cbd64a1-8dd3-4db1-9476-19927b8352c8_625x356.png 848w, https://substackcdn.com/image/fetch/$s_!rDNA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cbd64a1-8dd3-4db1-9476-19927b8352c8_625x356.png 1272w, https://substackcdn.com/image/fetch/$s_!rDNA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cbd64a1-8dd3-4db1-9476-19927b8352c8_625x356.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rDNA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cbd64a1-8dd3-4db1-9476-19927b8352c8_625x356.png" width="625" height="356" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0cbd64a1-8dd3-4db1-9476-19927b8352c8_625x356.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:356,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rDNA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cbd64a1-8dd3-4db1-9476-19927b8352c8_625x356.png 424w, https://substackcdn.com/image/fetch/$s_!rDNA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cbd64a1-8dd3-4db1-9476-19927b8352c8_625x356.png 848w, https://substackcdn.com/image/fetch/$s_!rDNA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cbd64a1-8dd3-4db1-9476-19927b8352c8_625x356.png 1272w, https://substackcdn.com/image/fetch/$s_!rDNA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cbd64a1-8dd3-4db1-9476-19927b8352c8_625x356.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Starting with the P-Cores, codenamed Cougar Cove, Intel had increased the TLB capacity by 50% compared to Lion Cove along with improving the branch predictor by increasing some of the structure sizes in the BPU along with porting over some of the novel BPU algorithms that Intel tested in Lunar Lake.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Okdc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85c13ff3-3589-427c-a112-45cf10e7da37_625x351.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Okdc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85c13ff3-3589-427c-a112-45cf10e7da37_625x351.png 424w, https://substackcdn.com/image/fetch/$s_!Okdc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85c13ff3-3589-427c-a112-45cf10e7da37_625x351.png 848w, https://substackcdn.com/image/fetch/$s_!Okdc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85c13ff3-3589-427c-a112-45cf10e7da37_625x351.png 1272w, https://substackcdn.com/image/fetch/$s_!Okdc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85c13ff3-3589-427c-a112-45cf10e7da37_625x351.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Okdc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85c13ff3-3589-427c-a112-45cf10e7da37_625x351.png" width="625" height="351" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85c13ff3-3589-427c-a112-45cf10e7da37_625x351.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:351,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Okdc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85c13ff3-3589-427c-a112-45cf10e7da37_625x351.png 424w, https://substackcdn.com/image/fetch/$s_!Okdc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85c13ff3-3589-427c-a112-45cf10e7da37_625x351.png 848w, https://substackcdn.com/image/fetch/$s_!Okdc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85c13ff3-3589-427c-a112-45cf10e7da37_625x351.png 1272w, https://substackcdn.com/image/fetch/$s_!Okdc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85c13ff3-3589-427c-a112-45cf10e7da37_625x351.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Moving to the E-Core found in Panther Lake, codenamed Darkmont, the major improvements to Darkmont are also found in the BPU by increasing structure sizes, improving the prefetcher, and <a href="https://chipsandcheese.com/p/intel-details-skymont">improved nanocode behavior</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yMmP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b499a12-0324-412f-824a-8472c569f146_625x353.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yMmP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b499a12-0324-412f-824a-8472c569f146_625x353.png 424w, https://substackcdn.com/image/fetch/$s_!yMmP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b499a12-0324-412f-824a-8472c569f146_625x353.png 848w, https://substackcdn.com/image/fetch/$s_!yMmP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b499a12-0324-412f-824a-8472c569f146_625x353.png 1272w, https://substackcdn.com/image/fetch/$s_!yMmP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b499a12-0324-412f-824a-8472c569f146_625x353.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yMmP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b499a12-0324-412f-824a-8472c569f146_625x353.png" width="625" height="353" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4b499a12-0324-412f-824a-8472c569f146_625x353.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:353,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yMmP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b499a12-0324-412f-824a-8472c569f146_625x353.png 424w, https://substackcdn.com/image/fetch/$s_!yMmP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b499a12-0324-412f-824a-8472c569f146_625x353.png 848w, https://substackcdn.com/image/fetch/$s_!yMmP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b499a12-0324-412f-824a-8472c569f146_625x353.png 1272w, https://substackcdn.com/image/fetch/$s_!yMmP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b499a12-0324-412f-824a-8472c569f146_625x353.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Interestingly, both Cougar Cove and Darkmont have improved the memory disambiguation (Store to Load) behavior but they each use different techniques for the improved memory disambiguation behavior. This is a case of having 2 different teams trying out different things and coming to a similar solution with different implementations.</p><p>Putting all of the microarchitecture improvements and cache improvements together and Intel is claiming a 10% single threaded improvement compared to both Arrow Lake and Lunar Lake.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wOZf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd80bc071-1e2f-4f46-af88-d7addf120e44_625x353.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wOZf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd80bc071-1e2f-4f46-af88-d7addf120e44_625x353.png 424w, https://substackcdn.com/image/fetch/$s_!wOZf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd80bc071-1e2f-4f46-af88-d7addf120e44_625x353.png 848w, https://substackcdn.com/image/fetch/$s_!wOZf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd80bc071-1e2f-4f46-af88-d7addf120e44_625x353.png 1272w, https://substackcdn.com/image/fetch/$s_!wOZf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd80bc071-1e2f-4f46-af88-d7addf120e44_625x353.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wOZf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd80bc071-1e2f-4f46-af88-d7addf120e44_625x353.png" width="625" height="353" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d80bc071-1e2f-4f46-af88-d7addf120e44_625x353.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:353,&quot;width&quot;:625,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wOZf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd80bc071-1e2f-4f46-af88-d7addf120e44_625x353.png 424w, https://substackcdn.com/image/fetch/$s_!wOZf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd80bc071-1e2f-4f46-af88-d7addf120e44_625x353.png 848w, https://substackcdn.com/image/fetch/$s_!wOZf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd80bc071-1e2f-4f46-af88-d7addf120e44_625x353.png 1272w, https://substackcdn.com/image/fetch/$s_!wOZf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd80bc071-1e2f-4f46-af88-d7addf120e44_625x353.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Intel is also claiming a 40% performance per watt improvement over Arrow Lake.</p><p>Moving to multi-thread performance, Intel is claiming a 50% performance improvement over Lunar Lake and a 30% decrease in power consumption at similar multi-thread performance.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cw9X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c6602-784e-4ae6-a266-f945a82d7f46_624x348.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Cw9X!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c6602-784e-4ae6-a266-f945a82d7f46_624x348.png 424w, https://substackcdn.com/image/fetch/$s_!Cw9X!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c6602-784e-4ae6-a266-f945a82d7f46_624x348.png 848w, https://substackcdn.com/image/fetch/$s_!Cw9X!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c6602-784e-4ae6-a266-f945a82d7f46_624x348.png 1272w, https://substackcdn.com/image/fetch/$s_!Cw9X!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c6602-784e-4ae6-a266-f945a82d7f46_624x348.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Cw9X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c6602-784e-4ae6-a266-f945a82d7f46_624x348.png" width="624" height="348" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e74c6602-784e-4ae6-a266-f945a82d7f46_624x348.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:348,&quot;width&quot;:624,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Cw9X!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c6602-784e-4ae6-a266-f945a82d7f46_624x348.png 424w, https://substackcdn.com/image/fetch/$s_!Cw9X!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c6602-784e-4ae6-a266-f945a82d7f46_624x348.png 848w, https://substackcdn.com/image/fetch/$s_!Cw9X!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c6602-784e-4ae6-a266-f945a82d7f46_624x348.png 1272w, https://substackcdn.com/image/fetch/$s_!Cw9X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74c6602-784e-4ae6-a266-f945a82d7f46_624x348.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>NPU and Media Engine</h2><p>Panther Lake also comes with some improvements to other parts of the SoC such as the Media Engine now supporting AVC 10b decode and encode.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jOMZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c84d80c-73d0-4318-bd49-9346cecc58d1_624x356.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jOMZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c84d80c-73d0-4318-bd49-9346cecc58d1_624x356.png 424w, https://substackcdn.com/image/fetch/$s_!jOMZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c84d80c-73d0-4318-bd49-9346cecc58d1_624x356.png 848w, https://substackcdn.com/image/fetch/$s_!jOMZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c84d80c-73d0-4318-bd49-9346cecc58d1_624x356.png 1272w, https://substackcdn.com/image/fetch/$s_!jOMZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c84d80c-73d0-4318-bd49-9346cecc58d1_624x356.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jOMZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c84d80c-73d0-4318-bd49-9346cecc58d1_624x356.png" width="624" height="356" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3c84d80c-73d0-4318-bd49-9346cecc58d1_624x356.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:356,&quot;width&quot;:624,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jOMZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c84d80c-73d0-4318-bd49-9346cecc58d1_624x356.png 424w, https://substackcdn.com/image/fetch/$s_!jOMZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c84d80c-73d0-4318-bd49-9346cecc58d1_624x356.png 848w, https://substackcdn.com/image/fetch/$s_!jOMZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c84d80c-73d0-4318-bd49-9346cecc58d1_624x356.png 1272w, https://substackcdn.com/image/fetch/$s_!jOMZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c84d80c-73d0-4318-bd49-9346cecc58d1_624x356.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The NPU also got an upgrade with FP8 support along with shrinking the NPU by about 40% compared to the NPU found in Lunar Lake.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wg7s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032856b8-0b8c-4174-90fe-1a83de9525af_624x359.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wg7s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032856b8-0b8c-4174-90fe-1a83de9525af_624x359.png 424w, https://substackcdn.com/image/fetch/$s_!wg7s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032856b8-0b8c-4174-90fe-1a83de9525af_624x359.png 848w, https://substackcdn.com/image/fetch/$s_!wg7s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032856b8-0b8c-4174-90fe-1a83de9525af_624x359.png 1272w, https://substackcdn.com/image/fetch/$s_!wg7s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032856b8-0b8c-4174-90fe-1a83de9525af_624x359.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wg7s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032856b8-0b8c-4174-90fe-1a83de9525af_624x359.png" width="624" height="359" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/032856b8-0b8c-4174-90fe-1a83de9525af_624x359.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:359,&quot;width&quot;:624,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wg7s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032856b8-0b8c-4174-90fe-1a83de9525af_624x359.png 424w, https://substackcdn.com/image/fetch/$s_!wg7s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032856b8-0b8c-4174-90fe-1a83de9525af_624x359.png 848w, https://substackcdn.com/image/fetch/$s_!wg7s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032856b8-0b8c-4174-90fe-1a83de9525af_624x359.png 1272w, https://substackcdn.com/image/fetch/$s_!wg7s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F032856b8-0b8c-4174-90fe-1a83de9525af_624x359.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Final Words</h2><p>Intel&#8217;s upcoming Panther Lake series of CPUs is a merging of the Lunar Lake and Arrow Lake families of SoCs in terms of the SoC design. By moving the GPU on to its own tile, Intel can now start to fabricate Arc class iGPUs on Intel&#8217;s own nodes instead of solely fabbing them on TSMC&#8217;s nodes. Perhaps this could mean that Intel could fabricate future dGPUs on Intel&#8217;s own nodes.</p><p>As a bit of a sidenote, Intel announced that they would be announcing an inference focused GPU at OCP 2025. Now it is unknown what node this chip is fabricated on, but now we know it is based on the Xe3P microarchitecture, has up to 160GB of LPDDR5X, and will be sampling in the second half of 2026 which hopefully means that we could be hearing more about Celestial dGPUs for consumers by the end of next year.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wfDI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65951428-dba3-447e-984e-bae6712b4722_624x188.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wfDI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65951428-dba3-447e-984e-bae6712b4722_624x188.png 424w, https://substackcdn.com/image/fetch/$s_!wfDI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65951428-dba3-447e-984e-bae6712b4722_624x188.png 848w, https://substackcdn.com/image/fetch/$s_!wfDI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65951428-dba3-447e-984e-bae6712b4722_624x188.png 1272w, https://substackcdn.com/image/fetch/$s_!wfDI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65951428-dba3-447e-984e-bae6712b4722_624x188.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wfDI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65951428-dba3-447e-984e-bae6712b4722_624x188.png" width="624" height="188" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/65951428-dba3-447e-984e-bae6712b4722_624x188.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:188,&quot;width&quot;:624,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wfDI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65951428-dba3-447e-984e-bae6712b4722_624x188.png 424w, https://substackcdn.com/image/fetch/$s_!wfDI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65951428-dba3-447e-984e-bae6712b4722_624x188.png 848w, https://substackcdn.com/image/fetch/$s_!wfDI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65951428-dba3-447e-984e-bae6712b4722_624x188.png 1272w, https://substackcdn.com/image/fetch/$s_!wfDI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65951428-dba3-447e-984e-bae6712b4722_624x188.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Moving back to Panther Lake, the official launch of Panther Lake and all of the different SKUs will occur at CES 2026 with review samples hopefully in our hands by the end of January and units on shelves shortly after. The numbers that Intel put forward for Panther Lake&#8217;s performance are promising but personally my optimism for Panther Lake has been slightly dimmed by the lackluster graphs that Intel were showing for the improvements found in Panther Lake.</p><p>Regardless, I can&#8217;t wait to test a laptop powered by a 12 Xe3 core iGPU and see what performance it can put up compared to other large iGPU client devices.</p><p>If you like the content then consider heading over to the <a href="https://www.patreon.com/ChipsandCheese">Patreon</a> or <a href="https://www.paypal.com/donate/?hosted_button_id=4EMPH66SBGVSQ">PayPal</a> if you want to toss a few bucks to Chips and Cheese. Also consider joining the <a href="https://discord.gg/TwVnRhxgY2">Discord</a>.</p>]]></content:encoded></item><item><title><![CDATA[Interviewing Intel's Chief Architect of x86 Cores at Intel Tech Tour 2025]]></title><description><![CDATA[Hello you fine Internet folks,]]></description><link>https://chipsandcheese.com/p/interviewing-intels-chief-architect</link><guid isPermaLink="false">https://chipsandcheese.com/p/interviewing-intels-chief-architect</guid><dc:creator><![CDATA[George Cozma]]></dc:creator><pubDate>Thu, 09 Oct 2025 15:58:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/62oerVCYK-s" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello you fine Internet folks, </p><p>I was at Intel Tech Tour this year where Intel talked about their upcoming Panther Lake and Clearwater Forest CPUs. I had a chance to sit down with Stephen Robinson, Lead Architect for x86 Cores at Intel, and talk about lntel&#8217;s approach to Cougar Cove and Darkmont and what changed from the prior generations.</p><div id="youtube2-62oerVCYK-s" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;62oerVCYK-s&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/62oerVCYK-s?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Hope y&#8217;all enjoy!</p><p>The transcript has been edited for readability and conciseness.</p><p><strong>George Cozma</strong>: Hello you fine Internet folks! We&#8217;re here in Phoenix, Arizona at Intel Tech Tour 2025, where there&#8217;s been a number of disclosures about Panther Lake and Clearwater Forest. And joining me, to talk about the core architectures in both of those CPUs, is Stephen Robinson. What do you do at Intel?</p><p><strong>Stephen Robinson</strong>: I am a CPU architect and I lead the architecture team for the x86 cores.</p><p><strong>George Cozma</strong>: Awesome. So, diving straight in: so we did do an interview about- a recorded interview about Skymont. But going back to Skymont, what were the big changes from your previous architecture, Crestmont, moving into Skymont?</p><p><strong>Stephen Robinson</strong>: Yeah, so Skymont, we did a lot. We wanted to build a wider, deeper machine so that we could sort of run more workloads. So kind of, &#8220;coverage&#8221; is one of the terms we use sometimes. If we can get more workloads running on an E core, then we can bring more efficiency to the whole platform.</p><p>So, you know, sometimes people want- &#8220;Why are you adding IPC to an E core? You&#8217;re making it more expensive, right?&#8221; Well, actually, software runs better. So we made the out-of-order depth about 50% bigger, somewhere around that ballpark. We went from two load ports to three. We roughly doubled the vector hardware. So we had two FMAs in Crestmont. Now we have four FMAs in Skymont. And then the front end, we went from kind of a six-wide, two cluster, three decode front end to a nine wide, three cluster. And then eight wide alloc and, you know, more branch prediction, a little bit more L2 bandwidth, the whole lot.</p><p><strong>George Cozma</strong>: So sort of an interesting quirk that I noticed about Skymont is that it has four store ports and three load ports. Why the four store ports? Usually you see more load ports than store ports. Why more store ports in this case?</p><p><strong>Stephen Robinson</strong>: Yeah. So let&#8217;s, let&#8217;s break it down into address generation versus execution. So, when you have three load execution ports, you need three load address generators. And so that&#8217;s there. On the store side, we have four store address generation units. But we only sustain two stores into the data cache.</p><p>So we have a little bit of asymmetry on the store side. So you&#8217;re right. Why on earth do we have more store address units than store ports? The answer is, we have hazards between loads and stores. And sometimes loads get blocked on stores because we don&#8217;t know the store address because we&#8217;re all out of order. So by increasing the store address bandwidth, that reduces the latency to resolving unknown stores.</p><p>So basically we get performance by just spending more time and effort generating store addresses so that loads don&#8217;t end up blocking.</p><p><strong>George Cozma</strong>: Awesome. So I know in Darkmont, something that caught my eye was the memory disambiguation. And that usually has to deal with store to load forwarding. But that&#8217;s been a technique that&#8217;s been around for quite a while, so what have you enhanced in Darkmont that you would like to talk about?</p><p><strong>Stephen Robinson</strong>: Sure, yeah. So you&#8217;re right. It&#8217;s about store load connections. And so in... there&#8217;s several different ways people do it. You can have a big, big table that tells you when it&#8217;s safe to ignore stores. You can have a small table that tells you when it&#8217;s unsafe to ignore stores.</p><p>So you can kind of do it either way. And those two techniques end up kind of collapsing to the same answer, because the big table saturates and everything&#8217;s safe. And then so, you know, you discover the hazard. What we&#8217;ve done here is we&#8217;ve spent a little bit more time trying to have hardware that isn&#8217;t just sort of a history table, that actually figures out, before address generation, whether things are going to be connected.</p><p>So when we bring uops to the memory subsystem to address generation, we kind of look at some information and say, &#8220;Oh, I&#8217;m fairly confident that these loads and stores are connected.&#8221; So I&#8217;m not using a table, I&#8217;m using sort of the inherent information about the instructions, whether I think they&#8217;re going to be connected. And that gives us the ability to kind of slow down the load or something like that so that, you know, we know when the store&#8217;s gone now it should be safer to do the load that we think might be connected.</p><p><strong>George Cozma</strong>: Interesting. Now, I know on Cougar Cove, which is the P core in Panther Lake, that you also commented on the memory disambiguation. Is that similar in Cougar Cove?</p><p><strong>Stephen Robinson</strong>: It&#8217;s similar in kind of concept, but it&#8217;s a bit different in implementation. So it&#8217;s a different method, but in the end, we&#8217;re still getting to where we&#8217;re trying to figure out when&#8217;s a good time to schedule the load relative to the store that we&#8217;re going to be dependent on. So, you know, I tell a story of two tables. Well, this is kind of another table. And again, we&#8217;re trying to say, &#8220;Okay, now I think it&#8217;s time to do the load because I think it&#8217;s going to be connected to the store.&#8221; So similar concept, different implementation.</p><p><strong>George Cozma</strong>: Okay. And I guess sort of what drives the two different implementations? Like the reasoning behind the two different implementations, I should say.</p><p><strong>Stephen Robinson</strong>: I would say it&#8217;s as simple as two teams working in parallel, doing independent research, solving localized problems, coming up with solutions. And then we end up with two similar but different implementations.</p><p><strong>George Cozma</strong>: Okay.</p><p><strong>Stephen Robinson</strong>: Two teams.</p><p><strong>George Cozma</strong>: Cool. So sort of talking about Cougar Cove, a key change made in Lion Cove was the lack of SMT. SMT in Lunar Lake and Arrow Lake is no longer there. Why wasn&#8217;t it -- could you have re-added it to Cougar Cove if you had wished? And why haven&#8217;t you added it back? Like what would be the reason why you wouldn&#8217;t?</p><p><strong>Stephen Robinson</strong>: Yeah. So let&#8217;s talk about client first, right, where this is where we&#8217;ve shipped products without SMT. When you have hybrid compute, SMT isn&#8217;t necessarily as valuable, right? So when you schedule something, you know, if you want performance, you schedule it on a P core. And then you schedule it on an E core. And then, once you&#8217;ve exhausted those, then you would come back and schedule a thread.</p><p>So in Alder Lake, Raptor Lake, that&#8217;s kind of how it works. So those are the threads on top of the dessert, right? In Lion Cove, in Lunar Lake and Arrow Lake, you know, we removed threads. We didn&#8217;t have threads implemented. Let me say it that way. And so that gave us -- we didn&#8217;t lose a lot in client because of hybrid and the core count. But we gained a bit in our design execution, so a little bit lower power because you don&#8217;t have the transistors and the logic to support SMT. A little bit smaller area because -- same reason. And it&#8217;s a little bit easier to achieve your frequency target. Because, you know, the old joke that SMT is a bit in the mux [multiplexer], right? So there&#8217;s truth to that. There&#8217;s a mux somewhere. And that causes delay. So now you&#8217;ve kind of got something that&#8217;s maybe a little bit easier and less expensive and maybe can go a little bit faster.</p><p>So when you&#8217;re doing Cougar Cove, you just take those basic premises and say, yeah, this is what I&#8217;m going to do for the next gen as well.</p><p><strong>George Cozma</strong>: And so on server, I know that there have been some data points that suggest that SMT does help. So what is sort of your opinion there?</p><p><strong>Stephen Robinson</strong>: Yeah. So server is a little bit different than client. You know, people have talked about doing hybrid compute in servers. But nobody does it. And the simple explanation is if you want to be hybrid in servers, you do it at the rack-level, not inside an SOC. Why would I want asymmetry inside my SOC when I can have asymmetry, you know, a 200 core server, another 200 core server, and I&#8217;ve got a bunch of those. So you have the choice. You know, Amazon and others, they have different instances that you can go and you can get. So what&#8217;s the value of different instances within one?</p><p>So first, there&#8217;s no hybrid in servers today in general. The second thing is, you know, kind of the story I told about you&#8217;d schedule on P cores and E cores and come back with threads. Well, if you don&#8217;t have the E cores, then you&#8217;re going to go on threads. Server workloads and gaming workloads and others, right, they miss a lot. They can have long latency. And so when you miss and you have long latency, you&#8217;ve got available hardware. So in the server area, threads, there are more workloads that like it. You know, take a networking workload. Those usually like threads because they&#8217;re moving a lot of data around and they&#8217;re exposing those latencies. So the server workloads are a bit different. And without hybrid, then SMT has more value.</p><p><strong>George Cozma</strong>: And so actually speaking on difference between client and server, so Darkmont is both used in Panther Lake and in Clearwater Forest. What sort of differences do you have to make in a core between server and client in terms of stuff like RAS features? So what differences are there in terms of implementation and what you have to design.</p><p><strong>Stephen Robinson</strong>: Yeah. Great question. So in the client space, you can have RAS features, but they don&#8217;t quite have as much value because the client system is different, right? If I have hundreds of cores or thousands of cores, reliability becomes very, very important.</p><p>If I&#8217;m on my own little laptop and I have fewer, it&#8217;s a different concern, right? When Google Cloud goes down, everyone&#8217;s very upset.</p><p><strong>George Cozma</strong>: Everything goes down.</p><p><strong>Stephen Robinson</strong>: That&#8217;s right. Everyone&#8217;s very upset. So clearly, you know, the bar of the reliability is there. So in a core, if we want a target server, there are additional features we&#8217;ll do. You know, ECC in the caches... so inside the core, we do add features. We can put that core in both if we want, right? So there aren&#8217;t a lot of physical differences between the cores in the two, but the environment is very different. So on the server side, maybe we have power gates per core, maybe we don&#8217;t. The power delivery is different. Because the power delivery is different, you may change the decision on when to power gate and when not to. And the power level is different. So maybe power gating isn&#8217;t as important in server because, you know, 24/7, I&#8217;m always running.</p><p>The other thing is there are things that can only really truly work at the SOC level because you need SOC components to be part of that. You know, take technology like SGX or TDX, you know, security. You know, secure computing elements. If you don&#8217;t have the security and the controllers in your client part, then even if you implemented it inside the core, it doesn&#8217;t matter because you need that whole system to do it. So there&#8217;s a lot of things that maybe it&#8217;s in the core, but you can really only test it and run it and productize it with the complete stack.</p><p><strong>George Cozma</strong>: And speaking of sort of the differences between client and server, I know in Lunar Lake, you talked a lot about how there was some novel branch prediction stuff going on. Do you see that being helpful in server workloads or was that -- were those sort of improvements more targeted towards client?</p><p><strong>Stephen Robinson</strong>: Everyone wants branch prediction. Honestly, everyone does. So in client, you know, it&#8217;s funny. Games. Are games similar to web servers?</p><p><strong>George Cozma</strong>: Not really.</p><p><strong>Stephen Robinson</strong>: Not really, right. But in terms of code footprints and paths and sizes, they&#8217;re more similar than you realize. Same kind of thing for databases. Databases are very large binaries.</p><p><strong>George Cozma</strong>: Databases are actually close -- are very similar to games in their sort of what they like in a core.</p><p><strong>Stephen Robinson</strong>: Exactly, right. So honestly, when it comes to branch prediction, we do it for everybody, right? We do it for client, we do it for server. And the things we do will be workload-specific sometimes in where you get the gains. But there&#8217;s always a workload in both client and server that will appreciate what you did.</p><p><strong>George Cozma</strong>: So sort of evolving on that, is it such that potentially you could make a branch predictor that is more targeted for server workloads and/or client workloads? Or is it such that there isn&#8217;t really a difference there, so to speak?</p><p><strong>Stephen Robinson</strong>: I would say that I think the -- internally, within Intel, we tend to think that server wants more branch prediction, larger capacity, right? Bigger paths. Because we know that the workloads are complex in -- and large binaries in server. But it really is in client as well, right? You know, just -- which workloads are you working at, right?</p><p><strong>George Cozma</strong>: Exactly.</p><p><strong>Stephen Robinson</strong>: You know. SPEC, okay, that&#8217;s different, obviously, right? But again, games and databases, yeah, they&#8217;re --</p><p>George: I would argue games and databases are closer to each other than SPEC is to either, in most cases.</p><p><strong>Stephen Robinson</strong>: That can be true.</p><p><strong>George Cozma</strong>: But of course, my final question here is, what&#8217;s your favorite type of cheese?</p><p><strong>Stephen Robinson</strong>: Oof. I like a good smoked Gouda. But honestly, we&#8217;re doing blue cheese, Roquefort type things these days. Because, you know, a little musky.</p><p><strong>George Cozma</strong>: I will admit, blue cheese is not my favorite. I had a really good cheddar from Washington. And, yeah, that was actually really good. It was a smoked cheddar. Which I&#8217;m not usually the biggest fan of.</p><p><strong>Stephen Robinson</strong>: I do like the smoked cheeses. I really do.</p><p><strong>George Cozma</strong>: Well, thank you so much.</p><p><strong>Stephen Robinson</strong>: Of course.</p><p><strong>George Cozma</strong>: So thank you so much for watching. If you like interviews like this, hit like, hit subscribe. Unfortunately I do have to say all that because it does help with the algorithm. And go check out the Substack where there will be a written transcript of this up there. And, well, if you want to donate, PayPal and Patreon are down below. And have a good one, folks!</p>]]></content:encoded></item><item><title><![CDATA[AMD’s EPYC 9355P: Inside a 32 Core Zen 5 Server Chip]]></title><description><![CDATA[High core count chips are headline grabbers.]]></description><link>https://chipsandcheese.com/p/amds-epyc-9355p-inside-a-32-core</link><guid isPermaLink="false">https://chipsandcheese.com/p/amds-epyc-9355p-inside-a-32-core</guid><dc:creator><![CDATA[Chester Lam]]></dc:creator><pubDate>Tue, 30 Sep 2025 18:27:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!S2Gx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c0cf47-bc97-4634-8c86-82d0d5c2f829_452x603.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>High core count chips are headline grabbers. But maxing out the metaphorical core count slider isn&#8217;t the only way to go. Server players like Intel, AMD, and Arm aim for scalable designs that cover a range of core counts. Not all applications can take advantage of the highest core count models in their lineups, and per-core performance still matters.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!icLd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52368ac-879d-4088-8e22-ab20f0137f82_975x583.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!icLd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52368ac-879d-4088-8e22-ab20f0137f82_975x583.png 424w, https://substackcdn.com/image/fetch/$s_!icLd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52368ac-879d-4088-8e22-ab20f0137f82_975x583.png 848w, https://substackcdn.com/image/fetch/$s_!icLd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52368ac-879d-4088-8e22-ab20f0137f82_975x583.png 1272w, https://substackcdn.com/image/fetch/$s_!icLd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52368ac-879d-4088-8e22-ab20f0137f82_975x583.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!icLd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52368ac-879d-4088-8e22-ab20f0137f82_975x583.png" width="975" height="583" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b52368ac-879d-4088-8e22-ab20f0137f82_975x583.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:583,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:410106,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52368ac-879d-4088-8e22-ab20f0137f82_975x583.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!icLd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52368ac-879d-4088-8e22-ab20f0137f82_975x583.png 424w, https://substackcdn.com/image/fetch/$s_!icLd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52368ac-879d-4088-8e22-ab20f0137f82_975x583.png 848w, https://substackcdn.com/image/fetch/$s_!icLd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52368ac-879d-4088-8e22-ab20f0137f82_975x583.png 1272w, https://substackcdn.com/image/fetch/$s_!icLd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb52368ac-879d-4088-8e22-ab20f0137f82_975x583.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">From AMD&#8217;s Turin whitepaper</figcaption></figure></div><p>AMD&#8217;s EPYC 9355P is a 32 core part. But rather than just being a lower core count part, the 9355P pulls levers to let each core count for more. First, it clocks up to 4.4 GHz. AMD has faster clocking chips in its server lineup, but 4.4 GHz is still a good bit higher than the 3.7 or 4.1 GHz that <a href="https://www.amd.com/en/products/processors/server/epyc/9005-series.html#specifications">128 or 192 core Zen 5 SKUs reach</a>. Then, AMD uses eight CPU dies (CCDs) to house those 32 cores. Each CCD only has four cores enabled out of the eight physically present, but still has its full 32 MB of L3 cache usable. That provides a high cache capacity to core count ratio. Finally, each CCD connects to the IO die using a &#8220;GMI-Wide&#8221; setup, giving each CCD 64B/cycle of bandwidth to the rest of the system in both the read and write directions. GMI here stands for Global Memory Interconnect. Zen 5&#8217;s server IO die has 16 GMI links to support up to 16 CCDs for high core count parts, plus some xGMI (external) links to allow a dual socket setup. GMI-Wide uses two links per CCD, fully utilizing the IO die&#8217;s GMI links even though the EPYC 9355P only has eight CCDs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xcV9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ddd15e-c1e8-4f84-95f8-56d9181f7a4d_975x527.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xcV9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ddd15e-c1e8-4f84-95f8-56d9181f7a4d_975x527.png 424w, https://substackcdn.com/image/fetch/$s_!xcV9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ddd15e-c1e8-4f84-95f8-56d9181f7a4d_975x527.png 848w, https://substackcdn.com/image/fetch/$s_!xcV9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ddd15e-c1e8-4f84-95f8-56d9181f7a4d_975x527.png 1272w, https://substackcdn.com/image/fetch/$s_!xcV9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ddd15e-c1e8-4f84-95f8-56d9181f7a4d_975x527.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xcV9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ddd15e-c1e8-4f84-95f8-56d9181f7a4d_975x527.png" width="975" height="527" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/89ddd15e-c1e8-4f84-95f8-56d9181f7a4d_975x527.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:527,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:180582,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ddd15e-c1e8-4f84-95f8-56d9181f7a4d_975x527.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xcV9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ddd15e-c1e8-4f84-95f8-56d9181f7a4d_975x527.png 424w, https://substackcdn.com/image/fetch/$s_!xcV9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ddd15e-c1e8-4f84-95f8-56d9181f7a4d_975x527.png 848w, https://substackcdn.com/image/fetch/$s_!xcV9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ddd15e-c1e8-4f84-95f8-56d9181f7a4d_975x527.png 1272w, https://substackcdn.com/image/fetch/$s_!xcV9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ddd15e-c1e8-4f84-95f8-56d9181f7a4d_975x527.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Acknowledgments</h1><p><a href="https://www.dell.com/en-us/shop/dell-poweredge-servers/new-poweredge-r6715-rack-server/spd/poweredge-r6715/pe_r6715_tm_vi_vp_sb">Dell has kindly provided a PowerEdge R6715 for testing</a>, and it came equipped with the aforementioned AMD EPYC 9355P along with 768 GB of DDR5-5200. The 12 memory controllers on the IO die provide a 768-bit memory bus, so the setup provides just under 500 GB/s of theoretical bandwidth. Besides providing a look into how a lower core count SKU behaves, we have BMC access which provides an opportunity to investigate different NUMA setups.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!S2Gx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c0cf47-bc97-4634-8c86-82d0d5c2f829_452x603.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!S2Gx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c0cf47-bc97-4634-8c86-82d0d5c2f829_452x603.png 424w, https://substackcdn.com/image/fetch/$s_!S2Gx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c0cf47-bc97-4634-8c86-82d0d5c2f829_452x603.png 848w, https://substackcdn.com/image/fetch/$s_!S2Gx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c0cf47-bc97-4634-8c86-82d0d5c2f829_452x603.png 1272w, https://substackcdn.com/image/fetch/$s_!S2Gx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c0cf47-bc97-4634-8c86-82d0d5c2f829_452x603.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!S2Gx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c0cf47-bc97-4634-8c86-82d0d5c2f829_452x603.png" width="452" height="603" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/46c0cf47-bc97-4634-8c86-82d0d5c2f829_452x603.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:603,&quot;width&quot;:452,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:604085,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c0cf47-bc97-4634-8c86-82d0d5c2f829_452x603.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!S2Gx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c0cf47-bc97-4634-8c86-82d0d5c2f829_452x603.png 424w, https://substackcdn.com/image/fetch/$s_!S2Gx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c0cf47-bc97-4634-8c86-82d0d5c2f829_452x603.png 848w, https://substackcdn.com/image/fetch/$s_!S2Gx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c0cf47-bc97-4634-8c86-82d0d5c2f829_452x603.png 1272w, https://substackcdn.com/image/fetch/$s_!S2Gx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c0cf47-bc97-4634-8c86-82d0d5c2f829_452x603.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The provided Dell PowerEdge R6715 system racked and ready for testing. Credit for the image goes to Zach from ZeroOne.</figcaption></figure></div><p>We&#8217;d also like to thank Zack and the rest of the fine folks at ZeroOne Technology for hosting the Dell PowerEdge R6715 at no cost to us.</p><h1>Memory Subsystem and NUMA Characteristics</h1><p>NPS1 mode stripes memory accesses across all 12 of the chip&#8217;s memory controllers, presenting software with a monolithic view of memory at the cost of latency. DRAM latency in that mode is slightly better than what Intel&#8217;s Xeon 6 achieves in SNC3 mode. SNC3 on Intel divides the chip into three NUMA nodes that correspond to its compute dies. The EPYC 9355P has good memory latency in that respect, but it falls behind compared to the Ryzen 9 9900X with DDR5-5600. Interconnects that tie more agents together tend to have higher latency. On AMD&#8217;s server platform, the Infinity Fabric network within the IO die has to connect up to 16 CCDs with 12 memory controllers and other IO, so the higher latency isn&#8217;t surprising.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Mbb3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51fed7c-b5d2-4b4c-8a43-4a55ce28914d_975x427.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Mbb3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51fed7c-b5d2-4b4c-8a43-4a55ce28914d_975x427.png 424w, https://substackcdn.com/image/fetch/$s_!Mbb3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51fed7c-b5d2-4b4c-8a43-4a55ce28914d_975x427.png 848w, https://substackcdn.com/image/fetch/$s_!Mbb3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51fed7c-b5d2-4b4c-8a43-4a55ce28914d_975x427.png 1272w, https://substackcdn.com/image/fetch/$s_!Mbb3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51fed7c-b5d2-4b4c-8a43-4a55ce28914d_975x427.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Mbb3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51fed7c-b5d2-4b4c-8a43-4a55ce28914d_975x427.png" width="975" height="427" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f51fed7c-b5d2-4b4c-8a43-4a55ce28914d_975x427.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:427,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:72295,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51fed7c-b5d2-4b4c-8a43-4a55ce28914d_975x427.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Mbb3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51fed7c-b5d2-4b4c-8a43-4a55ce28914d_975x427.png 424w, https://substackcdn.com/image/fetch/$s_!Mbb3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51fed7c-b5d2-4b4c-8a43-4a55ce28914d_975x427.png 848w, https://substackcdn.com/image/fetch/$s_!Mbb3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51fed7c-b5d2-4b4c-8a43-4a55ce28914d_975x427.png 1272w, https://substackcdn.com/image/fetch/$s_!Mbb3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51fed7c-b5d2-4b4c-8a43-4a55ce28914d_975x427.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Cache performance is similar across AMD&#8217;s desktop and server Zen 5 implementations, with the server variant only losing because of lower clock speeds. That&#8217;s not a surprise because AMD reuses the same CCDs on desktop and server products. But it does create a contrast to Intel&#8217;s approach, where client and server memory subsystems differ starting at L3. Intel trades L3 latency for capacity and the ability to a logical L3 instance across more cores.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QONl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadf03a30-37d4-4d15-9c01-690fe0305f95_975x227.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QONl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadf03a30-37d4-4d15-9c01-690fe0305f95_975x227.png 424w, https://substackcdn.com/image/fetch/$s_!QONl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadf03a30-37d4-4d15-9c01-690fe0305f95_975x227.png 848w, https://substackcdn.com/image/fetch/$s_!QONl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadf03a30-37d4-4d15-9c01-690fe0305f95_975x227.png 1272w, https://substackcdn.com/image/fetch/$s_!QONl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadf03a30-37d4-4d15-9c01-690fe0305f95_975x227.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QONl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadf03a30-37d4-4d15-9c01-690fe0305f95_975x227.png" width="975" height="227" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/adf03a30-37d4-4d15-9c01-690fe0305f95_975x227.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:227,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:47683,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadf03a30-37d4-4d15-9c01-690fe0305f95_975x227.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QONl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadf03a30-37d4-4d15-9c01-690fe0305f95_975x227.png 424w, https://substackcdn.com/image/fetch/$s_!QONl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadf03a30-37d4-4d15-9c01-690fe0305f95_975x227.png 848w, https://substackcdn.com/image/fetch/$s_!QONl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadf03a30-37d4-4d15-9c01-690fe0305f95_975x227.png 1272w, https://substackcdn.com/image/fetch/$s_!QONl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadf03a30-37d4-4d15-9c01-690fe0305f95_975x227.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Different NUMA configurations can subdivide EPYC 9355P, associating cores with the closest memory controllers to improve latency. NPS2 divides the chip into two hemispheres, and has 16 cores form a NUMA node with the six memory controllers on one half of the die. NPS4 divides the chip into quadrants, each with two CCDs and three memory controllers. Finally, the chip can present each CCD as a NUMA node. Doing so makes it easier to pin threads to cores that share a L3 cache, but doesn&#8217;t affect how memory is interleaved across channels. Memory addresses are still assigned to memory controllers according to the selected NPS1/2/4 scheme, which is a separate setting.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3suW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcb62202-f156-4446-82db-d117584de0c2_975x327.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3suW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcb62202-f156-4446-82db-d117584de0c2_975x327.png 424w, https://substackcdn.com/image/fetch/$s_!3suW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcb62202-f156-4446-82db-d117584de0c2_975x327.png 848w, https://substackcdn.com/image/fetch/$s_!3suW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcb62202-f156-4446-82db-d117584de0c2_975x327.png 1272w, https://substackcdn.com/image/fetch/$s_!3suW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcb62202-f156-4446-82db-d117584de0c2_975x327.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3suW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcb62202-f156-4446-82db-d117584de0c2_975x327.png" width="975" height="327" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dcb62202-f156-4446-82db-d117584de0c2_975x327.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:327,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:162136,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcb62202-f156-4446-82db-d117584de0c2_975x327.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3suW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcb62202-f156-4446-82db-d117584de0c2_975x327.png 424w, https://substackcdn.com/image/fetch/$s_!3suW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcb62202-f156-4446-82db-d117584de0c2_975x327.png 848w, https://substackcdn.com/image/fetch/$s_!3suW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcb62202-f156-4446-82db-d117584de0c2_975x327.png 1272w, https://substackcdn.com/image/fetch/$s_!3suW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcb62202-f156-4446-82db-d117584de0c2_975x327.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>NPS2 and NPS4 only provide marginal latency improvements, and latency remains much higher than in a desktop platform. At the same time, crossing NUMA boundaries comes with little penalty. Apparently requests can traverse the huge IO die quite quickly, adding 20-30 ns at worst. I&#8217;m not sure what the underlying Infinity Fabric topology looks like, but the worst case unloaded cross-node latencies were under 140 ns. On Xeon 6, latency starts higher and can climb over 180 ns when cores on one compute die access memory controllers on the compute die at the other end of the chip.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3AET!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cb07f7e-9a74-4586-b2cf-793668b357d3_975x469.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3AET!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cb07f7e-9a74-4586-b2cf-793668b357d3_975x469.png 424w, https://substackcdn.com/image/fetch/$s_!3AET!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cb07f7e-9a74-4586-b2cf-793668b357d3_975x469.png 848w, https://substackcdn.com/image/fetch/$s_!3AET!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cb07f7e-9a74-4586-b2cf-793668b357d3_975x469.png 1272w, https://substackcdn.com/image/fetch/$s_!3AET!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cb07f7e-9a74-4586-b2cf-793668b357d3_975x469.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3AET!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cb07f7e-9a74-4586-b2cf-793668b357d3_975x469.png" width="975" height="469" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3cb07f7e-9a74-4586-b2cf-793668b357d3_975x469.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:469,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:121481,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cb07f7e-9a74-4586-b2cf-793668b357d3_975x469.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3AET!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cb07f7e-9a74-4586-b2cf-793668b357d3_975x469.png 424w, https://substackcdn.com/image/fetch/$s_!3AET!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cb07f7e-9a74-4586-b2cf-793668b357d3_975x469.png 848w, https://substackcdn.com/image/fetch/$s_!3AET!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cb07f7e-9a74-4586-b2cf-793668b357d3_975x469.png 1272w, https://substackcdn.com/image/fetch/$s_!3AET!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cb07f7e-9a74-4586-b2cf-793668b357d3_975x469.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>EPYC 9355P can get close to theoretical memory bandwidth in any of the three NUMA nodes, as long as code keeps accesses within each node. NPS2 and NPS4 offer slightly better bandwidth, at the cost of requiring code to be NUMA aware. I tried to cause congestion on Infinity Fabric by having cores on each NUMA node access memory on another. That does lower achieved bandwidth, but not by a huge amount.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZwBS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb124a60-3a1c-4557-8a24-e9da65562edd_975x288.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZwBS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb124a60-3a1c-4557-8a24-e9da65562edd_975x288.png 424w, https://substackcdn.com/image/fetch/$s_!ZwBS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb124a60-3a1c-4557-8a24-e9da65562edd_975x288.png 848w, https://substackcdn.com/image/fetch/$s_!ZwBS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb124a60-3a1c-4557-8a24-e9da65562edd_975x288.png 1272w, https://substackcdn.com/image/fetch/$s_!ZwBS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb124a60-3a1c-4557-8a24-e9da65562edd_975x288.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZwBS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb124a60-3a1c-4557-8a24-e9da65562edd_975x288.png" width="975" height="288" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb124a60-3a1c-4557-8a24-e9da65562edd_975x288.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:288,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:102088,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb124a60-3a1c-4557-8a24-e9da65562edd_975x288.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZwBS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb124a60-3a1c-4557-8a24-e9da65562edd_975x288.png 424w, https://substackcdn.com/image/fetch/$s_!ZwBS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb124a60-3a1c-4557-8a24-e9da65562edd_975x288.png 848w, https://substackcdn.com/image/fetch/$s_!ZwBS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb124a60-3a1c-4557-8a24-e9da65562edd_975x288.png 1272w, https://substackcdn.com/image/fetch/$s_!ZwBS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb124a60-3a1c-4557-8a24-e9da65562edd_975x288.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">By C0M1, I mean cores on node 0 accessing memory on node 1</figcaption></figure></div><p>An individual NPS4 node achieves 117.33 GB/s to its local memory pool, and just over 107 GB/s to the memory on the other three nodes. The bandwidth penalty is minor, but a bigger potential pitfall is lower bandwidth to each NUMA node&#8217;s memory pool. Two CCDs can draw more bandwidth than the three memory controllers they&#8217;re associated with. Manually distributing memory accesses across NUMA nodes can improve bandwidth for a workload contained within one NUMA node&#8217;s cores. But doing so in practice may be an intricate exercise.</p><p>In general, EPYC 9355P has very mild NUMA characteristics and little penalty associated with running the chip in NPS1 or NPS2 mode. I imagine just using NPS1 mode would work well enough in the vast majority of cases, with little performance to be gained from carrying out NUMA optimizations.</p><h1>Looking into GMI-Wide</h1><p>GMI-Wide is AMD&#8217;s attempt to address bandwidth pinch points between CCDs and the rest of the system. With GMI-Wide, a single CCD can achieve 99.8 GB/s of read bandwidth, significantly more than the 62.5 GB/s from a Ryzen 9 9900X CCD with GMI-Narrow. GMI-Wide also allows better latency control under high bandwidth load. The Ryzen 9 9900X suffers from a corner case where a single core pulling maximum bandwidth can saturate the GMI-Narrow link and starve out another latency sensitive thread. That sends latency to nearly 500 ns, as observed by a latency test thread sharing a CCD with a thread linearly traversing an array. Having more threads generate bandwidth load seems to make QoS mechanisms kick in, which slightly reduces bandwidth throughput but brings latency back under control.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KTCF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a88e138-9963-4f94-99e7-64854c656bc8_975x529.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KTCF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a88e138-9963-4f94-99e7-64854c656bc8_975x529.png 424w, https://substackcdn.com/image/fetch/$s_!KTCF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a88e138-9963-4f94-99e7-64854c656bc8_975x529.png 848w, https://substackcdn.com/image/fetch/$s_!KTCF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a88e138-9963-4f94-99e7-64854c656bc8_975x529.png 1272w, https://substackcdn.com/image/fetch/$s_!KTCF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a88e138-9963-4f94-99e7-64854c656bc8_975x529.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KTCF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a88e138-9963-4f94-99e7-64854c656bc8_975x529.png" width="975" height="529" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1a88e138-9963-4f94-99e7-64854c656bc8_975x529.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:529,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:95917,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a88e138-9963-4f94-99e7-64854c656bc8_975x529.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KTCF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a88e138-9963-4f94-99e7-64854c656bc8_975x529.png 424w, https://substackcdn.com/image/fetch/$s_!KTCF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a88e138-9963-4f94-99e7-64854c656bc8_975x529.png 848w, https://substackcdn.com/image/fetch/$s_!KTCF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a88e138-9963-4f94-99e7-64854c656bc8_975x529.png 1272w, https://substackcdn.com/image/fetch/$s_!KTCF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a88e138-9963-4f94-99e7-64854c656bc8_975x529.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I previously wrote about loaded memory latency on the Ryzen 9 9950X when testing the system remotely, and <a href="https://chipsandcheese.com/p/pushing-amds-infinity-fabric-to-its">thought it controlled latency well under high bandwidth load</a>. But back then, George (Cheese) set that system up with very fast DDR5-8000 along with a higher 2.2 GHz FCLK. A single core was likely unable to monopolize off-CCD bandwidth in that setup, avoiding the corner case seen on my system. GMI-Wide increases off-CCD bandwidth by a much larger extent and has a similar effect. Under increasing bandwidth load, GMI-WIde can both achieve more total bandwidth and control latency better than its desktop single-link counterpart.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uZWd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff08156e0-49a7-4913-a66a-b104c4f042f5_975x538.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uZWd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff08156e0-49a7-4913-a66a-b104c4f042f5_975x538.png 424w, https://substackcdn.com/image/fetch/$s_!uZWd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff08156e0-49a7-4913-a66a-b104c4f042f5_975x538.png 848w, https://substackcdn.com/image/fetch/$s_!uZWd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff08156e0-49a7-4913-a66a-b104c4f042f5_975x538.png 1272w, https://substackcdn.com/image/fetch/$s_!uZWd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff08156e0-49a7-4913-a66a-b104c4f042f5_975x538.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uZWd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff08156e0-49a7-4913-a66a-b104c4f042f5_975x538.png" width="975" height="538" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f08156e0-49a7-4913-a66a-b104c4f042f5_975x538.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:538,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:102239,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff08156e0-49a7-4913-a66a-b104c4f042f5_975x538.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uZWd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff08156e0-49a7-4913-a66a-b104c4f042f5_975x538.png 424w, https://substackcdn.com/image/fetch/$s_!uZWd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff08156e0-49a7-4913-a66a-b104c4f042f5_975x538.png 848w, https://substackcdn.com/image/fetch/$s_!uZWd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff08156e0-49a7-4913-a66a-b104c4f042f5_975x538.png 1272w, https://substackcdn.com/image/fetch/$s_!uZWd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff08156e0-49a7-4913-a66a-b104c4f042f5_975x538.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A read-modify-write pattern gets maximum bandwidth from GMI-Wide by exercising both the read and write paths. It doesn&#8217;t scale perfectly, but it&#8217;s a substantial improvement over using only reads or writes. A Ryzen 9 9900X CCD can theoretically get 48B/cycle to the IO die with a 2:1 read-to-write ratio. I tried modifying every other cacheline to achieve this ratio, but didn&#8217;t get better bandwidth probably because the memory controller is limited by a 32B/cycle link to Infinity Fabric. However, mixing in writes does get rid of the single bandwidth thread corner case, possibly because a single thread doesn&#8217;t saturate the 32B/cycle read link when mixing reads and writes.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!388G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48763b64-244f-483b-a126-e9c079e52ee5_975x546.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!388G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48763b64-244f-483b-a126-e9c079e52ee5_975x546.png 424w, https://substackcdn.com/image/fetch/$s_!388G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48763b64-244f-483b-a126-e9c079e52ee5_975x546.png 848w, https://substackcdn.com/image/fetch/$s_!388G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48763b64-244f-483b-a126-e9c079e52ee5_975x546.png 1272w, https://substackcdn.com/image/fetch/$s_!388G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48763b64-244f-483b-a126-e9c079e52ee5_975x546.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!388G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48763b64-244f-483b-a126-e9c079e52ee5_975x546.png" width="975" height="546" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48763b64-244f-483b-a126-e9c079e52ee5_975x546.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:546,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:189525,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48763b64-244f-483b-a126-e9c079e52ee5_975x546.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!388G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48763b64-244f-483b-a126-e9c079e52ee5_975x546.png 424w, https://substackcdn.com/image/fetch/$s_!388G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48763b64-244f-483b-a126-e9c079e52ee5_975x546.png 848w, https://substackcdn.com/image/fetch/$s_!388G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48763b64-244f-483b-a126-e9c079e52ee5_975x546.png 1272w, https://substackcdn.com/image/fetch/$s_!388G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48763b64-244f-483b-a126-e9c079e52ee5_975x546.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">AMD&#8217;s slide shown for Zen 4 Desktop. Zen 5 Desktop uses the same IO die, and thus has the same Infinity Fabric setup</figcaption></figure></div><p>On the desktop platform, latency under high load gets worse possibly because writes contend with reads at the DRAM controller. The DDR bus is unidirectional, and must waste cycles on &#8220;turnarounds&#8221; to switch between read and write mode. Bandwidth isn&#8217;t affected, probably because the Infinity Fabric bottleneck leaves spare cycles at the memory controller, which can absorb those turnarounds. However, reads from the latency test thread may be delayed while the memory controller drains writes before switching the bus back to read mode.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!R21G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3ef84b-2970-400a-a509-5811a97b1746_975x590.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!R21G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3ef84b-2970-400a-a509-5811a97b1746_975x590.png 424w, https://substackcdn.com/image/fetch/$s_!R21G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3ef84b-2970-400a-a509-5811a97b1746_975x590.png 848w, https://substackcdn.com/image/fetch/$s_!R21G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3ef84b-2970-400a-a509-5811a97b1746_975x590.png 1272w, https://substackcdn.com/image/fetch/$s_!R21G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3ef84b-2970-400a-a509-5811a97b1746_975x590.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!R21G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3ef84b-2970-400a-a509-5811a97b1746_975x590.png" width="975" height="590" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b3ef84b-2970-400a-a509-5811a97b1746_975x590.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:590,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:103650,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3ef84b-2970-400a-a509-5811a97b1746_975x590.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!R21G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3ef84b-2970-400a-a509-5811a97b1746_975x590.png 424w, https://substackcdn.com/image/fetch/$s_!R21G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3ef84b-2970-400a-a509-5811a97b1746_975x590.png 848w, https://substackcdn.com/image/fetch/$s_!R21G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3ef84b-2970-400a-a509-5811a97b1746_975x590.png 1272w, https://substackcdn.com/image/fetch/$s_!R21G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b3ef84b-2970-400a-a509-5811a97b1746_975x590.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>On the EPYC 9355P in NPS1 mode, bandwidth demands from a single GMI-Wide CCD leave plenty of spare cycles across the 12 memory controllers, so there&#8217;s little latency or bandwidth penalty when mixing reads and writes. The same isn&#8217;t true in NPS4 mode, where a GMI-Wide link can outmatch a NPS4 node&#8217;s three memory controllers. Everything&#8217;s fine with just reads, which actually benefit possibly because of lower latency and not having to traverse as much of the IO die. But with a read-modify-write pattern, bandwidth drops from 134 GB/s in NPS1 mode to 96.6 GB/s with NPS4. Latency gets worse too, rising to 248 ns. Again, NPS4 is something to be careful with, particularly if applications might require high bandwidth from a small subset of cores.</p><h1>SPEC CPU2017</h1><p>From a single thread perspective, the EPYC 9355P falls some distance behind the Ryzen 9 9900X. Desktop CPUs are designed around single threaded performance, so that&#8217;s to be expected. But with boost turned off on the desktop CPU to match clock speeds, performance is surprisingly close. Higher memory latency still hurts the EPYC 9355P, but it&#8217;s within striking distance.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!S4dS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56176f96-0c6f-465b-9c2c-7b927fe48aaa_975x583.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!S4dS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56176f96-0c6f-465b-9c2c-7b927fe48aaa_975x583.png 424w, https://substackcdn.com/image/fetch/$s_!S4dS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56176f96-0c6f-465b-9c2c-7b927fe48aaa_975x583.png 848w, https://substackcdn.com/image/fetch/$s_!S4dS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56176f96-0c6f-465b-9c2c-7b927fe48aaa_975x583.png 1272w, https://substackcdn.com/image/fetch/$s_!S4dS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56176f96-0c6f-465b-9c2c-7b927fe48aaa_975x583.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!S4dS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56176f96-0c6f-465b-9c2c-7b927fe48aaa_975x583.png" width="975" height="583" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/56176f96-0c6f-465b-9c2c-7b927fe48aaa_975x583.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:583,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:102023,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56176f96-0c6f-465b-9c2c-7b927fe48aaa_975x583.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!S4dS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56176f96-0c6f-465b-9c2c-7b927fe48aaa_975x583.png 424w, https://substackcdn.com/image/fetch/$s_!S4dS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56176f96-0c6f-465b-9c2c-7b927fe48aaa_975x583.png 848w, https://substackcdn.com/image/fetch/$s_!S4dS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56176f96-0c6f-465b-9c2c-7b927fe48aaa_975x583.png 1272w, https://substackcdn.com/image/fetch/$s_!S4dS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56176f96-0c6f-465b-9c2c-7b927fe48aaa_975x583.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>NUMA modes barely make any difference. NPS4 technically wins, but by an insignificant margin. The latency advantage was barely measurable anyway. Compared to the more density optimized Graviton 4 and Xeon 6 6975P-C, the EPYC 9355P delivers noticeably better single threaded performance.</p><p>CCD-level bandwidth pinch points are worth a look too, since that&#8217;s traditionally been a distinguishing factor between AMD&#8217;s EPYC and more logically monolithic designs. Here, I&#8217;m filling a quad core CCD by running SPEC CPU2017&#8217;s rate tests with eight copies. I did the same on the Ryzen 9 9900X, pinning the eight copies to four cores and leaving the CCD&#8217;s other two cores unused. I bound the test to a single NUMA node on all tested setups.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!W9Bj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec837352-d5f8-42bf-aa2e-07f4f4fdfa74_975x585.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!W9Bj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec837352-d5f8-42bf-aa2e-07f4f4fdfa74_975x585.png 424w, https://substackcdn.com/image/fetch/$s_!W9Bj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec837352-d5f8-42bf-aa2e-07f4f4fdfa74_975x585.png 848w, https://substackcdn.com/image/fetch/$s_!W9Bj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec837352-d5f8-42bf-aa2e-07f4f4fdfa74_975x585.png 1272w, https://substackcdn.com/image/fetch/$s_!W9Bj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec837352-d5f8-42bf-aa2e-07f4f4fdfa74_975x585.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!W9Bj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec837352-d5f8-42bf-aa2e-07f4f4fdfa74_975x585.png" width="975" height="585" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec837352-d5f8-42bf-aa2e-07f4f4fdfa74_975x585.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:585,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:76855,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec837352-d5f8-42bf-aa2e-07f4f4fdfa74_975x585.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!W9Bj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec837352-d5f8-42bf-aa2e-07f4f4fdfa74_975x585.png 424w, https://substackcdn.com/image/fetch/$s_!W9Bj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec837352-d5f8-42bf-aa2e-07f4f4fdfa74_975x585.png 848w, https://substackcdn.com/image/fetch/$s_!W9Bj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec837352-d5f8-42bf-aa2e-07f4f4fdfa74_975x585.png 1272w, https://substackcdn.com/image/fetch/$s_!W9Bj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec837352-d5f8-42bf-aa2e-07f4f4fdfa74_975x585.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>SPEC&#8217;s floating point suite starts to tell a different story now. Several tests within the floating point suite are bandwidth hungry even from a single core. 549.fotonik3d for example pulled 28.23 GB/s from Meteor Lake&#8217;s memory controller<a href="https://chipsandcheese.com/p/running-spec-cpu2017-at-chips-and-cheese"> when I first went through SPEC CPU2017&#8217;s tests</a>. Running eight copies in parallel would multiply memory bandwidth demands, and that&#8217;s where server memory subsystems shine.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5l58!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44331b3c-ed06-4b9e-ba70-db7d7d34815e_975x585.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5l58!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44331b3c-ed06-4b9e-ba70-db7d7d34815e_975x585.png 424w, https://substackcdn.com/image/fetch/$s_!5l58!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44331b3c-ed06-4b9e-ba70-db7d7d34815e_975x585.png 848w, https://substackcdn.com/image/fetch/$s_!5l58!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44331b3c-ed06-4b9e-ba70-db7d7d34815e_975x585.png 1272w, https://substackcdn.com/image/fetch/$s_!5l58!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44331b3c-ed06-4b9e-ba70-db7d7d34815e_975x585.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5l58!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44331b3c-ed06-4b9e-ba70-db7d7d34815e_975x585.png" width="975" height="585" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/44331b3c-ed06-4b9e-ba70-db7d7d34815e_975x585.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:585,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:74605,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44331b3c-ed06-4b9e-ba70-db7d7d34815e_975x585.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5l58!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44331b3c-ed06-4b9e-ba70-db7d7d34815e_975x585.png 424w, https://substackcdn.com/image/fetch/$s_!5l58!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44331b3c-ed06-4b9e-ba70-db7d7d34815e_975x585.png 848w, https://substackcdn.com/image/fetch/$s_!5l58!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44331b3c-ed06-4b9e-ba70-db7d7d34815e_975x585.png 1272w, https://substackcdn.com/image/fetch/$s_!5l58!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44331b3c-ed06-4b9e-ba70-db7d7d34815e_975x585.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In 549.fotonik3d, high bandwidth demands make the Ryzen 9 9900X&#8217;s unloaded latency advantage irrelevant. The 9900X even loses to Redwood Cove cores on Xeon 6. The EPYC 9355P does very well in this test against both the 9900X and Xeon 6. Intel&#8217;s interconnect strategy tries to keep the chip logically monolithic and doesn&#8217;t have pinch points at cluster boundaries. But each core on Xeon 6 can only get to ~33 GB/s of DRAM bandwidth at best, using an even mix of reads and writes. AMD&#8217;s GMI-Wide can more than match that, and Intel&#8217;s advantage doesn&#8217;t show through in this scenario. However, Intel does have a potential advantage against more density focused AMD SKUs where eight cores sit in front of a narrower link.</p><p>NPS4 is also detrimental to the EPYC 9355P&#8217;s performance in this test. It only provides a minimal latency benefit at the cost of lower per-node bandwidth. The bandwidth part seems to hurt here, and taking the extra latency of striping accesses across 6 or 12 memory controllers gives a notable performance improvement.</p><h1>Final Words</h1><p>Core count isn&#8217;t the last word in server design. A lot of scenarios are better served by lower core count parts. Applications might not scale to fill a high core count chip. Bandwidth bound workloads might not benefit from adding cores. Traditionally lower core count server chips just traded core counts for higher clock speeds. Today, chips like the EPYC 9355P do a bit more, using both wider CCD-to-IOD links and more cache to maximize per-core performance.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7uUJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c58c9a-c90f-4452-8dda-c6d3b5fff36d_975x544.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7uUJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c58c9a-c90f-4452-8dda-c6d3b5fff36d_975x544.png 424w, https://substackcdn.com/image/fetch/$s_!7uUJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c58c9a-c90f-4452-8dda-c6d3b5fff36d_975x544.png 848w, https://substackcdn.com/image/fetch/$s_!7uUJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c58c9a-c90f-4452-8dda-c6d3b5fff36d_975x544.png 1272w, https://substackcdn.com/image/fetch/$s_!7uUJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c58c9a-c90f-4452-8dda-c6d3b5fff36d_975x544.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7uUJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c58c9a-c90f-4452-8dda-c6d3b5fff36d_975x544.png" width="975" height="544" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0c58c9a-c90f-4452-8dda-c6d3b5fff36d_975x544.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:544,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:334371,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c58c9a-c90f-4452-8dda-c6d3b5fff36d_975x544.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7uUJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c58c9a-c90f-4452-8dda-c6d3b5fff36d_975x544.png 424w, https://substackcdn.com/image/fetch/$s_!7uUJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c58c9a-c90f-4452-8dda-c6d3b5fff36d_975x544.png 848w, https://substackcdn.com/image/fetch/$s_!7uUJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c58c9a-c90f-4452-8dda-c6d3b5fff36d_975x544.png 1272w, https://substackcdn.com/image/fetch/$s_!7uUJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c58c9a-c90f-4452-8dda-c6d3b5fff36d_975x544.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Looking at EPYC 9355P&#8217;s NUMA characteristics reveals very consistent memory performance across NUMA modes. Intel&#8217;s Xeon 6 may be more monolithic from a caching point of view, but AMD&#8217;s DRAM access performance feels more monolithic than Intel&#8217;s. AMD made a tradeoff back in the Zen 2 days where they took lower local memory latency in exchange for more even memory performance across the socket. Measured latencies on EPYC 9355P are a bit higher than figures on the Zen 2 slide above. DDR5 is higher latency, and the Infinity Fabric topology is probably more complex these days to handle more CCDs and memory channels.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4Rh7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdbeb54c-c11b-44fb-8398-9a70ce800e06_975x525.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4Rh7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdbeb54c-c11b-44fb-8398-9a70ce800e06_975x525.png 424w, https://substackcdn.com/image/fetch/$s_!4Rh7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdbeb54c-c11b-44fb-8398-9a70ce800e06_975x525.png 848w, https://substackcdn.com/image/fetch/$s_!4Rh7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdbeb54c-c11b-44fb-8398-9a70ce800e06_975x525.png 1272w, https://substackcdn.com/image/fetch/$s_!4Rh7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdbeb54c-c11b-44fb-8398-9a70ce800e06_975x525.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4Rh7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdbeb54c-c11b-44fb-8398-9a70ce800e06_975x525.png" width="975" height="525" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bdbeb54c-c11b-44fb-8398-9a70ce800e06_975x525.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:525,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:567064,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdbeb54c-c11b-44fb-8398-9a70ce800e06_975x525.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4Rh7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdbeb54c-c11b-44fb-8398-9a70ce800e06_975x525.png 424w, https://substackcdn.com/image/fetch/$s_!4Rh7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdbeb54c-c11b-44fb-8398-9a70ce800e06_975x525.png 848w, https://substackcdn.com/image/fetch/$s_!4Rh7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdbeb54c-c11b-44fb-8398-9a70ce800e06_975x525.png 1272w, https://substackcdn.com/image/fetch/$s_!4Rh7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdbeb54c-c11b-44fb-8398-9a70ce800e06_975x525.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">From the AMD EPYC 9005 Processor Architecture Overview</figcaption></figure></div><p>But the big picture remains. AMD&#8217;s Turin platform handles well in NPS1 mode, and cross-node penalties are low in NPS2/NPS4 modes. Those characteristics likely carry over across the Zen 5 EPYC SKU stack. It&#8217;s quite different from Intel&#8217;s Xeon 6 platform, which places memory controllers on compute dies like Zen 1 did. For now, AMD&#8217;s approach seems to be better at the DRAM level. Intel&#8217;s theoretical latency advantage in SNC3 mode doesn&#8217;t show through, and AMD gets to reap the benefits of a hub-and-spoke model while not getting hit where it should have downsides.</p><p>AMD seems to have found a good formula back in the Zen 2 days, and they&#8217;re content with reinforcing success. Intel is furiously iterating to find a setup that preserves a single level, logically monolithic interconnect while scaling well across a range of core counts. And of course, there&#8217;s Arm chips, which generally lean towards a single level monolithic interconnect too. It&#8217;ll be interesting to watch what all of these players do going forward as they continue to iterate and refine their designs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Vt3F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec50b5cd-e0f1-4c05-9c6f-a9fa59889fbf_975x250.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Vt3F!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec50b5cd-e0f1-4c05-9c6f-a9fa59889fbf_975x250.png 424w, https://substackcdn.com/image/fetch/$s_!Vt3F!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec50b5cd-e0f1-4c05-9c6f-a9fa59889fbf_975x250.png 848w, https://substackcdn.com/image/fetch/$s_!Vt3F!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec50b5cd-e0f1-4c05-9c6f-a9fa59889fbf_975x250.png 1272w, https://substackcdn.com/image/fetch/$s_!Vt3F!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec50b5cd-e0f1-4c05-9c6f-a9fa59889fbf_975x250.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Vt3F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec50b5cd-e0f1-4c05-9c6f-a9fa59889fbf_975x250.png" width="975" height="250" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec50b5cd-e0f1-4c05-9c6f-a9fa59889fbf_975x250.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:250,&quot;width&quot;:975,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:409739,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://chipsandcheese.com/i/174871357?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec50b5cd-e0f1-4c05-9c6f-a9fa59889fbf_975x250.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Vt3F!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec50b5cd-e0f1-4c05-9c6f-a9fa59889fbf_975x250.png 424w, https://substackcdn.com/image/fetch/$s_!Vt3F!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec50b5cd-e0f1-4c05-9c6f-a9fa59889fbf_975x250.png 848w, https://substackcdn.com/image/fetch/$s_!Vt3F!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec50b5cd-e0f1-4c05-9c6f-a9fa59889fbf_975x250.png 1272w, https://substackcdn.com/image/fetch/$s_!Vt3F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec50b5cd-e0f1-4c05-9c6f-a9fa59889fbf_975x250.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">A big thanks to Zach from ZeroOne Technology for taking pictures of the system in the rack!</figcaption></figure></div><p>And again, we&#8217;d like to thank both Dell and ZeroOne for, respectively, providing and hosting <a href="https://www.dell.com/en-us/shop/dell-poweredge-servers/new-poweredge-r6715-rack-server/spd/poweredge-r6715/pe_r6715_tm_vi_vp_sb">this PowerEdge R6715</a> without both of whom this article wouldn&#8217;t have been possible. </p><p>If you like the content then consider heading over to the <a href="https://www.patreon.com/ChipsandCheese">Patreon</a> or <a href="https://www.paypal.com/donate/?hosted_button_id=4EMPH66SBGVSQ">PayPal</a> if you want to toss a few bucks to Chips and Cheese. Also consider joining the <a href="https://discord.gg/TwVnRhxgY2">Discord</a>.</p>]]></content:encoded></item></channel></rss>