7 Comments
Dean's avatar

Hi Chester, thanks for all the fantastic work you and the team do! In line with this article and others you have written, I'm wondering if you would be able to write an article outlining the main system-level/macro-architecture designs of the last 20 years or so [at least at a high level]. By macro-architecture, I mean system-level design: comparing more monolithic systems such as SoCs all the way up to desktop-level systems, which of course contain the very component you describe in this article. It could also cover the various types of interconnect that have existed over time, such as crossbar (xbar), ring, mesh, etc., some of the main systems that have used them, and the benefits and drawbacks of the different approaches. Also, topics such as DMA, and how that differs from a more monolithic SoC, where DMA is less needed because all of the system agents on the SoC share memory access through the IMC [to my knowledge]. I love micro-architecture, but it's my belief [at least for me anyway] that knowledge of a modern system at the topology/macro-architectural level provides a lot more return on investment when it comes to understanding how modern systems work, and I feel as though there isn't much content out there that really attempts to tie all the components of the system together in a cohesive way!

Peter W.'s avatar

Thanks Chester! Somewhat OT from chipsets, but I wonder if you might have the time and opportunity to briefly revisit Arrow Lake, specifically the 270K. Here's why: Intel claims that they were able to increase die-to-die (tile-to-tile) throughput and reduce the latencies for die-to-die traffic quite significantly.

Since Nova Lake will also be a multi-tile design, it might be interesting to see if Intel has made some meaningful progress here.

On the other hand, if you already have a Panther Lake deep dive lined up, I'll gladly table my suggestion regarding the 270K 😄. If someone at Intel is reading this, send one of your Panther Lake-based edge devices to Chester. They might be even more interesting than a notebook!

Chester Lam's avatar

So far no Panther Lake or Nova Lake in the plan (don't have either on hand), though I agree they're interesting

Schrödinger's Cat's avatar

Not to be a bother, but I wonder if you saw my offer of an RTX 5070 that I made in the comments of the GB10 GPU article?

Blaine Gaither's avatar

On a related note, in 2010 HP was experimenting with SSDs on TPC-C systems. TPC-C is always CPU bound in practice on commercial benchmark systems. The benchmarking teams approached me because they observed a strange (good strange) impact of switching from rotating rust to SSDs: a 6% performance increase. “How does changing only the disk drive increase performance?” Well, the explanation becomes clear when you look at cache residency time. https://www.researchgate.net/publication/279913396_Why_Does_Solid_State_Disk_Lower_CPI

Chester Lam's avatar

Nice, a lower thread footprint (less async-ness) gives better cache locality

Peter W.'s avatar

That was/is an interesting study (I downloaded it 😃). It would be interesting to see whether a similar (or greater?) speed benefit shows up with a RAM disk. While not applicable to HPC, it's something that can help limit the wear on the NAND in applications like video editing.