
Raytracing on AMD’s RDNA 2/3, and Nvidia’s Turing and Pascal
Raytracing aims to accurately model light by following the paths of light rays and seeing what they hit. However, raytracing is quite expensive compared to, say, calculating light values by taking a pixel’s distance to light sources and doing an inverse square root. You…

Sapphire Rapids: Golden Cove Hits Servers
Last year, Intel’s Golden Cove made the company competitive against AMD again. Unfortunately for Intel, that only applied to the client space. AMD’s Rome and Milan CPUs have steadily eroded Intel’s server dominance. With the bulk of recently added systems on TOP500 using…

Van Gogh, AMD’s Steam Deck APU
Zen 2’s launch was a defining moment for AMD. For the first time in many, many years, AMD’s single thread performance could go head to head with Intel’s best. Zen 2 also started a trend where AMD brought up to 16 cores to desktop…

Loongson’s LSX and LASX Vector Extensions
Loongson used to make CPUs based on the MIPS ISA, but the company recently switched to a homegrown ISA called Loongarch. This “new” ISA retains many of MIPS’s semantics, but uses incompatible encodings. Loongarch also gets extended to better support Loongson’s goals of making…

AMD’s RDNA 2: Shooting For the Top
In 2019, AMD moved off their long-serving GCN architecture in favor of RDNA. We’ll cover the first generation of RDNA some other time. RDNA 2 takes that foundation and scales it up while adding raytracing support and a few other enhancements. We already covered…

Intel’s Dunnington: Core 2 Goes Dun Dun Dun
After Conroe’s launch in 2006, Intel had an excellent core architecture (Merom). They smacked AMD around in the client space where single threaded performance was king. But in the server market, multithreaded performance is extremely important. There, AMD still held the advantage because their…

Previewing China’s Loongson 3A5000 with Performance Counters
Loongson’s 3A5000 represents another domestic CPU effort from China. It implements four LA464 cores, and targets everything from desktops to servers to embedded applications. Like the Zhaoxin KX-6640MA and Phytium D2000 that we covered previously, Loongson’s chip runs at low clock speeds. But unlike…

Bulldozer, AMD’s Crash Modernization: Caching and Conclusion
In Part 1, we looked at Bulldozer’s core architecture. But the architecture itself isn’t the full story. Memory advances have not kept up with CPU speed, so modern CPUs cope with increasingly sophisticated caching setups. They have to cope with cache latency as well,…

Bulldozer, AMD’s Crash Modernization: Frontend and Execution Engine
AMD’s K7 Athlon architecture formed the basis of the company’s CPU offerings for around a decade. Athlon did very well against Intel’s P6 based Pentium III. K8 got the basics right, introduced 64-bit support and an integrated memory controller, and remained reasonably competitive against…

Golden Cove’s Vector Register File: Checking with Official (SPR) Data
In late December 2022, we published an article going over how Intel optimized Golden Cove’s vector register file to handle AVX-512 while minimizing area overhead. A few days ago, Intel published data on Sapphire Rapids. With that info, we can have a look at…