6 Comments
David. Hellyx

I'd love to see a detailed architectural comparison of RDNA3, RDNA4, Ada, and Blackwell.

KozakMaks

AMD AI accelerators vs. tensor cores

jozsef

Could you analyze this out-of-order memory handling in more depth?

If I understand correctly, this is not a traditional CPU-like out-of-order resource, because it doesn't exploit instruction-level parallelism, only inter-warp memory parallelism.

Am I seeing this right?

jozsef

I'd like to see more RDNA 4 architecture analysis, because IMHO RDNA 4 is the most interesting GPU architecture since GCN. I am especially curious about RDNA 4's dynamic register allocation and out-of-order capabilities, and in particular about dynamic register allocation from a software perspective, including the deadlocks that are mentioned in the RDNA 4 instruction set architecture PDF.

Thanks in advance!

valentin

Could you put up charts that compare it directly to the following dies: N33, N32, N31, and N21?

N33 - the only big monolithic RDNA3 GPU (except Viola, but that is on N4P and on a platform (PS5 Pro) where you can't do microbenchmarking and analysis)

N32 - the revised version of N31 (smaller caches per WGP and per array than N31), and overall roughly the same size as N48

N31 - specifically the 7900 GRE, as it has almost the same number of transistors.

N21 - 2x the L3 cache, monolithic, and just for an overview of how RDNA has developed over the last 3 gens.

Most interesting would be N33 or N32 vs N48 when it comes to the caches.

jozsef

So, am I seeing this wrong, or does this out-of-order memory not exploit instruction-level parallelism within a thread? Does anyone have an answer?
