AMD really needs a formal whitepaper, with diagrams and such, that goes into detail about this IP! There has to be more whitepaper content published by AMD going forward, or it's not going to be easy to see the advantage of the work AMD is doing!
Agreed! Especially for a "halo" product like this. At least some more detail about the architecture and the considerations that went into it would be appreciated!
So he said that a single CCD can pull 256 GB/s of bandwidth instead of the 64 GB/s a "normal" Zen 5 CCD can, and has lower latency than desktop Zen 5. Can you please verify this when you do your Strix Halo tests?
Also, I hope we get RDNA4 and 5090 tests (compared to RDNA3 and the 4090) soon. Most other sites test from a gaming standpoint; you're the only one doing serious GPU compute tests à la AnandTech (R.I.P.).
I don't think you're going to see lower latency in practice because of the memory being used. The latency between the CCDs and the SoC die may well be significantly lower, but you're still going to see higher memory latency simply because LPDDR is generally pretty bad for latency.
Still no cheese reviews! This is depressing! How about some fish too? Fish and Chips sounds more appetizing than chips and cheese! ;-)
I don't understand at all how the shared memory system works with AMD iGPUs. They say that the GPU can access at most 96 GB, but why is there such a limit at all? The GPU uses virtual memory, doesn't it? Why can't you map GPU pages to any physical address? Why does memory need to be "pre-allocated" to the GPU at all, especially at the BIOS level? Why wouldn't you just allocate your GPU buffers exactly the way you allocate any other memory, and point the GPU at the right physical addresses? Is it just some sort of driver shortcoming?
Also, I was really surprised that the CCDs aren't shared with desktop! I really wasn't expecting them to tape out a whole new CCD just for this line. Is there any expected reuse for any other product line?
While I really liked the interview, I would also have liked to hear your guest's thoughts on the similarities and differences between Strix Halo and Apple's large M-series SoCs (APUs).
A request for the hopefully upcoming review: please also test how much of a bottleneck memory bandwidth is, especially for GPU performance, for Strix Halo and, if feasible, one or two other x86 mobile SoCs. Thanks!
Apple's design is monolithic, with accelerators integrated for specific tasks.
I know that😀; however, it would have been interesting to hear, for example, how AMD sees that approach (monolithic), and why they believe theirs is better. The key challenge with chiplets (tiles) is that they tend to be less power efficient than monolithic dies and also introduce additional latency. The key challenge with monolithic dies is that even with the best fabrication, yield becomes more and more of an issue as the transistor count grows. Some of that was touched on, but I would have liked to hear a bit more.