2 Comments

Great article, loved the humor! A quick question regarding your block diagrams for TeraScale's SIMD engines. The ALU layout in the material AMD has always provided suggests the SIMD blocks are configured as XYZW*16, while yours look like X*16, Y*16, Z*16, W*16.


If it helps, each instruction bundle contains instructions for the X, Y, Z, and W positions. Each instruction operates on a 64-wide vector and executes over four cycles. Another way of looking at it: each SIMD does 16x VLIW4 operations per cycle.
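
To make the arithmetic concrete, here is a rough Python sketch of that timing. The 64-wide vector, the 16 lanes, and the X/Y/Z/W slots come from the description above; the function name `schedule_bundle` and the assumption that work-items are issued in contiguous groups of 16 are mine, purely for illustration, and not a statement about how the hardware actually orders them.

```python
WAVEFRONT_SIZE = 64            # work-items covered by one vector instruction
LANES = 16                     # VLIW4 lanes per SIMD
SLOTS = ("X", "Y", "Z", "W")   # positions inside one instruction bundle


def schedule_bundle(wavefront_size=WAVEFRONT_SIZE, lanes=LANES):
    """Return, for each cycle, the (work_item, slot) pairs that execute.

    Assumption: work-items are issued in contiguous groups of `lanes`.
    """
    cycles = wavefront_size // lanes
    schedule = []
    for cycle in range(cycles):
        items = range(cycle * lanes, (cycle + 1) * lanes)
        # Every lane runs all four slots of the bundle in the same cycle.
        schedule.append([(item, slot) for item in items for slot in SLOTS])
    return schedule


schedule = schedule_bundle()
print("cycles per bundle:", len(schedule))
print("slot operations per cycle:", len(schedule[0]))
```

Under those assumptions, one bundle takes four cycles, and each cycle covers 16 work-items with all four slots active, which is the "16x VLIW4 per cycle" view.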
