9 Comments
anon

"Transforming a ray would involve multiplying both the origin and direction vector by a 3×3 rotation matrix, which naively requires 36 FLOPs (Floating Point Operations) per transform."

There's an error in your calculations. A naive 3x3 matrix-vector multiplication without FMA requires 15 operations, and with FMA it reduces down to 9 operations.

Chester Lam

To multiply a matrix by a vector, you multiply the elements of each row by the corresponding vector elements and sum the products. For a 3x3 matrix, that's 3 multiply-accumulates per row, or 9 FMA operations. Each FMA operation is two FLOPs, because it's a multiply and an add, so 18 FLOPs per matrix-vector multiply.

Now you have to rotate both the origin and direction vector, not just one of them, so 18 × 2 = 36 FLOPs.

ET3D

Right. I guess that he counted one addition per multiplication, instead of 2 additions per 3 multiplications. Should have been 30, not 36.
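
The counts traded in this thread can be checked with a short tally. The sketch below is illustrative only: `matvec3_counted` is a hypothetical helper written for this example, not code from the article, and it assumes the naive approach where the first product in each row is a plain multiply and the other two are multiply-then-add.

```python
def matvec3_counted(m, v):
    """Naive 3x3 matrix-vector product that also tallies multiplies and adds."""
    muls = adds = 0
    out = []
    for row in m:
        acc = row[0] * v[0]          # first product in the row: multiply only
        muls += 1
        for k in (1, 2):
            acc += row[k] * v[k]     # remaining products: multiply, then add
            muls += 1
            adds += 1
        out.append(acc)
    return out, muls, adds


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
result, muls, adds = matvec3_counted(identity, [1.0, 2.0, 3.0])

# Without FMA: every multiply and every add is its own operation.
separate_ops = muls + adds                # 9 multiplies + 6 adds

# With FMA: each add fuses with one multiply; the rest stay plain multiplies.
fmas = adds
plain_muls = muls - adds
fused_instructions = plain_muls + fmas    # 3 multiplies + 6 FMAs

# Counting all 9 products as FMAs worth 2 FLOPs each (the article's accounting).
flops_all_fma = 2 * muls

# A ray has two vectors to transform: origin and direction.
per_ray_exact = 2 * separate_ops
per_ray_all_fma = 2 * flops_all_fma

print(separate_ops, fused_instructions, flops_all_fma)
print(per_ray_exact, per_ray_all_fma)
```

Under these assumptions the tally gives 15 separate operations, 9 instructions with FMA, and 18 FLOPs when every product is counted as a full FMA, so the per-ray figure is 30 or 36 depending on which accounting is used.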

Dante Fr.

Do you think current games are effectively utilizing RDNA4's new ray tracing–focused features? Some titles seem so poorly optimized that RDNA3 and RDNA4 perform almost the same. Indiana Jones and Wukong are the two worst offenders.

GARGEAN

Love the article, but I find it a bit disheartening how little Nvidia's approach is mentioned in comparison. Even Intel got more attention. Is this due to a lack of solid information about Nvidia's approach, or something else?

Chester Lam

Yeah, lack of info.

GARGEAN

Damn, really sad. Still, it was great to read about AMD's approach, even somewhat in isolation!

ET3D

Great article (as always). I enjoyed the deep dive.

Note that the Final Words section starts with "RDNA 2 brought introduced AMD’s ...".

Chester Lam

Fixed.
