"Transforming a ray would involve multiplying both the origin and direction vector by a 3×3 rotation matrix, which naively requires 36 FLOPs (Floating Point Operations) per transform."
There's an error in your calculations. A naive 3x3 matrix-vector multiplication without FMA requires 15 operations, and with FMA it reduces down to 9 operations.
To multiply a matrix by a vector, you multiply the elements of each row with the corresponding elements of the vector. For a 3×3 matrix, that's 3 multiply-accumulates per row, or 9 FMA operations. Each FMA operation is two FLOPs, because it's a multiply and an add, so 18 FLOPs per matrix multiply.
Now you have to rotate both the origin and the direction vector, not just one of them, so 18*2 = 36 FLOPs.
Right. I guess that he counted one addition per multiplication, instead of 2 additions per 3 multiplications. Should have been 30, not 36.
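For anyone following the arithmetic, the two counts in this thread can be reproduced with a short sketch. The Python below is only an illustration: the function names are made up for this example and do not come from the article. It assumes a naive row-by-row product where each row needs 3 multiplies and 2 additions.

```python
def matvec_op_counts(n=3):
    """Count arithmetic operations for a naive n x n matrix-vector product."""
    muls = n * n          # one multiply per matrix element
    adds = n * (n - 1)    # n - 1 additions to sum the n products of each row
    # With FMA, the first product of each row is a plain multiply and every
    # remaining product is fused with its addition into one instruction.
    fma_instructions = n + n * (n - 1)
    return {
        "muls": muls,
        "adds": adds,
        "flops": muls + adds,
        "fma_instructions": fma_instructions,
    }


def ray_transform_flops(n=3, vectors=2):
    """FLOPs to transform `vectors` vectors (ray origin and direction)."""
    return vectors * matvec_op_counts(n)["flops"]


def ray_transform_flops_all_fma(n=3, vectors=2):
    """Same transform, but counting every multiply as a full 2-FLOP FMA."""
    return vectors * 2 * n * n


if __name__ == "__main__":
    print(matvec_op_counts())             # 9 muls + 6 adds = 15 ops, 9 with FMA
    print(ray_transform_flops())          # 30
    print(ray_transform_flops_all_fma())  # 36
```

Under these assumptions, the 36 figure comes from treating all nine multiplies per vector as two-FLOP multiply-accumulates, while counting only the additions that are actually needed gives 15 per vector and 30 for origin plus direction.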
Do you think current games are effectively utilizing RDNA4's new ray tracing–focused features? Some titles seem so poorly optimized that RDNA3 and RDNA4 perform almost the same. Indiana Jones and Wukong are the two worst offenders.
Love the article, but I find it a bit disheartening how little NVIDIA's approach is mentioned in comparison. Even Intel got more attention. Is this due to a lack of solid info about NVIDIA's approach?
Yeah, lack of info.
Damn, really sad. Still, it was great to read about AMD's approach, even somewhat in isolation!
Great article (as always). I enjoyed the deep dive.
Note that the Final Words section starts with "RDNA 2 brought introduced AMD’s ...".
Fixed.