Discussion about this post

Yoshi, 8h (edited)

Your code seems to execute 20 instructions, as two blocks of 10 instructions. So the throughput will actually be min(10/latency, peak throughput), which for a 4-cycle latency gives 10/4 = 2.5 and corresponds to the 2.5 result above. Using 16 independent instructions makes the throughput go to 4 IPC, as advertised.

Some other tests are showing 2.53 too.
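
To make the arithmetic concrete, here is a minimal sketch of that bound. The function name `predicted_ipc` is made up for illustration, and the 4-cycle latency is an assumption inferred from 10/latency = 2.5, not a measured value; the 4 IPC peak is the advertised figure mentioned above.

```python
def predicted_ipc(independent_instructions: int,
                  latency_cycles: float,
                  peak_ipc: float) -> float:
    """Upper bound on IPC when only `independent_instructions` can be in flight.

    Each instruction occupies its dependency chain for `latency_cycles`,
    so at most independent_instructions / latency_cycles can retire per cycle,
    capped by the core's peak throughput.
    """
    return float(min(independent_instructions / latency_cycles, peak_ipc))


# Assumed values (not measured): 4-cycle latency, 4 IPC peak.
LATENCY_CYCLES = 4
PEAK_IPC = 4

print(predicted_ipc(10, LATENCY_CYCLES, PEAK_IPC))  # 2.5
print(predicted_ipc(16, LATENCY_CYCLES, PEAK_IPC))  # 4.0
```

Under these assumptions, 16 is the smallest number of independent instructions that reaches the peak, since 16/4 = 4; adding more does not help because the result is capped at the peak throughput.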
