
OpenCL get_local_id Cost in Speed/Registers

Tags:

opencl

In an OpenCL kernel, is it slower to call get_local_id(0) several times than to call it once at the top and read the value from a local variable afterwards?

Do I use one fewer register by not storing get_local_id(0) to a local variable, and instead calling it several times?

benshope asked Jan 18 '26 13:01



1 Answer

On most GPU architectures the local work-item ID is held in a special set of registers that the hardware sets up as it dispatches threads. A get_local_id(0) is therefore just a register read, so there is no speed loss from 'calling' it.

The compiler should notice if you assign get_local_id(0) to a variable for subsequent use, and generate the same code as if you had called get_local_id(0) every time. In my experience, however, this is not always the case, and you will notice exactly one register being saved by not using a variable to store the local ID.
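For concreteness, the two styles being compared might look like the sketch below. The kernel names, the arithmetic, and the host-only block at the top (which only exists so the snippet also compiles as plain C) are assumptions for illustration, and indexing global memory by local ID assumes a single work-group. Whether the two variants really differ by a register depends on your compiler.

```c
#ifndef __OPENCL_VERSION__
/* Host-only stand-ins (assumptions, not part of OpenCL) so this sketch
   also compiles as plain C. An OpenCL compiler skips this section. */
#include <stddef.h>
#define kernel
#define global
static size_t fake_lid = 0;  /* hypothetical: pretend work-item id */
static size_t get_local_id(unsigned int dim) { (void)dim; return fake_lid; }
#endif

/* Variant A: read the local id once and keep it in a private variable. */
kernel void scale_cached(global float *data, float factor)
{
    const size_t lid = get_local_id(0);
    data[lid] = data[lid] * factor + (float)lid;
}

/* Variant B: call get_local_id(0) at every use. */
kernel void scale_inline(global float *data, float factor)
{
    data[get_local_id(0)] = data[get_local_id(0)] * factor
                          + (float)get_local_id(0);
}
```

Both variants compute the same result, so the only thing to compare is the register count your profiler reports for each.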

If you are tight on registers, I would suggest calling get_local_id(0) every time. You could even set up a macro to hide whether get_local_id(0) is being called or a stored variable is being used. If you have registers to spare (your profiler will tell you this), or you prefer code that uses a variable, then go with a variable.
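One way such a macro could look is sketched below. The names CACHE_LOCAL_ID, LID0_DECL, LID0 and add_offset are made up for illustration, as is the host-only block that lets the snippet compile as plain C. The switch could be set at build time, for example with a -D CACHE_LOCAL_ID option in the clBuildProgram options string. As above, indexing global memory by local ID assumes a single work-group.

```c
#ifndef __OPENCL_VERSION__
/* Host-only stand-ins (assumptions, not part of OpenCL) so this sketch
   also compiles as plain C. An OpenCL compiler skips this section. */
#include <stddef.h>
#define kernel
#define global
static size_t fake_lid = 0;  /* hypothetical: pretend work-item id */
static size_t get_local_id(unsigned int dim) { (void)dim; return fake_lid; }
#endif

/* Hypothetical switch: define CACHE_LOCAL_ID to store the id in a variable,
   leave it undefined to call get_local_id(0) at every use. */
#ifdef CACHE_LOCAL_ID
#define LID0_DECL const size_t lid0 = get_local_id(0);
#define LID0 lid0
#else
#define LID0_DECL
#define LID0 get_local_id(0)
#endif

kernel void add_offset(global int *data, int offset)
{
    LID0_DECL  /* expands to a declaration, or to nothing */
    data[LID0] += offset + (int)LID0;
}
```

The kernel body is identical either way, so you can build both versions and compare the register usage without touching the code.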

user2746401 answered Jan 21 '26 07:01



