
Can I obtain the amount of allocated dynamic shared memory from within a kernel?

On the host side, I can save the amount of dynamic shared memory I intend to launch a kernel with, and use it. I can even pass that as an argument to the kernel. But - is there a way to get it directly from device code, without help from the host side? That is, have the code for a kernel determine, as it runs, how much dynamic shared memory it has available?

einpoklum asked Sep 06 '25 18:09



1 Answer

Yes, there's a special register holding that value, named %dynamic_smem_size. You can obtain this register's value in your CUDA C/C++ code by wrapping some inline PTX with a getter function:

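// Returns the size, in bytes, of the dynamic shared memory this kernel was launched with,
// read from the PTX special register %dynamic_smem_size.
__device__ unsigned dynamic_smem_size()
{
    unsigned ret;
    asm volatile ("mov.u32 %0, %dynamic_smem_size;" : "=r"(ret));
    return ret;
}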

You can similarly obtain the total size of allocated shared memory (static + dynamic) from the register %total_smem_size.
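As a rough sketch of how the two registers can be read side by side, the following wraps %total_smem_size in the same kind of getter and has a small kernel report both values to the host. The names `report_smem_sizes` and `query_smem_sizes` are made up for this illustration, as is the 256-byte static buffer. The register names are written as `%%...` here, which is the escaped form used in NVIDIA's inline PTX documentation. As far as I know, the PTX documentation describes the total as being reported in multiples of the shared memory allocation unit, so it may come out larger than static + dynamic.

```cuda
#include <cuda_runtime.h>

// Getter for the dynamic shared memory size (bytes) of the running kernel.
__device__ unsigned dynamic_smem_size()
{
    unsigned ret;
    asm volatile ("mov.u32 %0, %%dynamic_smem_size;" : "=r"(ret));
    return ret;
}

// Getter for the total (static + dynamic) shared memory size of the running kernel.
__device__ unsigned total_smem_size()
{
    unsigned ret;
    asm volatile ("mov.u32 %0, %%total_smem_size;" : "=r"(ret));
    return ret;
}

// Hypothetical demo kernel: thread 0 records both register values.
// The dynamic allocation comes purely from the launch configuration.
__global__ void report_smem_sizes(unsigned* out)
{
    __shared__ int static_buf[64];          // 256 bytes of static shared memory

    static_buf[threadIdx.x] = threadIdx.x;  // touch the buffer so it is not optimized away
    __syncthreads();

    if (threadIdx.x == 0) {
        out[0] = dynamic_smem_size();
        out[1] = total_smem_size();
        out[2] = static_buf[63];            // keeps the static buffer live
    }
}

// Hypothetical host helper: launches one block of 64 threads with
// `dynamic_bytes` of dynamic shared memory and copies the values back.
cudaError_t query_smem_sizes(unsigned dynamic_bytes,
                             unsigned* dynamic_out, unsigned* total_out)
{
    unsigned* d_out = nullptr;
    cudaError_t err = cudaMalloc(&d_out, 3 * sizeof(unsigned));
    if (err != cudaSuccess) return err;

    report_smem_sizes<<<1, 64, dynamic_bytes>>>(d_out);
    err = cudaGetLastError();               // catches launch configuration errors

    unsigned h_out[3] = {0, 0, 0};
    if (err == cudaSuccess) {
        // cudaMemcpy waits for the kernel to finish before copying.
        err = cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_out);
    if (err != cudaSuccess) return err;

    *dynamic_out = h_out[0];
    *total_out   = h_out[1];
    return cudaSuccess;
}
```

Calling `query_smem_sizes(1024, &dyn, &total)` from host code should leave the launch's dynamic size in `dyn` and a value at least that large in `total`.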

Note: reading a special register does cost a few cycles. If you can compute this value a priori, you'll save that read, plus you might squeeze something out of the optimizer, since it then knows the value.
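One way to make the size known a priori is to pass it as a template parameter, so the same constant serves both as the launch's dynamic shared memory size and as a compile-time value inside the kernel. The sketch below is only an illustration of that idea under assumed names (`fill_smem_known_size`, `launch_with_known_size`); whether the optimizer actually gains anything from it depends on the kernel.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: SmemBytes is a compile-time constant, so the loop
// bounds are known to the compiler and no special register is read.
template <unsigned SmemBytes>
__global__ void fill_smem_known_size(unsigned* out)
{
    extern __shared__ unsigned char buf[];

    // Each thread fills a strided slice of the dynamic shared memory.
    for (unsigned i = threadIdx.x; i < SmemBytes; i += blockDim.x) {
        buf[i] = 1;
    }
    __syncthreads();

    // Thread 0 sums the bytes; the result equals SmemBytes if all were written.
    if (threadIdx.x == 0) {
        unsigned sum = 0;
        for (unsigned i = 0; i < SmemBytes; ++i) {
            sum += buf[i];
        }
        *out = sum;
    }
}

// Hypothetical host helper: the template argument is used for both the
// kernel instantiation and the dynamic shared memory launch parameter.
template <unsigned SmemBytes>
cudaError_t launch_with_known_size(unsigned* host_result)
{
    unsigned* d_out = nullptr;
    cudaError_t err = cudaMalloc(&d_out, sizeof(unsigned));
    if (err != cudaSuccess) return err;

    fill_smem_known_size<SmemBytes><<<1, 128, SmemBytes>>>(d_out);
    err = cudaGetLastError();               // catches launch configuration errors

    if (err == cudaSuccess) {
        // cudaMemcpy waits for the kernel to finish before copying.
        err = cudaMemcpy(host_result, d_out, sizeof(unsigned),
                         cudaMemcpyDeviceToHost);
    }
    cudaFree(d_out);
    return err;
}
```

The trade-off is that every distinct size instantiates a separate kernel, so this fits best when only a few sizes are ever used. For sizes known only at run time, a plain kernel argument still avoids the register read.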

einpoklum answered Sep 11 '25 02:09
