I have a very simple CUDA component in my application. Valgrind reports a lot of leaks and still-reachables, all related to the cudaMalloc calls.
Are these leaks real? I call cudaFree for every cudaMalloc.  Is this valgrind's inability to interpret GPU memory allocation? If these leaks are not real, can I suppress them and have valgrind only analyse the non-gpu part of the application?
extern "C"
unsigned int *gethash(int nodec, char *h_nodev, int len) {
    unsigned int *h_out = (unsigned int *)malloc(sizeof(unsigned int) * nodec);
    char *d_in;
    unsigned int *d_out;
    cudaMalloc((void**) &d_in, sizeof(char) * len * nodec);
    cudaMalloc((void**) &d_out, sizeof(unsigned int) * nodec);
    cudaMemcpy(d_in, h_nodev, sizeof(char) * len * nodec, cudaMemcpyHostToDevice);
    int blocks = 1 + nodec / 512;
    cube<<<blocks, 512>>>(d_out, d_in, nodec, len);
    cudaMemcpy(h_out, d_out, sizeof(unsigned int) * nodec, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
Last bit of the Valgrind output:
...
==5727== 5,468 (5,020 direct, 448 indirect) bytes in 1 blocks are definitely lost in loss record 506 of 523
==5727==    at 0x402B965: calloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==5727==    by 0x4843910: ??? (in /usr/lib/nvidia-319-updates/libcuda.so.319.60)
==5727==    by 0x48403E9: ??? (in /usr/lib/nvidia-319-updates/libcuda.so.319.60)
==5727==    by 0x498B32D: ??? (in /usr/lib/nvidia-319-updates/libcuda.so.319.60)
==5727==    by 0x494A6E4: ??? (in /usr/lib/nvidia-319-updates/libcuda.so.319.60)
==5727==    by 0x4849534: ??? (in /usr/lib/nvidia-319-updates/libcuda.so.319.60)
==5727==    by 0x48191DD: cuInit (in /usr/lib/nvidia-319-updates/libcuda.so.319.60)
==5727==    by 0x406B4D6: ??? (in /usr/lib/i386-linux-gnu/libcudart.so.5.0.35)
==5727==    by 0x406B61F: ??? (in /usr/lib/i386-linux-gnu/libcudart.so.5.0.35)
==5727==    by 0x408695D: cudaMalloc (in /usr/lib/i386-linux-gnu/libcudart.so.5.0.35)
==5727==    by 0x804A006: gethash (hashkernel.cu:36)
==5727==    by 0x804905F: chkisomorphs (bdd.c:326)
==5727== 
==5727== LEAK SUMMARY:
==5727==    definitely lost: 10,240 bytes in 6 blocks
==5727==    indirectly lost: 1,505 bytes in 54 blocks
==5727==      possibly lost: 7,972 bytes in 104 blocks
==5727==    still reachable: 626,997 bytes in 1,201 blocks
==5727==         suppressed: 0 bytes in 0 blocks
It's a known issue that valgrind reports false-positives for a bunch of CUDA stuff. The best way to avoid seeing it would be to use valgrind suppressions, which you can read all about here: http://valgrind.org/docs/manual/manual-core.html#manual-core.suppress
If you want to jumpstart into something a little closer to your specific issue, an interesting post is this one on the Nvidia dev forums. It has a link to a sample suppression rule file. https://devtalk.nvidia.com/default/topic/404607/valgrind-3-4-suppressions-a-little-howto/
Try using cuda-memcheck --leak-check full. Cuda-memcheck is a set of tools that provides similar functionality to Valgrind for CUDA applications. It is installed as part of the CUDA toolkit. You can get more documentation about how to use cuda-memcheck here : http://docs.nvidia.com/cuda/cuda-memcheck/
Note that cuda-memcheck is not a direct replacement for valgrind and can't be used to detect host side memory leaks or buffer overflows.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With