My running config:
- CUDA Toolkit 5.5
- NVIDIA Nsight Eclipse Edition
- Ubuntu 12.04 x64
- CUDA device: NVIDIA GeForce GTX 560 (cc=20, sm=21), so I can use blocks of up to 1024 threads
I render my display on the iGPU (Intel HD Graphics), so I can use the Nsight debugger.
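For reference, here's a minimal sketch (assuming the GTX 560 is device 0) that prints the limits in question:

#include <stdio.h>
#include <cuda_runtime.h>

int main(void) {
    cudaDeviceProp prop;
    // Assumes the GTX 560 is device 0
    cudaGetDeviceProperties(&prop, 0);
    printf("cc %d.%d, maxThreadsPerBlock=%d, regsPerBlock=%d\n",
           prop.major, prop.minor, prop.maxThreadsPerBlock, prop.regsPerBlock);
    return 0;
}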
However, I encountered some weird behavior when I set threads > 960.
Code:
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

__global__ void mytest() {
    float a, b;
    b = 1.0F;
    a = b / 1.0F;
}

int main(void) {
    // Error code to check return values for CUDA calls
    cudaError_t err = cudaSuccess;

    // Here I run my kernel
    mytest<<<1, 961>>>();
    err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "error=%s\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }

    // Reset the device and exit
    err = cudaDeviceReset();
    if (err != cudaSuccess) {
        fprintf(stderr, "Failed to deinitialize the device! error=%s\n",
                cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }

    printf("Done\n");
    return 0;
}
And... it doesn't work. The problem is the last line of the kernel, the float division: every time I divide by a float, the code compiles but the launch fails. The runtime error output is:
error=too many resources requested for launch
Here's what I get in the debugger when I step over the launch (0x7 is cudaErrorLaunchOutOfResources):
warning: Cuda API error detected: cudaLaunch returned (0x7)
Build output using -Xptxas -v:
12:57:39 **** Incremental Build of configuration Debug for project block_size_test ****
make all
Building file: ../src/vectorAdd.cu
Invoking: NVCC Compiler
/usr/local/cuda-5.5/bin/nvcc -I"/usr/local/cuda-5.5/samples/0_Simple" -I"/usr/local/cuda-5.5/samples/common/inc" -G -g -O0 -m64 -keep -keep-dir /home/vitrums/cuda-workspace-trashcan -optf /home/vitrums/cuda-workspace/block_size_test/options.txt -gencode arch=compute_20,code=sm_20 -gencode arch=compute_20,code=sm_21 -odir "src" -M -o "src/vectorAdd.d" "../src/vectorAdd.cu"
/usr/local/cuda-5.5/bin/nvcc --compile -G -I"/usr/local/cuda-5.5/samples/0_Simple" -I"/usr/local/cuda-5.5/samples/common/inc" -O0 -g -gencode arch=compute_20,code=compute_20 -gencode arch=compute_20,code=sm_21 -keep -keep-dir /home/vitrums/cuda-workspace-trashcan -m64 -optf /home/vitrums/cuda-workspace/block_size_test/options.txt -x cu -o "src/vectorAdd.o" "../src/vectorAdd.cu"
../src/vectorAdd.cu(7): warning: variable "a" was set but never used
../src/vectorAdd.cu(7): warning: variable "a" was set but never used
ptxas info : 4 bytes gmem, 8 bytes cmem[14]
ptxas info : Function properties for _ZN4dim3C1Ejjj
0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Compiling entry function '_Z6mytestv' for 'sm_21'
ptxas info : Function properties for _Z6mytestv
8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 34 registers, 8 bytes cumulative stack size, 32 bytes cmem[0]
ptxas info : Function properties for _ZN4dim3C2Ejjj
0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
Finished building: ../src/vectorAdd.cu
Building target: block_size_test
Invoking: NVCC Linker
/usr/local/cuda-5.5/bin/nvcc --cudart static -m64 -link -o "block_size_test" ./src/vectorAdd.o
Finished building target: block_size_test
12:57:41 Build Finished (took 1s.659ms)
When I add the -keep flag, the compiler generates a .cubin file, but I can't read it to find out the values of smem and reg by following this topic: too-many-resources-requested-for-launch-how-to-find-out-what-resources-/. At least nowadays this file must use a different format.
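As an alternative to parsing the .cubin, the same numbers can be queried at runtime with cudaFuncGetAttributes (a sketch; the attributes struct also reports the largest block this kernel can actually be launched with):

#include <stdio.h>
#include <cuda_runtime.h>

__global__ void mytest() {
    float a, b;
    b = 1.0F;
    a = b / 1.0F;
}

int main(void) {
    cudaFuncAttributes attr;
    // Ask the runtime how many registers and bytes of shared memory
    // the compiled kernel uses
    if (cudaFuncGetAttributes(&attr, mytest) == cudaSuccess)
        printf("numRegs=%d sharedSizeBytes=%zu maxThreadsPerBlock=%d\n",
               attr.numRegs, attr.sharedSizeBytes, attr.maxThreadsPerBlock);
    return 0;
}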
Therefore I'm forced to use 256 threads per block, which is probably not a bad idea, judging by this .xls: CUDA_Occupancy_calculator.
Anyway, any help would be appreciated.
I filled in the CUDA Occupancy Calculator with the current information:
I got 0% occupancy, limited by the register count.
If you set the number of threads to 960, you get 63% occupancy, which explains why it works.
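The boundary at 960 matches the register math: your kernel uses 34 registers per thread (from the ptxas -v output), a cc 2.1 SM has 32768 registers, and registers are handed out per warp. A back-of-the-envelope sketch (the 64-register allocation granularity is ignored here, since it doesn't change the result):

#include <stdio.h>

int main(void) {
    const int regsPerSM = 32768;  // registers per multiprocessor on cc 2.1
    const int regsPerThread = 34; // from the ptxas -v output above
    const int warpSize = 32;
    const int counts[] = { 960, 961, 1024 };
    for (int i = 0; i < 3; ++i) {
        // Registers are allocated per warp, so round the thread count up
        int warps = (counts[i] + warpSize - 1) / warpSize;
        int regsNeeded = warps * warpSize * regsPerThread;
        printf("%4d threads -> %5d registers (%s)\n", counts[i], regsNeeded,
               regsNeeded <= regsPerSM ? "fits" : "too many resources");
    }
    return 0;
}

So 960 threads are 30 full warps (32640 registers) and just fit, while 961 threads round up to 31 warps (33728 registers) and exceed the 32768 available, which is exactly the boundary you observed.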
Try limiting the register count to 32 and setting the number of threads to 1024 to get 67% occupancy.
To limit the register count, pass nvcc the following option:
nvcc [...] --maxrregcount=32
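If you'd rather not cap registers for the whole compilation unit, the same effect can be achieved per kernel with the __launch_bounds__ qualifier (a sketch; the compiler then limits register usage so that 1024-thread blocks can launch, spilling to local memory if necessary):

// Tell the compiler this kernel must be launchable with 1024 threads per block
__global__ void __launch_bounds__(1024) mytest() {
    float a, b;
    b = 1.0F;
    a = b / 1.0F;
}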