I am writing a mixed CPU-GPU program that requires multiple CPU threads to access multiple GPUs. Are CUDA streams thread-safe? Specifically, I wonder if the following is correct:
// two threads concurrently enter CUDA device 1 and
// launch a kernel on the same stream
std::thread t1([&](){
    cudaSetDevice(1);
    cudaEventRecord(begin_t1, stream);
    kernel<<<mygrid, myblock, 0, stream>>>(...);
    cudaEventRecord(end_t1, stream);
});
std::thread t2([&](){
    cudaSetDevice(1);
    cudaEventRecord(begin_t2, stream);
    kernel<<<mygrid, myblock, 0, stream>>>(...);
    cudaEventRecord(end_t2, stream);
});
Yes, it is legal for multiple host threads to submit work to the same stream; the CUDA runtime API is thread-safe in this respect.
However, CUDA makes no guarantee about the order in which operations issued from different host threads are enqueued on the stream. Therefore, with respect to the stream in question here, this sequence is possible:
begin_t1,kernel,end_t1,begin_t2,kernel,end_t2
but this is also possible:
begin_t1,begin_t2,kernel,kernel,end_t1,end_t2
If you want to enforce an ordering between the threads' submissions, you will need to do that using mechanisms provided by the host threading system you are using (e.g. a mutex or condition variable).