
Why is downloading data from the GPU much slower than uploading it with OpenCL?

I'm a beginner at using OpenCL for image processing. My setup is Win7 + VS2010 + OpenCL 2.0 + OpenCV 2.4.7, running on an Intel i7 CPU with an NVIDIA GTX 760.

Here is my work:

  1. I use OpenCV to read a 1920*1080 image from a video, then copy the image data and get a pointer to it.

    uchar* input_data=(uchar*)(gray_image->imageData);
    
  2. I then want to run convolution and other image processing on the GPU, so I use OpenCL to upload this data (input_data) to a device buffer (cl_input_data) that was created beforehand. The upload takes about 0.2 ms, which is fast.

    clEnqueueWriteBuffer(queue, cl_input_data, CL_TRUE,
        0, ROI_size*sizeof(cl_uchar), (void*)input_data, 0, NULL, NULL);
    
  3. The main processing consists of several kernels, and each of them takes less than 0.1 ms, which seems quite normal.

    clEnqueueNDRangeKernel( queue,kernel_box,2,NULL,global_work_size,local_work_size, 0,NULL, NULL);
    
  4. After all the processing, I download the result from GPU memory (cl_output_data) to the host (output_data), and this step takes over 5.5 ms, roughly 27 times longer than the upload!

    clEnqueueReadBuffer( queue,cl_output_data,CL_TRUE,0,ROI_size * sizeof(char),(void*) output_data,0, NULL, NULL );
    

So I'm wondering: since I use the same device and the data size is exactly the same, why are the upload and download times so different?

By the way, I measure the time with the Windows high-resolution timer, something like QueryPerformanceFrequency(&m_Frequency);

Thank you!

asked Nov 30 '25 19:11 by David Ding



1 Answer

As far as I remember, clEnqueueNDRangeKernel is an asynchronous call: it returns control to the host without synchronizing with the device. So when you time clEnqueueNDRangeKernel, you measure only the launch, not the processing itself. clEnqueueReadBuffer with CL_TRUE as the blocking flag forces synchronization with the device and waits until all previously enqueued kernels have finished. Thus, your 5.5 ms includes the kernels' execution time.
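To make the effect concrete, here is a minimal sketch of a toy timeline for an in-order queue, written in plain C. It is a model for reasoning, not OpenCL code: the `Timeline` struct, the `model_*` functions, `measured_read_us`, and all the numbers used with it are hypothetical and are not measurements from the question. It only shows how a timer started right after asynchronous launches ends up charging the queued kernel work to the blocking read.

```c
#include <assert.h>

/* Toy timeline for an in-order command queue. All times are in
   microseconds. Hypothetical model, not real OpenCL calls. */
typedef struct {
    long host_now_us;       /* current time as seen by the host thread */
    long device_free_at_us; /* time at which the device finishes queued work */
} Timeline;

/* Non-blocking enqueue (models a kernel launch): the host pays only the
   launch overhead; the work is appended after whatever is already queued. */
static void model_enqueue_async(Timeline *t, long launch_us, long work_us)
{
    assert(launch_us >= 0 && work_us >= 0);
    t->host_now_us += launch_us;
    long start = t->device_free_at_us > t->host_now_us
                     ? t->device_free_at_us
                     : t->host_now_us;
    t->device_free_at_us = start + work_us;
}

/* Blocking command (models a blocking transfer): it runs after everything
   already queued, and the host waits until it has completed. */
static void model_enqueue_blocking(Timeline *t, long work_us)
{
    model_enqueue_async(t, 0, work_us);
    t->host_now_us = t->device_free_at_us;
}

/* Models draining the queue: the host waits for all queued work. */
static void model_finish(Timeline *t)
{
    if (t->device_free_at_us > t->host_now_us)
        t->host_now_us = t->device_free_at_us;
}

/* Host-side time measured around the blocking download. If finish_first
   is nonzero, the queue is drained before the timer starts. */
long measured_read_us(long transfer_us, long launch_us, long kernel_us,
                      int n_kernels, int finish_first)
{
    Timeline t = {0, 0};
    model_enqueue_blocking(&t, transfer_us);            /* upload */
    for (int i = 0; i < n_kernels; ++i)
        model_enqueue_async(&t, launch_us, kernel_us);  /* kernel launches */
    if (finish_first)
        model_finish(&t);
    long t0 = t.host_now_us;                            /* timer starts */
    model_enqueue_blocking(&t, transfer_us);            /* download */
    return t.host_now_us - t0;                          /* timer stops */
}
```

With made-up values of a 200 us transfer, 50 us per launch, and five kernels of 1000 us each, `measured_read_us(200, 50, 1000, 5, 0)` gives 5000, while `measured_read_us(200, 50, 1000, 5, 1)` gives 200. In real host code the equivalent of `finish_first` would be calling `clFinish(queue)` before starting the timer around `clEnqueueReadBuffer`. Another option is to create the queue with `CL_QUEUE_PROFILING_ENABLE` and read `CL_PROFILING_COMMAND_START` and `CL_PROFILING_COMMAND_END` through `clGetEventProfilingInfo`, which reports device-side times per command.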

answered Dec 02 '25 08:12 by jet47



