Can cudaMemcpy be used for memory allocated with cudaMallocPitch? If not, which function should be used instead? cudaMallocPitch returns linear memory, so I suppose that cudaMemcpy should work.
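You certainly could use cudaMemcpy to copy pitched device memory, but only as one contiguous block of pitch * height bytes, padding included. It is more usual to use cudaMemcpy2D, which copies row by row and lets the source and destination pitches differ. An example of a pitched copy from host to device would look something like this: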
#include <cuda_runtime.h> // runtime API: cudaMallocPitch, cudaMemcpy2D, ...
#include <assert.h>
typedef float real;
int main(void)
{
    cudaFree(0); // Establish context
    // Host array dimensions
    const size_t dx = 300, dy = 300; 
    // For the CUDA API width and pitch are specified in bytes
    size_t width = dx * sizeof(real), height = dy;
    // Host array allocation
    real * host = new real[dx * dy];
    size_t pitch1 = dx * sizeof(real);
    // Device array allocation
    // pitch is determined by the API call
    real * device;
    size_t pitch2;
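    // Keep the API calls outside assert() so they still run when NDEBUG is defined
    cudaError_t err = cudaMallocPitch((void **)&device, &pitch2, width, height);
    assert( err == cudaSuccess );
    // Sample memory copy - note source and destination pitches can be different
    err = cudaMemcpy2D(device, pitch2, host, pitch1, width, height, cudaMemcpyHostToDevice);
    assert( err == cudaSuccess );
    // Release memory and destroy context
    cudaFree(device);
    delete [] host;
    err = cudaDeviceReset();
    assert( err == cudaSuccess );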
    return 0;
}
(note: untested, caveat emptor and all that.....)
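To make the role of the two pitches more concrete, here is a host-only sketch of what a pitched 2D copy does conceptually. It does not call CUDA at all. The helper names round_up_pitch, copy2d and read_element are made up for illustration, and the alignment passed to round_up_pitch is an assumption, since the real pitch is whatever cudaMallocPitch returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: round a row width in bytes up to a multiple of
// `alignment`. The real pitch is chosen by cudaMallocPitch; the alignment
// used here is only an illustrative assumption.
std::size_t round_up_pitch(std::size_t width_bytes, std::size_t alignment)
{
    return (width_bytes + alignment - 1) / alignment * alignment;
}

// Host-only model of a pitched 2D copy: copy `width_bytes` from each of
// `height` rows, stepping source and destination by their own pitches.
void copy2d(void *dst, std::size_t dpitch,
            const void *src, std::size_t spitch,
            std::size_t width_bytes, std::size_t height)
{
    // A row must fit inside both pitches
    assert(width_bytes <= dpitch && width_bytes <= spitch);
    unsigned char *d = static_cast<unsigned char *>(dst);
    const unsigned char *s = static_cast<const unsigned char *>(src);
    for (std::size_t row = 0; row < height; ++row)
        std::memcpy(d + row * dpitch, s + row * spitch, width_bytes);
}

// Read element (row, col) of a pitched float array: the row start is
// base + row * pitch (in bytes), then index by column within that row.
float read_element(const void *base, std::size_t pitch,
                   std::size_t row, std::size_t col)
{
    const unsigned char *row_start =
        static_cast<const unsigned char *>(base) + row * pitch;
    float value;
    std::memcpy(&value, row_start + col * sizeof(float), sizeof(float));
    return value;
}
```

The same row * pitch addressing is what a kernel has to use to index into the pitched device allocation. It also shows why a single flat cudaMemcpy only matches when the host buffer is laid out with the same pitch as the device buffer.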