A is a series of points coordinates in 3D (X,Y,Z), for instance:
>> A = [1 2 0;3 4 7;5 6 9;9 0 5;7 8 4]
A =
1 2 0
3 4 7
5 6 9
9 0 5
7 8 4
I would like to sort the matrix with respect to "Y" (second column) values.
Here is the code that I am using:
>> tic;[~, loc] = sort(A(:,2));
SortedA = A(loc,:)
toc;
SortedA =
9 0 5
1 2 0
3 4 7
5 6 9
7 8 4
Elapsed time is **0.001525** seconds.
However, it can be very slow for a large set of data. I would appreciate it if anyone knows a more efficient approach.
This answer would mainly talk about how one can harness a compute efficient GPU for solving the stated problem. The solution code to the stated problem presented in the question was -
[~, loc] = sort(A(:,2));
SortedA = A(loc,:);
There are essentially two parts to it -
Now, Part 1 is compute intensive, which could be ported onto GPU, but Part 2 being an indexing work, could be done on CPU itself.
So, considering all these, an efficient GPU solution would be -
gA = gpuArray(A(:,2)); %// Port only the second column of input matrix to GPU
[~, gloc] = sort(gA); %// compute sorted indices on GPU
SortedA = A(gather(gloc),:); %// get the sorted indices back to CPU with `gather`
%// and then use them to get sorted A
Presented next is the benchmark code to compare the GPU version against the original solution, however do keep in mind that since we are running the GPU codes on different hardware as compared to the originally stated solution that runs on CPU, the benchmark results might vary from system to system.
Here's the benchmark code -
N = 3000000; %// datasize (number of rows in input)
A = rand(N,3); %// generate random large input
disp('------------------ With original solution on CPU')
tic
[~, loc] = sort(A(:,2));
SortedA = A(loc,:);
toc, clear SortedA loc
disp('------------------ With proposed solution on GPU')
tic
gA = gpuArray(A(:,2));
[~, gloc] = sort(gA);
SortedA = A(gather(gloc),:);
toc
Here are the benchmark results -
------------------ With original solution on CPU
Elapsed time is 0.795616 seconds.
------------------ With proposed solution on GPU
Elapsed time is 0.465643 seconds.
So, if you have a decent enough GPU, it's high time to try out GPU for sorting related problem and more so with MATLAB providing such easy GPU porting solutions.
MATLAB Version: 8.3.0.532 (R2014a)
Operating System: Windows 7
RAM: 3GB
CPU Model: Intel® Pentium® Processor E5400 (2M Cache, 2.70 GHz)
GPU Model: GTX 750Ti 2GB
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With