
How to split data into k folds NOT randomly in MATLAB?

I have a dataset, for simplicity let's say it has 1000 samples (each is a vector).

I want to split my data into train and test sets for cross-validation, NOT randomly (1). For example, with 4-fold cross-validation I should get:

fold1: train = 1:250, test = 251:1000
fold2: train = 251:500, test = [1:250, 501:1000]
fold3: train = 501:750, test = [1:500, 751:1000]
fold4: train = 751:1000, test = 1:750

I am aware of cvpartition, but AFAIK it splits the data randomly, which is not what I need.

I guess I can write the code for it, but I figured there is probably a function I could use.


(1) The data is already shuffled and I need to be able to easily reproduce the experiments.

amit asked Oct 18 '25 14:10


1 Answer

Here is a function that does it in general:

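function [test, train] = kfolds(data, k)
  % Split the rows of data into k contiguous (non-random) folds.
  % test{f} holds the rows of fold f; train{f} holds all remaining rows.

  n = size(data,1);

  test = cell(k,1);
  train = cell(k,1);

  chunk = floor(n/k);   % rows per fold

  for f = 1:k
      first = (f-1)*chunk + 1;
      last = f*chunk;
      if f == k
          last = n;     % last fold also takes any leftover rows
      end
      test{f} = data(first:last,:);
      train{f} = [data(1:first-1,:); data(last+1:end,:)];
  end
end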

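It's not an elegant one-liner, but it's fairly robust: it doesn't need k to be a factor of your number of samples, it works on a 2D matrix, and it outputs the actual sets rather than indices. Note that test{f} is the contiguous chunk and train{f} is everything else; swap the two outputs if you want the labeling used in the question.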
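If indices are more convenient than copies of the data, the same idea can be sketched as below. This is only an illustration under a few assumptions: kfoldIndices is a hypothetical helper name (not a built-in MATLAB function), leftover samples are assumed to go into the last fold, and n = 10, k = 4 are example values. It has to be saved as a script file, and local functions in scripts need R2016b or later.

```matlab
% Sketch: contiguous (non-random) k-fold indices.
% kfoldIndices is a hypothetical helper, not part of MATLAB.
n = 10;   % number of samples (example value)
k = 4;    % number of folds (example value)

[testIdx, trainIdx] = kfoldIndices(n, k);

% Print the index sets of every fold.
for f = 1:k
    fprintf('fold %d: test = %s, train = %s\n', f, ...
        mat2str(testIdx{f}), mat2str(trainIdx{f}));
end

function [testIdx, trainIdx] = kfoldIndices(n, k)
    % Each fold is a contiguous block of floor(n/k) indices.
    % Assumption: the last fold absorbs any leftover samples.
    chunk = floor(n/k);
    testIdx = cell(k,1);
    trainIdx = cell(k,1);
    for f = 1:k
        first = (f-1)*chunk + 1;
        last = f*chunk;
        if f == k
            last = n;
        end
        testIdx{f} = first:last;
        trainIdx{f} = [1:first-1, last+1:n];   % everything outside fold f
    end
end
```

The sets themselves can then be taken as data(testIdx{f},:) and data(trainIdx{f},:). Because nothing here is random, repeated runs give the same folds.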

Dan answered Oct 21 '25 04:10