For Weka Explorer (GUI), when we do a 10-fold CV for any given ARFF file, then what Weka Explorer provides (as far as I can see) is the average result for all the 10 folds.
Q. Is there any way to get the results of each fold? For instance, I need the error rates (incorrectly identified instances) for each fold.
Help appreciated.
With 10-fold cross-validation, Weka invokes the learning algorithm 11 times, once for each fold of the cross-validation and then a final time on the entire dataset. A practical rule of thumb is that if you've got lots of data you can use a percentage split, and evaluate it just once.
10-fold cross validation would perform the fitting procedure a total of ten times, with each fit being performed on a training set consisting of 90% of the total training set selected at random, with the remaining 10% used as a hold out set for validation.
When performing cross-validation, it is common to use 10 folds.
Pick a number of folds, k. Usually k is 5 or 10, but you can choose any number no larger than the number of instances in the dataset. Split the dataset into k equal (if possible) parts, called folds. Choose k – 1 folds as the training set; the remaining fold is the test set.
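To make the per-fold idea concrete, here is a minimal sketch in plain Python (not Weka code) of k-fold cross-validation that keeps the error count of every fold instead of only the average. The majority-label "classifier", the toy labels, and the function names are all assumptions made for illustration; Weka's own fold assignment (which stratifies by class) will differ.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=1):
    """Shuffle indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def per_fold_errors(labels, k=10, seed=1):
    """Return the number of misclassified instances in each fold.

    The 'classifier' is a stand-in: it predicts the majority label
    of the k - 1 training folds.
    """
    folds = k_fold_indices(len(labels), k, seed)
    errors = []
    for test in folds:
        held_out = set(test)
        train_labels = [labels[i] for i in range(len(labels)) if i not in held_out]
        majority = Counter(train_labels).most_common(1)[0][0]
        errors.append(sum(1 for i in test if labels[i] != majority))
    return errors


labels = ["yes"] * 70 + ["no"] * 30  # toy data, not from the question
errors = per_fold_errors(labels, k=10)
for fold, wrong in enumerate(errors, start=1):
    print(f"fold {fold}: {wrong} incorrect out of {len(labels) // 10}")
print("average error rate:", sum(errors) / len(labels))
```

The per-fold counts vary with the shuffle, but they always add up to the total that the averaged report is based on.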
I think this is possible using Weka's GUI. You need to use the Experimenter though instead of the Explorer. Here are the steps:
1. Open the Experimenter from the GUI Chooser.
2. Create a new experiment (New button at the top right).
3. Set Results Destination to the file you want to save the results to.
4. Set Number of (cross-validation) folds to your liking (start experimenting with 2 folds for easy results).
5. Set Number of repetitions (I recommend 1 to start off with).
6. Go to the Run tab, Start the experiment, and wait until it finishes.
7. Go to the Analyse tab and import the experiment results by clicking Experiment (top right).
8. For Row, select Fold.
9. For Column, select Percent_incorrect or Number_incorrect (or any other measure you want to see).

Weka Explorer does not have an option to give the results for individual folds when using the cross-validation option, but there are some workarounds. If you explicitly don't want to change any code, you need to do some manual fiddling, but I think this gives more or less what you want:

1. Instead of Cross-validation, select Percentage split and set it to 90%.
2. Click More options... and change the Random seed for XVal / % Split value to something you haven't used before.
3. Run the classifier, then repeat with a different seed for each run.

This is not exactly equivalent to 10-fold cross-validation though, since the pseudo-folds you make this way might overlap.
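The overlap can be seen with a small plain-Python sketch (not Weka code). It draws ten independent 90/10 splits with different seeds and counts how many distinct instances ever land in a test set. The dataset size of 100, the seeds 1 to 10, and the function name are assumptions for illustration, and Weka's own shuffling will produce different splits.

```python
import random


def percentage_split_test_set(n, train_percent, seed):
    """Shuffle 0..n-1 with the given seed and return the held-out indices."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train_percent / 100)
    return set(idx[n_train:])


n = 100
# Ten independent 90/10 splits, one per seed
test_sets = [percentage_split_test_set(n, 90, seed) for seed in range(1, 11)]
covered = set().union(*test_sets)
total_slots = sum(len(t) for t in test_sets)

print("test slots used:", total_slots)
print("distinct instances tested:", len(covered))
print("instances never tested:", n - len(covered))
```

In true 10-fold cross-validation the two counts would be equal, because every instance is tested exactly once. With independent splits, some instances are typically tested more than once and others never.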
An alternative that is equivalent to crossvalidation, but more cumbersome, would be to make 10 folds manually by using the unsupervised instance filter RemoveFolds or RemoveRange.
Generate and save 10 training sets and 10 test sets. Then, for every fold, load the training set, select Supplied test set in the Classify tab, and select the matching test fold.
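If clicking through the filter ten times is too tedious, the same train/test pairs could be produced outside Weka. The following is only a sketch in plain Python under stated assumptions: it handles a simple dense ARFF file, assigns row i to fold i mod k, and does not reproduce how RemoveFolds itself assigns instances. The function names and the toy dataset are hypothetical.

```python
def split_arff(text):
    """Separate an ARFF document into header lines and data rows."""
    lines = text.splitlines()
    for pos, line in enumerate(lines):
        if line.strip().lower() == "@data":
            header = lines[: pos + 1]
            # Skip blank lines and % comments in the data section
            rows = [r for r in lines[pos + 1:]
                    if r.strip() and not r.lstrip().startswith("%")]
            return header, rows
    raise ValueError("no @data section found")


def make_folds(text, k=10):
    """Return k (train_text, test_text) pairs with disjoint test sets."""
    header, rows = split_arff(text)
    pairs = []
    for fold in range(k):
        test = [r for i, r in enumerate(rows) if i % k == fold]
        train = [r for i, r in enumerate(rows) if i % k != fold]
        pairs.append(("\n".join(header + train) + "\n",
                      "\n".join(header + test) + "\n"))
    return pairs


# Tiny made-up dataset for illustration only
example = """@relation toy
@attribute x numeric
@attribute class {a,b}
@data
""" + "\n".join(f"{i},{'a' if i % 2 else 'b'}" for i in range(20)) + "\n"

pairs = make_folds(example, k=10)
for n, (train, test) in enumerate(pairs, start=1):
    # In practice each text would be written to its own .arff file
    print(n, len(train.splitlines()) - 4, "train rows,",
          len(test.splitlines()) - 4, "test rows")
```

Each pair can then be saved as two ARFF files and used with Supplied test set as described above. If the rows are sorted by class, shuffling them first is advisable so that every fold sees every class.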