2.5 Leave-P-Out Cross-Validation


Leave-p-out cross-validation (LpOCV) is a method in which p data points are held out from a data set of n total samples. The model is trained on the remaining n - p data points and then tested on the p held-out points. The same process is repeated for every possible combination of p points drawn from the original sample, and the results of all iterations are averaged to obtain the cross-validation accuracy (Tang, 2008). Because the number of combinations grows rapidly with n and p, this approach can make validation very time-consuming on larger data sets. It may also not be random enough to give a true picture of the model's efficiency (Baron, 2021).
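The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the `fit` and `score` callables here are placeholders (a toy "model" that predicts the training mean, scored by mean absolute error), and `leave_p_out_cv` is a hypothetical helper name chosen for this example.

```python
from itertools import combinations

def leave_p_out_cv(data, p, fit, score):
    """For every possible size-p test subset, train on the remaining
    n - p points, score on the p held-out points, and average."""
    n = len(data)
    scores = []
    for test_idx in combinations(range(n), p):
        held_out = set(test_idx)
        train = [data[i] for i in range(n) if i not in held_out]
        test = [data[i] for i in test_idx]
        model = fit(train)                # train on n - p points
        scores.append(score(model, test))  # test on the p held-out points
    return sum(scores) / len(scores)      # average over all C(n, p) splits

# Toy model: predict the training mean; score with mean absolute error.
fit = lambda train: sum(train) / len(train)
score = lambda mean, test: sum(abs(x - mean) for x in test) / len(test)

data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(leave_p_out_cv(data, p=2, fit=fit, score=score))
```

Note that with n = 5 and p = 2 the loop already runs C(5, 2) = 10 times, which illustrates why the method becomes expensive on larger data sets.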

[Figure: leave-p-out cross-validation diagram. Image from medium.com]

2.6 Monte Carlo Cross-Validation


Monte Carlo cross-validation creates multiple random splits of the data into training and testing sets. For each split, the model is fitted to the training data, and predictive accuracy is assessed on the testing data. The results are then averaged over the splits. The disadvantage of this method is that some observations may never be selected for the testing subsample, whereas others may overlap across splits, i.e., be selected more than once (Lever, 2016).
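A minimal sketch of Monte Carlo cross-validation follows, under the same toy assumptions as before: `monte_carlo_cv` is a hypothetical helper name, and the `fit`/`score` callables stand in for a real model and metric.

```python
import random

def monte_carlo_cv(data, n_splits, test_frac, fit, score, seed=0):
    """Repeatedly draw a random train/test split, fit on the training
    part, score on the test part, and average over all splits."""
    rng = random.Random(seed)
    n = len(data)
    n_test = max(1, int(n * test_frac))
    scores = []
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)                          # random split each time
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        model = fit([data[i] for i in train_idx])
        scores.append(score(model, [data[i] for i in test_idx]))
    return sum(scores) / len(scores)

# Toy model: predict the training mean; score with mean absolute error.
fit = lambda train: sum(train) / len(train)
score = lambda mean, test: sum(abs(x - mean) for x in test) / len(test)

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(monte_carlo_cv(data, n_splits=20, test_frac=0.3, fit=fit, score=score))
```

Because each split is drawn independently at random, nothing guarantees that every observation appears in some test set, which is exactly the drawback described above.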

[Figure: Monte Carlo cross-validation diagram. Image from medium.com]