Cross-Validation

Daniel Bohorquez, Jason Heiserman, Mary Tarabocchia

November 2022

What is Cross-Validation?

Overfitting and Underfitting

Overfitting occurs when the model finds patterns in the training data that are caused by random chance, so it fits the training data well but predicts new data poorly.

Underfitting occurs when the model is too simple to “learn” the underlying trend of the data.
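
To make the two definitions concrete, here is a minimal sketch in base R. The simulated data, the polynomial degrees, and the helper name `mse` are assumptions chosen for illustration and are not from the presenters' analysis. It fits polynomials of increasing degree and reports the error on the training rows and on rows the model never saw.

```r
set.seed(1)

# Simulated data (illustrative only): a smooth trend plus random noise
n <- 60
x <- runif(n, 0, 1)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

# Hold back 20 rows that the models never see during fitting
train_idx <- sample(n, 40)
train <- data.frame(x = x[train_idx], y = y[train_idx])
test  <- data.frame(x = x[-train_idx], y = y[-train_idx])

# Hypothetical helper: mean squared error of a fitted model on a data frame
mse <- function(fit, d) mean((d$y - predict(fit, newdata = d))^2)

# Degree 1 is likely too rigid for this trend; degree 15 is flexible
# enough to chase the noise in the training rows
degrees <- c(1, 3, 15)
results <- t(sapply(degrees, function(deg) {
  fit <- lm(y ~ poly(x, deg), data = train)
  c(degree = deg, train_mse = mse(fit, train), test_mse = mse(fit, test))
}))
print(results)
```

Training error can only go down as the degree grows, because each model contains the previous one. The test error is the column to watch: an underfit model tends to do poorly on both columns, while an overfit model tends to look good on the training column only.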

KFold Cross-Validation

How does it work?
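
One way to see the mechanics is to write the loop by hand. The sketch below is an illustration under stated assumptions: it uses the built-in `mtcars` data and a linear model of `mpg` on `wt` and `hp` as stand-ins, not the presenters' data or model. Each row is assigned to one of k folds, each fold takes a turn as the test set, and the k error estimates are averaged.

```r
set.seed(42)

k <- 5
n <- nrow(mtcars)

# Randomly assign each row to one of k folds of near-equal size
fold_id <- sample(rep(seq_len(k), length.out = n))

# For each fold: train on the other k - 1 folds, test on the held-out fold
fold_rmse <- sapply(seq_len(k), function(i) {
  train <- mtcars[fold_id != i, ]
  test  <- mtcars[fold_id == i, ]
  fit   <- lm(mpg ~ wt + hp, data = train)
  sqrt(mean((test$mpg - predict(fit, newdata = test))^2))
})

# The cross-validated estimate is the average over the k folds
cv_rmse <- mean(fold_rmse)
print(fold_rmse)
print(cv_rmse)
```

Every row is used for testing exactly once and for training k - 1 times.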

KFold Cross-Validation (continued)

Pros and Cons

Holdout Cross-Validation

What is it? How does it work?
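
A holdout split can be sketched in a few lines of base R. The 70/30 ratio, the `mtcars` data, and the model formula are assumptions picked for illustration only.

```r
set.seed(42)

n <- nrow(mtcars)

# Randomly choose 70% of the rows for training; the rest are held out
train_idx <- sample(n, size = floor(0.7 * n))
train <- mtcars[train_idx, ]
test  <- mtcars[-train_idx, ]

# Fit once on the training rows, evaluate once on the held-out rows
fit <- lm(mpg ~ wt + hp, data = train)
holdout_rmse <- sqrt(mean((test$mpg - predict(fit, newdata = test))^2))
print(holdout_rmse)
```

Because the model is fit and scored only once, the estimate depends on which rows happen to land in the test set; changing the seed will generally change the result.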

Leave-One-Out Cross-Validation

How does it work?
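
Leave-one-out is k-fold with k equal to the number of rows. The sketch below again uses `mtcars` and a simple linear model as assumed stand-ins. It refits the model n times, each time predicting the single row that was left out. For ordinary least squares there is also a known shortcut that gives the same answer from one fit, using the leverage values returned by `hatvalues()`.

```r
n <- nrow(mtcars)

# Refit n times, each time leaving out exactly one row and predicting it
loo_err <- sapply(seq_len(n), function(i) {
  fit <- lm(mpg ~ wt + hp, data = mtcars[-i, ])
  mtcars$mpg[i] - predict(fit, newdata = mtcars[i, , drop = FALSE])
})
loocv_mse <- mean(loo_err^2)

# Shortcut for linear regression: residual / (1 - leverage) from a single fit
full <- lm(mpg ~ wt + hp, data = mtcars)
shortcut_mse <- mean((residuals(full) / (1 - hatvalues(full)))^2)

print(c(refit = loocv_mse, shortcut = shortcut_mse))
```

The shortcut applies to least-squares linear models; for other model types the n refits are generally needed, which is where the computational cost of this method comes from.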

Leave-One-Out Cross-Validation (continued)

Pros and Cons

Other CV Methods

Stratified KFold Cross-Validation
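
The idea behind stratification can be sketched by assigning folds separately within each class, so every fold keeps roughly the same class mix as the full data. The example below is an assumption-laden illustration that treats the `am` column of `mtcars` (19 zeros, 13 ones) as the class label.

```r
set.seed(42)

k <- 5
y <- factor(mtcars$am)

# Assign fold numbers within each class so the classes are spread evenly
fold_id <- integer(length(y))
for (lvl in levels(y)) {
  idx <- which(y == lvl)
  fold_id[idx] <- sample(rep(seq_len(k), length.out = length(idx)))
}

# Rows are folds, columns are classes: counts per class differ by at most one
fold_table <- table(fold = fold_id, class = y)
print(fold_table)
```

With plain random folds, a small class can end up nearly absent from some folds; assigning folds per class avoids that.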

Leave-P-Out Cross-Validation
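
Leave-p-out holds out every possible set of p rows in turn. A minimal sketch with p = 2, using `mtcars` and a linear model as assumed stand-ins, shows how quickly the number of fits grows: 32 rows already give choose(32, 2) = 496 splits.

```r
p <- 2
n <- nrow(mtcars)

# Each column of `splits` is one set of p row indices to hold out
splits <- combn(n, p)

# Fit on the remaining n - p rows and score the p held-out rows
split_mse <- apply(splits, 2, function(held) {
  fit <- lm(mpg ~ wt + hp, data = mtcars[-held, ])
  mean((mtcars$mpg[held] - predict(fit, newdata = mtcars[held, ]))^2)
})

lpo_mse <- mean(split_mse)
print(ncol(splits))
print(lpo_mse)
```

Setting p = 1 reduces this to leave-one-out.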

Other CV Methods (continued)

Monte Carlo Cross-Validation
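
Monte Carlo cross-validation, also called repeated random subsampling, repeats a random train/test split many times and averages the results. The number of repetitions, the 70/30 ratio, the `mtcars` data, and the model below are all assumptions for illustration.

```r
set.seed(42)

n       <- nrow(mtcars)
reps    <- 100                 # number of random splits (assumed value)
n_train <- floor(0.7 * n)      # training-set size for every split

# Each repetition draws a fresh random split, fits, and scores the model
mc_rmse <- replicate(reps, {
  train_idx <- sample(n, size = n_train)
  fit <- lm(mpg ~ wt + hp, data = mtcars[train_idx, ])
  test <- mtcars[-train_idx, ]
  sqrt(mean((test$mpg - predict(fit, newdata = test))^2))
})

print(mean(mc_rmse))
print(sd(mc_rmse))
```

Unlike k-fold, the test sets here can overlap, so some rows may be tested several times and others never.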

Example of KFold - R Code

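## Load caret, which provides trainControl() and train()
library(caret)

## Make the random fold assignment reproducible (the seed value is arbitrary)
set.seed(123)

## This sets the cross-validation method with k=5 folds
method <- trainControl(method = "cv", number = 5)

## Fit a logistic regression model (caret fits a binomial glm when the
## outcome is a factor) and use k-fold CV to evaluate performance
crossmodelfull <- train(as.factor(htn) ~ age + bmi + ecghr,
                        data = crossdata,
                        method = "glm",
                        trControl = method)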

Example of KFold - Results

Conclusion

Questions