In the previous analysis, I ran 5-fold cross-validation and obtained R2 scores ranging from -0.0598 to 0.4617.
I then ran 10-fold cross-validation to check whether the model's estimated performance would improve or worsen. The results are as follows:
5-Fold Cross-Validation:
- R2 values: [0.462, 0.020, -0.060, -0.059, 0.411]
- Mean Absolute Error (MAE) values: [-0.588, -0.509, -0.460, -0.164, -0.368]
- Root Mean Squared Error (RMSE) values: [0.773, 0.706, 0.610, 0.234, 0.593]
10-Fold Cross-Validation:
- R2 values: [0.402, 0.348, 0.338, -0.337, -0.077, 0.268, 0.024, -0.118, 0.060, 0.423]
- Mean Absolute Error (MAE) values: [-0.566, -0.567, -0.417, -0.617, -0.621, -0.259, -0.150, -0.177, -0.159, -0.587]
- Root Mean Squared Error (RMSE) values: [0.766, 0.739, 0.553, 0.848, 0.761, 0.344, 0.198, 0.266, 0.235, 0.809]
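To compare the two settings on more than a visual impression, the per-fold scores can be reduced to a mean and a standard deviation per metric. The following is a minimal sketch using only the numbers listed above; the names `scores` and `summarize` are my own, the MAE values are entered as magnitudes (sign dropped), and the population standard deviation is just one reasonable choice of spread measure.

```python
from statistics import mean, pstdev

# Per-fold scores copied from the results listed above.
# MAE is entered as a magnitude (sign dropped) so that lower means better.
scores = {
    "5-fold": {
        "R2": [0.462, 0.020, -0.060, -0.059, 0.411],
        "MAE": [0.588, 0.509, 0.460, 0.164, 0.368],
        "RMSE": [0.773, 0.706, 0.610, 0.234, 0.593],
    },
    "10-fold": {
        "R2": [0.402, 0.348, 0.338, -0.337, -0.077,
               0.268, 0.024, -0.118, 0.060, 0.423],
        "MAE": [0.566, 0.567, 0.417, 0.617, 0.621,
                0.259, 0.150, 0.177, 0.159, 0.587],
        "RMSE": [0.766, 0.739, 0.553, 0.848, 0.761,
                 0.344, 0.198, 0.266, 0.235, 0.809],
    },
}


def summarize(values):
    """Return (mean, population standard deviation) of a list of fold scores."""
    return mean(values), pstdev(values)


# summary[setting][metric] -> (mean, std)
summary = {
    setting: {metric: summarize(vals) for metric, vals in metrics.items()}
    for setting, metrics in scores.items()
}

for setting, metrics in summary.items():
    for metric, (avg, spread) in metrics.items():
        print(f"{setting:8s} {metric:5s} mean={avg:.3f} std={spread:.3f}")
```

A smaller standard deviation for a metric means its fold scores are more consistent, so this table is a direct check on any consistency claim.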
My interpretation of the 5-fold and 10-fold cross-validation results is as follows:
- R2 (Coefficient of Determination): It measures how much of the variation in the actual values is explained by the model’s predictions. A higher R2 is generally better. In our results, both 5-fold and 10-fold cross-validation produce some negative R2 values, which indicates that on those folds the model performs worse than simply predicting the mean of the target. The 10-fold R2 values seem slightly more consistent to me, although they span a wider range (-0.337 to 0.423) than the 5-fold values (-0.060 to 0.462), and it’s essential to ensure that the model doesn’t overfit.
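- Mean Absolute Error (MAE): It measures the average of the absolute differences between the predicted and actual values. A lower MAE indicates better model performance. The values above are negative, presumably because the scoring routine reports negated errors (as scikit-learn's neg_mean_absolute_error scorer does), so it is the magnitudes that should be compared. The MAEs from 10-fold CV appear slightly more consistent than those from 5-fold CV.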
- Root Mean Squared Error (RMSE): It measures the square root of the average of the squared differences between the predicted and actual values, so it penalizes large errors more heavily than MAE does. A lower RMSE indicates better model performance. The RMSE values from 10-fold CV range from 0.198 to 0.848, compared with 0.234 to 0.773 for 5-fold CV.
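The three definitions above can be written out directly. This is a minimal sketch, not the code used to produce the results; the function names and the toy data (`y_true`, `good_pred`, `bad_pred`) are hypothetical and chosen only to show how a negative R2 arises.

```python
from math import sqrt


def r2_score(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares around the mean)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the average squared difference."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical toy data, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
good_pred = [1.1, 1.9, 3.2, 3.8]   # close to the actual values
mean_pred = [2.5, 2.5, 2.5, 2.5]   # always predicts the mean of y_true
bad_pred = [4.0, 3.0, 2.0, 1.0]    # trend is reversed

for name, pred in [("good", good_pred), ("mean", mean_pred), ("bad", bad_pred)]:
    print(f"{name:4s} R2={r2_score(y_true, pred):6.3f} "
          f"MAE={mae(y_true, pred):.3f} RMSE={rmse(y_true, pred):.3f}")
```

Predicting the mean gives an R2 of exactly zero, so any prediction with a larger squared error than that baseline, such as the reversed one here, produces a negative R2.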
I believe the 10-fold CV provides more consistent results in terms of R2, MAE, and RMSE. In addition, the choice between 5-fold and 10-fold (or any other k) is often based on specific project needs and dataset size.
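To make the link between k and dataset size concrete, here is a small sketch of how k-fold splitting divides the data. The helper `k_fold_indices` and the dataset size of 100 are hypothetical (the actual dataset size is not stated here), and the folds are contiguous and unshuffled for simplicity.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


n_samples = 100  # hypothetical dataset size, for illustration only

for k in (5, 10):
    folds = list(k_fold_indices(n_samples, k))
    train, test = folds[0]
    print(f"k={k}: {len(folds)} folds, "
          f"{len(train)} training samples and {len(test)} test samples per fold")
```

With a larger k, each model is trained on more data but scored on a smaller test fold, so individual fold scores can swing more on a small dataset. That trade-off is one reason the best k depends on how much data is available.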