As per the solution, I understand that one model should always be better, but when I run my code with RepeatedKFold (maybe not the right way at all?), the difference between the mean scores is less than 0.01, so the two models seem very similar. Can you clarify what is wrong in my logic?
from sklearn.model_selection import cross_val_score, RepeatedKFold
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

# data and target are already loaded earlier in my notebook
cv = RepeatedKFold(n_splits=10, n_repeats=10, random_state=1)

rf = RandomForestClassifier(n_estimators=300, n_jobs=2)
gbf = GradientBoostingClassifier(n_estimators=300)

scores_rf = cross_val_score(rf, data, target, scoring='balanced_accuracy', cv=cv, n_jobs=2)
print(scores_rf.mean())

scores_gbf = cross_val_score(gbf, data, target, scoring='balanced_accuracy', cv=cv, n_jobs=2)
print(scores_gbf.mean())
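To show what I mean, here is a self-contained version of my comparison that also reports the spread of the scores, not only the means (if the gap between the means is smaller than the fold-to-fold standard deviation, the models look indistinguishable to me). Note this is only a sketch: it uses a synthetic dataset from make_classification as a stand-in for my actual data/target, and smaller n_estimators and fewer CV repeats so it runs quickly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import RepeatedKFold, cross_val_score

# Synthetic stand-in for my real data/target (hypothetical dataset)
data, target = make_classification(n_samples=300, n_features=20, random_state=1)

# Smaller CV than in my real run (5 splits x 2 repeats) to keep it fast
cv = RepeatedKFold(n_splits=5, n_repeats=2, random_state=1)

rf = RandomForestClassifier(n_estimators=50, random_state=1)
gbf = GradientBoostingClassifier(n_estimators=50, random_state=1)

scores_rf = cross_val_score(rf, data, target, scoring='balanced_accuracy', cv=cv)
scores_gbf = cross_val_score(gbf, data, target, scoring='balanced_accuracy', cv=cv)

# Report mean +/- std over all CV folds for each model
print(f"RF : {scores_rf.mean():.3f} +/- {scores_rf.std():.3f}")
print(f"GBF: {scores_gbf.mean():.3f} +/- {scores_gbf.std():.3f}")
```

Is comparing the two means against the standard deviation like this a reasonable way to decide whether one model is really better?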