About results of question 4

Hi,

I found the results of question 4 puzzling.

If I’m not mistaken, the previous notebooks on random forests told us that the general idea is to grow very deep trees that overfit, and then combine many of them to balance out that side effect.

Now, it seems to me that the results of question 4 “disprove” this general understanding. Indeed, the RF with max_depth=5 has better generalization performance than the one with unlimited depth.

Am I missing something here?

Thanks.

The random forest with limited depth has a better generalization performance (test score) for a small number of trees but becomes equivalent for higher numbers of trees, i.e. the error bars of both test score curves overlap considerably and you cannot really say that the limited model is better.

The limited-depth forest does better in the region with few trees (<10) because individual depth-limited trees overfit less than fully grown trees and therefore generalize better. With so few trees, there are not enough of them to average out the overfitting, as you correctly said.
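If you want to check this yourself, here is a minimal sketch of the comparison. It does not use the dataset from question 4: the synthetic data from `make_regression`, the helper name `cv_test_scores`, and the chosen `n_estimators` values are all assumptions made for illustration, so the numbers will differ from the notebook. The thing to look at is whether the mean test scores differ by more than their standard deviations as the number of trees grows.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_validate

# Assumption: synthetic regression data stands in for the exercise dataset.
X, y = make_regression(n_samples=300, n_features=10, noise=20.0, random_state=0)


def cv_test_scores(max_depth, n_estimators_values, cv=5):
    """Return {n_estimators: (mean, std)} of cross-validated test scores (R^2)."""
    results = {}
    for n_estimators in n_estimators_values:
        forest = RandomForestRegressor(
            n_estimators=n_estimators, max_depth=max_depth, random_state=0
        )
        scores = cross_validate(forest, X, y, cv=cv)["test_score"]
        results[n_estimators] = (scores.mean(), scores.std())
    return results


n_estimators_values = [1, 5, 20, 50]
limited = cv_test_scores(max_depth=5, n_estimators_values=n_estimators_values)
full = cv_test_scores(max_depth=None, n_estimators_values=n_estimators_values)

# Print mean +/- std side by side; overlapping intervals mean neither
# model can be called better at that number of trees.
for n_estimators in n_estimators_values:
    lim_mean, lim_std = limited[n_estimators]
    full_mean, full_std = full[n_estimators]
    print(
        f"n_estimators={n_estimators:>3}: "
        f"max_depth=5 -> {lim_mean:.3f} +/- {lim_std:.3f} | "
        f"max_depth=None -> {full_mean:.3f} +/- {full_std:.3f}"
    )
```

Which model comes out ahead for small forests depends on the data and its noise level, so treat the output as a way to inspect the overlap of the error bars rather than as a reproduction of the question 4 result.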
