Hello again,
I have 3 questions concerning this part of the MOOC.
1/ I did not understand the part of the lecture about GBDT with residuals where it is said that the second tree makes no error (a perfect prediction), whereas there is still an error according to what is printed ("Error of the tree: 0.118"). Maybe that number is for the whole ensemble and not the second tree? In that case, I still don’t see how we can tell that the second tree’s predictions are perfect, or how it operates. I also can’t figure out where the 0.264 comes from: how was it computed, please?
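To make my question concrete, here is a minimal sketch of how I currently understand the residual mechanism, using two `DecisionTreeRegressor` stages on toy data (the dataset, depths, and error metric are my own choices, not the ones from the notebook):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0])  # noise-free toy target

# First tree fits y directly.
tree1 = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
residual = y - tree1.predict(X)

# Second tree fits the residual left by the first one.
tree2 = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, residual)

# The boosted prediction is the sum of the two trees.
pred = tree1.predict(X) + tree2.predict(X)

err1 = np.mean((y - tree1.predict(X)) ** 2)  # error of the first tree alone
err2 = np.mean((y - pred) ** 2)              # error after adding the second tree
print(err1, err2)
```

On this sketch the second tree reduces the training error but does not drive it to exactly zero, which is what confuses me about the "perfect prediction" wording in the lecture.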
2/ We say that random forest can be fast because it can run on several cores in parallel, but we also say that gradient boosting is very fast. In conclusion, which strategy is the fastest?
On some hardware, can the parallel random forest be nearly as fast as, or even faster than, gradient boosting? Will quantum computing lead to an even faster random forest? (Just joking for that last question ^^)
3/ It is said: "The histogram gradient-boosting is the best algorithm in terms of score. It will also scale when the number of samples increases, while the normal gradient-boosting will not."
I don’t really understand the meaning of ‘scale’ here. What does it mean concretely that it will or will not scale?
Thanks a lot !
Geoffrey