This one is a bit tricky. Whether to treat “year” as categorical or numerical often comes down to a modelling choice.
Here, I would say that “year” can be considered numerical in the same way “temperature” can. Temperature is bounded, at least over the range we can measure. What makes it clearly numerical is that we expect a floating-point value, but even if the measurements were recorded as integers, we would still consider it numerical.
A “year” is a measure of time. What we measure is usually bounded and rounded, so it is quite similar to temperature. But as I said, when it comes to a predictive model, we can choose to encode it either as a numerical value or as a categorical value. The numerical encoding keeps the link to a measure of time (the order of the years and the distance between them), while the categorical encoding treats each year as an unrelated label and does not represent that information.
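To make the difference concrete, here is a minimal sketch in plain Python of the two encodings. The helper names (`encode_numerical`, `encode_categorical`) and the year values are made up for illustration; in practice you would probably rely on your library's encoders instead of writing this by hand.

```python
def encode_numerical(years):
    """Keep each year as a single number: order and distance are preserved."""
    return [[float(year)] for year in years]


def encode_categorical(years, categories):
    """One-hot encode: one 0/1 column per known year, with no notion of order."""
    return [[1.0 if year == category else 0.0 for category in categories]
            for year in years]


# Hypothetical values, only here to illustrate the two encodings.
years = [2018, 2019, 2020, 2019]
categories = sorted(set(years))

numerical = encode_numerical(years)
categorical = encode_categorical(years, categories)

# A year never seen during training has no column of its own.
unseen = encode_categorical([2021], categories)

print(numerical)
print(categorical)
print(unseen)
```

With the numerical encoding, the model can use the fact that 2020 comes two years after 2018. With the one-hot encoding, each year is just its own column, and a year that was not in the training data is encoded as all zeros, so the model cannot extrapolate to it.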
You should probably define what you mean by efficient. I will suppose that you mean efficient in terms of generalization score.
Adding new features makes the model more flexible, because it gets more information with which to “create new rules”. However, if you start to have too many features, the model will have too much flexibility and will “create rules” that fit noisy data points. This is what we call overfitting.
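As a rough illustration of this effect, here is a small sketch on synthetic data, assuming NumPy is available. Everything in it is made up for the example: the target depends on a single informative feature, and the extra features are pure noise. I use an ordinary least-squares fit only because it is short to write; it is not meant to represent any particular model.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_data(n_samples, n_noise_features):
    """One informative feature x (y = 2x + noise) followed by pure-noise columns."""
    x = rng.normal(size=(n_samples, 1))
    y = 2.0 * x[:, 0] + rng.normal(scale=0.5, size=n_samples)
    noise = rng.normal(size=(n_samples, n_noise_features))
    return np.hstack([x, noise]), y


def fit_and_score(n_used, X_train, y_train, X_test, y_test):
    """Least-squares fit on the first n_used columns; return (train MSE, test MSE)."""
    def design(X):
        # Intercept column plus the selected features.
        return np.hstack([np.ones((len(X), 1)), X[:, :n_used]])

    coef, *_ = np.linalg.lstsq(design(X_train), y_train, rcond=None)
    train_mse = np.mean((design(X_train) @ coef - y_train) ** 2)
    test_mse = np.mean((design(X_test) @ coef - y_test) ** 2)
    return train_mse, test_mse


# 20 training samples, 1 informative feature + 18 noise features.
X_train, y_train = make_data(20, 18)
X_test, y_test = make_data(1000, 18)

few = fit_and_score(1, X_train, y_train, X_test, y_test)    # informative feature only
many = fit_and_score(19, X_train, y_train, X_test, y_test)  # all 19 features

print("1 feature   : train MSE = %.3g, test MSE = %.3g" % few)
print("19 features : train MSE = %.3g, test MSE = %.3g" % many)
```

With 19 features plus an intercept, the model has as many parameters as there are training samples, so it can fit the training set essentially perfectly, noise included. The training error drops to almost zero, but those “rules” are built on noise and do not carry over to the test set.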
Those aspects are discussed in the second chapter.