That’s also a possibility. Here we wanted to keep things simple by using a mapping where all string-valued features are treated as nominal categorical variables (no a priori ordering or quantitative interpretation of the values) and all numerically encoded features are assumed to have a natural quantitative interpretation (e.g. adding values can be meaningful).
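As a rough illustration of such a dtype-based mapping, here is a minimal sketch using scikit-learn's `make_column_selector`. The toy dataframe and its column names are made up for the example, and the choice of `OneHotEncoder` and `StandardScaler` is only one reasonable assumption, not necessarily the exact preprocessing used in the original material.

```python
import pandas as pd
from sklearn.compose import make_column_selector, make_column_transformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: two string-valued columns and two numerical ones.
data = pd.DataFrame({
    "workclass": ["Private", "State-gov", "Private", "Self-emp"],
    "education": ["HS-grad", "Bachelors", "Masters", "HS-grad"],
    "age": [25, 38, 52, 41],
    "hours-per-week": [40, 50, 40, 30],
})

# Select columns by dtype: numerical columns are treated as quantitative,
# everything else (here, the string columns) as nominal categories.
numerical_selector = make_column_selector(dtype_include="number")
categorical_selector = make_column_selector(dtype_exclude="number")

numerical_columns = numerical_selector(data)
categorical_columns = categorical_selector(data)

# One-hot encoding assumes no ordering between categories;
# scaling keeps the quantitative meaning of the numerical values.
preprocessor = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), categorical_columns),
    (StandardScaler(), numerical_columns),
)
encoded = preprocessor.fit_transform(data)
print(categorical_columns, numerical_columns, encoded.shape)
```

The point of the sketch is only that the dtype decides the treatment; a column that is numerically encoded but actually nominal would need to be handled by hand.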
This choice can be questioned, and it’s perfectly fine to try either strategy, or even to not remove any feature at all, and use the cross-validation score to decide which choice leads to the best predictive model.
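To make that comparison concrete, here is a small sketch of how the three options could be scored with `cross_val_score`. Everything about the data is an assumption for illustration: the synthetic dataframe, the column names `education` and `education_num`, and the logistic regression model are hypothetical stand-ins, so the printed scores say nothing about a real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_selector, make_column_transformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n_samples = 300
levels = ["HS-grad", "Bachelors", "Masters"]

# Hypothetical data: "education_num" is a numerical encoding of "education".
education_num = rng.integers(0, len(levels), size=n_samples)
data = pd.DataFrame({
    "education": [levels[i] for i in education_num],
    "education_num": education_num,
    "age": rng.integers(18, 70, size=n_samples),
})
# Synthetic binary target loosely driven by the education level.
target = (education_num + rng.normal(scale=1.0, size=n_samples) > 1).astype(int)


def make_model():
    # String columns are one-hot encoded, numerical columns are scaled.
    preprocessor = make_column_transformer(
        (OneHotEncoder(handle_unknown="ignore"),
         make_column_selector(dtype_exclude="number")),
        (StandardScaler(), make_column_selector(dtype_include="number")),
    )
    return make_pipeline(preprocessor, LogisticRegression(max_iter=1000))


# The three strategies discussed: drop one of the redundant columns, or keep both.
strategies = {
    "drop numerical encoding": data.drop(columns="education_num"),
    "drop string column": data.drop(columns="education"),
    "keep both": data,
}

scores = {}
for name, features in strategies.items():
    cv_scores = cross_val_score(make_model(), features, target, cv=5)
    scores[name] = cv_scores.mean()
    print(f"{name}: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```

When reading such results, it helps to compare the differences between mean scores with their standard deviations across folds: a gap smaller than the fold-to-fold variation is not strong evidence for one strategy over another.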