Using (held-out) test data for scoring of model with best parameters

What exactly is the reason for using a held-out data_test to score the best model, even after it was cross-validated on data_train (where data_train is split into 5 different train-test combinations) by cross_val_score?

The short answer is that you are evaluating the generalization performance of the model with the best parameters, which a priori is not the same as the score you obtained on the training set while tuning them. To tune the hyperparameters you use information from the validation sets, so they are no longer unseen data, and estimating generalizability is precisely about scoring on data that played no role in fitting or tuning.
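Here is a minimal sketch of that workflow, assuming a synthetic dataset and a GridSearchCV over an SVC (the variable names data_train/data_test and the parameter grid are illustrative, not the exact notebook code):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)

# Hold out a test set that plays no role in hyperparameter tuning.
data_train, data_test, target_train, target_test = train_test_split(
    X, y, random_state=0
)

# Inner 5-fold cross-validation on data_train selects the best parameters;
# its validation scores were used for that selection, so they are optimistic.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(), param_grid=param_grid, cv=5)
search.fit(data_train, target_train)

# The held-out test set estimates how the refitted best model
# generalizes to truly unseen data.
print("Best CV score (used for tuning):", search.best_score_)
print("Held-out test score:            ", search.score(data_test, target_test))
```

The gap between the two printed scores is exactly the point of the question: the cross-validation score guided the choice of parameters, so only the held-out score is an honest estimate of generalization.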

In module 7 you will find a bit more detail on this matter in the notebook Nested cross-validation.
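For reference, nested cross-validation pushes the same idea one step further: an outer cross-validation loop wraps the tuning itself, so every outer test fold is unseen by the hyperparameter search. A hedged sketch, reusing the same assumed estimator and grid as above:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}

# Inner loop: hyperparameter tuning; outer loop: generalization estimate.
inner_search = GridSearchCV(SVC(), param_grid=param_grid, cv=5)
nested_scores = cross_val_score(inner_search, X, y, cv=5)
print("Nested CV generalization estimate:", nested_scores.mean())
```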