Why create the class loguniform_int and not just use loguniform.rvs(a, b, size=n).astype(int)?

Hi,
What is the benefit of creating the class loguniform_int compared to using an iterable like loguniform.rvs(start, end, size=n).astype(int)?

thanks for your answers

Basically, when we want to provide a distribution, the scikit-learn documentation of RandomizedSearchCV mentions the following:

Distributions must provide a rvs method for sampling (such as those from
scipy.stats.distributions)

Your approach is equivalent to passing a list. In this case, scikit-learn will sample uniformly from that list, so the result will not strictly be a log-uniform distribution. Also, you need to keep n and n_iter in sync and rerun the code whenever one of them changes. It might be easier to just pass a distribution for this purpose.
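
To make the two options concrete, here is a minimal sketch. It assumes loguniform_int is a thin wrapper around scipy.stats.loguniform whose rvs method truncates the draws to integers; the exact class in the course material may differ, and the bounds 2 and 255 and n = 20 are only taken from this thread for illustration.

```python
from scipy.stats import loguniform


class loguniform_int:
    """Integer-valued version of the log-uniform distribution (sketch)."""

    def __init__(self, a, b):
        # Assumption: the wrapper simply holds a frozen scipy distribution.
        self._distribution = loguniform(a, b)

    def rvs(self, *args, **kwargs):
        # Draw floats from the log-uniform distribution, then truncate to int.
        return self._distribution.rvs(*args, **kwargs).astype(int)


# Option 1: a distribution object; the search calls rvs whenever it needs a value.
dist = loguniform_int(2, 255)
one_draw = dist.rvs(random_state=0)

# Option 2: a fixed list of pre-drawn values; n has to be chosen up front.
n = 20
pre_drawn = loguniform.rvs(2, 255, size=n, random_state=0).astype(int)
```

With option 1 there is no n to maintain: changing n_iter in the search needs no other change.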

I did not fully understand your answer.
I understood that you wanted to provide a uniform probability distribution to your parameters and since you need an rvs method you were “overloading” it to get integers.
But what RandomizedSearchCV does is use the rvs method of loguniform to generate a random number from the distribution. So I don’t really see the difference between calling loguniform_int(2,255) n_iter times and using [loguniform.rvs(2,255).astype(int)] or loguniform.rvs(2,255, size=n_iter).astype(int). Isn’t the role of rvs in loguniform to help generate samples from the distribution?

When I tested without your class, I obtained slightly worse results, so I suppose that using your class improves the model, but I am still wondering why.

In general, should we prefer distributions over iterables as parameters for RandomizedSearchCV?

Actually, I made a mistake: I thought that sampling with replacement from the list generated with rvs would change the distribution. It is indeed equivalent.
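
A quick empirical check of this equivalence can be sketched as follows. It assumes the list is resampled uniformly with replacement, and it uses a deliberately large n (a value chosen here, not the n_iter=20 from the thread) so that the comparison is not dominated by noise.

```python
import numpy as np
from scipy.stats import loguniform

rng = np.random.default_rng(0)
n = 100_000  # large on purpose; with n=20 random noise would hide the pattern

# Direct draws from the distribution, truncated to integers.
direct = loguniform.rvs(2, 255, size=n, random_state=rng).astype(int)

# Pre-drawn list, followed by uniform sampling with replacement from that list.
pre_drawn = loguniform.rvs(2, 255, size=n, random_state=rng).astype(int)
resampled = rng.choice(pre_drawn, size=n, replace=True)

# The two medians should be close if both follow the same distribution.
median_direct = np.median(direct)
median_resampled = np.median(resampled)
print(median_direct, median_resampled)
```

With a short list, only the values that happen to be in the list can ever be picked, so any single run can still look different from direct sampling.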

It might only be random fluctuation. You probably picked different parameter combinations, and we got lucky and picked a combination that works better than yours. This is also because n_iter=20 is very small compared to the number of parameters being searched.

In general, yes. RandomizedSearchCV was designed to use distributions rather than iterables. However, it accommodates iterables because some parameters are not numeric.
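
As an illustration of mixing both kinds of inputs, the sketch below uses ParameterSampler from sklearn.model_selection to draw a few candidate settings. The parameter names "learning_rate" and "loss" and their ranges are hypothetical and only serve as an example of one numeric and one non-numeric parameter.

```python
from scipy.stats import loguniform
from sklearn.model_selection import ParameterSampler

param_distributions = {
    # Numeric parameter: a distribution object exposing an rvs method.
    "learning_rate": loguniform(0.001, 10),
    # Non-numeric parameter: a plain list of choices.
    "loss": ["squared_error", "absolute_error"],
}

# Draw 5 candidate settings, as a randomized search with n_iter=5 would.
samples = list(ParameterSampler(param_distributions, n_iter=5, random_state=0))
for params in samples:
    print(params)
```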

:+1:
thanks a lot