Hello everyone!
The choice of the number 42 when using train_test_split is not clear to me. It seems that 42 is a popular choice, but is it meaningful?
But as a more serious answer, for those who are not familiar with random states (aka random seeds):
By default, `train_test_split` selects a random subset of the data for testing. Without setting `random_state`, a different random subset will be selected each time you run the cell of code.
Setting the `random_state` parameter to a fixed integer makes the split deterministic: the same seed always produces the same train and test subsets. In this sense, the number “42” is an arbitrary number that you can share with other people so that they can reproduce your results.
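To make this concrete, here is a minimal sketch, assuming scikit-learn and NumPy are installed. The toy arrays `X` and `y` are made up for illustration. It shows that repeating the split with the same `random_state` gives the same test set, and that another fixed integer (7 here, chosen arbitrarily) is just as reproducible as 42.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data, made up for illustration: 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Splitting twice with the same seed yields the same test subset.
_, _, _, y_test_a = train_test_split(X, y, test_size=0.2, random_state=42)
_, _, _, y_test_b = train_test_split(X, y, test_size=0.2, random_state=42)
same_seed_matches = np.array_equal(y_test_a, y_test_b)

# Any other fixed integer is equally reproducible; 42 is not special.
_, _, _, y_test_c = train_test_split(X, y, test_size=0.2, random_state=7)
_, _, _, y_test_d = train_test_split(X, y, test_size=0.2, random_state=7)
other_seed_matches = np.array_equal(y_test_c, y_test_d)

print(same_seed_matches, other_seed_matches)  # True True
```

What matters is that the seed is fixed and shared along with the code, not which integer it is.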
You have to read “The Hitchhiker’s Guide to the Galaxy” by Douglas Adams. And all the other related books.