random_state=42

Hello everyone!

I don't understand why the number 42 is chosen for random_state when using train_test_split. It seems to be a popular choice, but does it mean anything?

It’s the answer to the Ultimate Question of Life, the Universe, and Everything!

But here is a more serious answer for those who are not familiar with random states (a.k.a. random seeds):

By default, train_test_split shuffles the data and selects a random subset of it for testing. If you do not set random_state, a different random subset is selected each time you run the cell of code.

More generally, setting the random_state parameter makes code that relies on a random number generator give deterministic results. In this sense, “42” is an arbitrary number: any other integer would work equally well. What matters is that you can share the value with other people so that they can reproduce your results.
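To make this concrete, here is a minimal sketch. The toy list `data` and the variable names are made up for illustration; only train_test_split and its test_size and random_state parameters come from scikit-learn.

```python
from sklearn.model_selection import train_test_split

# Toy dataset, made up for illustration only.
data = list(range(10))

# Same seed, same split: both calls return identical train/test subsets.
train_a, test_a = train_test_split(data, test_size=0.3, random_state=42)
train_b, test_b = train_test_split(data, test_size=0.3, random_state=42)
print(test_a == test_b)  # True

# Any other integer is just as valid a seed; it simply pins down
# its own reproducible split (usually a different one).
train_c, test_c = train_test_split(data, test_size=0.3, random_state=0)
print(sorted(train_c + test_c) == data)  # True: every sample lands in exactly one subset
```

If you drop random_state altogether, the split will generally change from one run to the next, which is fine for quick experiments but makes results harder to compare or share.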

@ArturoAmorQ thank you very much for your answer.

You have to read “The Hitchhiker’s Guide to the Galaxy” by Douglas Adams. And all the other related books.