Pandas MNIST Dataset / image stored as 28x28?

At 10:26, why do we use random_state=5?

What is the need for this?

Also, please explain how the image is 28x28.

I think an image is stored as a 3x3 matrix according to RGB values. Please explain in depth.

Hey @anandprakash1091971, let me explain in detail:

random_state : int or RandomState instance, default=None
Controls the shuffling applied to the data before applying the split. Pass an int for reproducible output across multiple function calls.

random_state is a parameter you can pass when calling the train_test_split function. If you don’t give this argument, its default value is None, which means that every time you run that cell of your notebook, the data will be shuffled differently. But if you pass random_state = any integer, then every time you run that cell, the data will be shuffled in exactly the same way, so you get the same train/test split each time. The specific number (5 here) has no special meaning; any fixed integer works. So to keep results consistent and reproducible, we usually pass some fixed integer as the random_state. I hope this is clear … cool?
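To see this for yourself, here is a small sketch. The toy arrays `X` and `y` are made up for illustration (they are not the MNIST data from the video); the point is only that two calls with the same random_state give the same split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data standing in for the real dataset (assumption): 10 samples, 2 features.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Two separate calls with the same random_state shuffle the data identically.
X_train_a, X_test_a, y_train_a, y_test_a = train_test_split(
    X, y, test_size=0.2, random_state=5
)
X_train_b, X_test_b, y_train_b, y_test_b = train_test_split(
    X, y, test_size=0.2, random_state=5
)

# True when both calls produced exactly the same test rows and labels.
same_split = np.array_equal(X_test_a, X_test_b) and np.array_equal(y_test_a, y_test_b)
print(same_split)
```

If you remove random_state from both calls, the two splits will usually differ from each other and from run to run.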

Now coming to your other doubt. MNIST is a grayscale dataset, right? This means each image has only 1 channel, not separate R, G, B channels as you said, and each pixel is a single intensity value. When you load the dataset you get a shape of (m, 784), where m is the total number of training examples/images and each row holds one image flattened into 784 pixel values. Now if you observe, 784 is the square of 28 (28 x 28 = 784), so we reshape each row into a 28x28 square so that we can display it as a 2D image. Hence you get a dataset of shape (m, 28, 28), where m is again the total number of examples and 28x28 are the dimensions of a single grayscale image. I hope this is clear too :slight_smile:
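The reshape step can be sketched like this. The array `X` below is a placeholder filled with zeros (an assumption, since I don't have your loaded data here); with the real data you would use the pixel columns from your DataFrame instead.

```python
import numpy as np

# Placeholder for the loaded pixel data (assumption): m = 5 images, 784 values each.
m = 5
X = np.zeros((m, 784), dtype=np.uint8)

# Reshape every flattened row into a 28x28 grid; -1 lets NumPy infer m.
images = X.reshape(-1, 28, 28)
print(images.shape)

# A single image is just one row reshaped to 28x28.
single = X[0].reshape(28, 28)
print(single.shape)
```

To display one image you could then call something like `plt.imshow(single, cmap="gray")` from matplotlib, where the single channel is drawn as shades of gray.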

All the best :slight_smile:
Happy coding :slight_smile:
