In the image captioning model, how has sir calculated number_pics_per_batch and hence the steps, and why does changing this number of steps change the running time so much? It took hardly 1 minute when number_pics_per_batch was taken very large.
How has sir calculated the steps?
Hey @abubakar_nsit, number_pics_per_batch is a hyperparameter here. The values of hyperparameters have to be chosen experimentally, which is why sir has taken number_pics_per_batch as 3. Now, we want all of our training data to go through the model in one epoch, so the number of steps it takes to pass the whole dataset through the model is given by:
steps = total size of data / number_pics_per_batch
For example, if there are 90 images in our dataset, it would take 30 steps of batch size 3 to pass the whole data.
Note: Batch size and number_pics_per_batch are the same thing
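Here is a minimal sketch of that calculation, using the hypothetical 90-image example above (the variable names are mine, not necessarily the ones in sir's notebook):

```python
import math

# Hypothetical numbers, just to illustrate the formula above
total_images = 90          # total size of the training data
number_pics_per_batch = 3  # batch size (a hyperparameter)

# ceil() so that a leftover partial batch still counts as one step
steps_per_epoch = math.ceil(total_images / number_pics_per_batch)
print(steps_per_epoch)  # -> 30
```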
I hope this helps you understand!
Happy Learning
Sir, then the time taken by all the steps in one epoch should come out almost the same, na? Whether it is 3 steps per epoch or 2000 steps per epoch, the system has to process all the images in both cases. So why is there such a huge difference between the two cases?
This is because passing 2000 images into the model at once is faster than passing 3 images into the model again and again.
For example, if we have a total of 6000 images in our training set and the batch size is 3, it would take 2000 steps to pass the whole dataset, but if the batch size is 2000, it would take just 3 steps. So of course an epoch of 3 steps will run faster than an epoch of 2000 steps, right?
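A tiny sketch of that arithmetic, using the same hypothetical 6000-image dataset (illustrative numbers only, not the course code):

```python
# Same 6000-image example, comparing the two batch sizes
total_images = 6000

for batch_size in (3, 2000):
    steps = total_images // batch_size
    print(f"batch_size={batch_size} -> {steps} steps per epoch")

# batch_size=3    -> 2000 steps per epoch
# batch_size=2000 -> 3 steps per epoch
```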
I hope this helps you understand!
Happy Learning
Sir, the overall computation performed by the PC will still remain almost the same, na? In both cases the PC has to process 6000 images of size 2048, and for each image it has to create a 35x50x5 data matrix and then perform further computations on these. And if decreasing the steps per epoch makes this computation faster by multiple times without lowering performance, why hasn't sir taken steps = 1 to minimize the model training time?
Hey @abubakar_nsit, when the batch size is 2000, the model forward propagates and then back propagates only 3 times in a single epoch, but when the batch size is 3, the model has to forward and back propagate 2000 times, and the weights are also updated 2000 times in a single epoch. That is why there is a huge difference in the time taken to complete the epochs!
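To make the count of updates concrete, here is a rough, hypothetical skeleton of one epoch (not the actual training code from the course), which simply counts how often the weights would be updated:

```python
# Rough skeleton of one epoch, counting the work done per batch.
# The forward/backward/update steps are placeholders -- the point is
# just how many times the optimizer updates the weights per epoch.
def run_one_epoch(total_images, batch_size):
    steps = total_images // batch_size
    updates = 0
    for step in range(steps):
        # forward pass on one batch of `batch_size` images
        # backward pass (gradients) on that same batch
        # optimizer updates the weights once per batch
        updates += 1
    return steps, updates

print(run_one_epoch(6000, 3))     # (2000, 2000) -> 2000 weight updates
print(run_one_epoch(6000, 2000))  # (3, 3)       -> only 3 weight updates
```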
I hope you understand!
Happy Learning