Applying image augmentation on training data

raunaqsingh10 · April 1, 2020, 6:21am

Suppose I have 800 images in my training data and I apply image augmentation. Augmentation can, in principle, produce an unlimited number of distinct images.

How many images will I now be training on?

Does the number of images I train on depend on steps per epoch?

S18CRX0120 · April 1, 2020, 7:55am

Hey @raunaqsingh10, you should train your model until it converges, that is, for as long as the validation loss keeps decreasing. Keep track of the validation loss so that you can stop before the model overfits. Your training set will still contain 800 images, but every time your train generator is called, it returns a batch in which each image has been slightly distorted (as specified by you in the function), and nothing else. Also, the number of images you train on does not depend on steps per epoch: if steps per epoch is too small to cover the whole dataset, the next epoch picks up from the point where the previous epoch left off, so the samples not yet seen will be seen in the next epoch.
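As a rough illustration of "train while the validation loss keeps decreasing", here is a minimal early-stopping sketch in plain Python. The function name, the `patience` parameter, and the loss values are made up for illustration; in Keras the `EarlyStopping` callback (monitoring `val_loss`) serves the same purpose.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 0-based epoch at which training would stop, or None.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical validation losses, one per epoch (invented numbers).
losses = [0.90, 0.70, 0.60, 0.58, 0.59, 0.61, 0.60, 0.62]
stop_at = early_stopping_epoch(losses, patience=3)
print("stop after epoch index:", stop_at)
```

The loss bottoms out at epoch index 3 and then fails to improve three times in a row, so this sketch stops at epoch index 6.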

Hope this resolved your doubt.
Please mark the doubt as resolved in the "My Doubts" section.

raunaqsingh10 · April 1, 2020, 8:59am

I could not understand the last part. If I take my batch size to be 80 and steps per epoch = 10, then for every epoch I’ll have 800 images to train on, but if I set steps per epoch = 15, then for every epoch I’ll have 1200 images to train on.

S18CRX0120 · April 1, 2020, 11:05am

Hey @raunaqsingh10, assuming that the training data comprises 800 images:

Case 1: “If I take batch size to be 80 and steps per epoch = 10, then for every epoch I’ll have 800 images to train on.” Yes, this statement is correct; no explanation needed.

Case 2: “If I set steps per epoch = 15, then for every epoch I’ll have 1200 images to train on.” This is wrong: you can’t specify values such that steps_per_epoch * batch_size > training examples.

Case 3: If steps_per_epoch = 5, then a single epoch uses 400 images. This doesn’t mean the model only ever trains on the first 400 images; in the next epoch the remaining 400 images will be used.
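The arithmetic behind these cases can be checked with a toy model of a generator that hands out image indices batch by batch and never stops. This is only a sketch: `batch_index_stream` and `run_epochs` are hypothetical helpers, there is no shuffling, and every batch is full, so it is not a description of how Keras implements its generators.

```python
from itertools import islice


def batch_index_stream(num_images, batch_size):
    """Yield batches of image indices forever, wrapping around the dataset."""
    position = 0
    while True:
        yield [(position + i) % num_images for i in range(batch_size)]
        position = (position + batch_size) % num_images


def run_epochs(num_images, batch_size, steps_per_epoch, epochs):
    """Return, for each epoch, the list of indices drawn from one shared stream."""
    stream = batch_index_stream(num_images, batch_size)
    per_epoch = []
    for _ in range(epochs):
        seen = []
        for batch in islice(stream, steps_per_epoch):
            seen.extend(batch)
        per_epoch.append(seen)
    return per_epoch


# steps_per_epoch = 5: each epoch draws 5 * 80 = 400 images.
half = run_epochs(800, 80, steps_per_epoch=5, epochs=2)
print(min(half[0]), max(half[0]))  # indices covered in epoch 1
print(min(half[1]), max(half[1]))  # indices covered in epoch 2

# steps_per_epoch = 15: 15 * 80 = 1200 draws in one epoch.
over = run_epochs(800, 80, steps_per_epoch=15, epochs=1)
print(len(over[0]), len(set(over[0])))
```

In this toy model the second epoch continues where the first one stopped, and asking for 1200 draws simply wraps around, so some of the 800 source images are drawn twice within the epoch. Whether a given Keras version accepts such a setting is a separate question that this sketch does not answer.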

Hope this resolved your doubt.
Please mark the doubt as resolved in the "My Doubts" section.

raunaqsingh10 · April 1, 2020, 12:50pm

When I am using ImageDataGenerator, what exactly is happening to my 800 images? I’m getting only 800 transformed images, right?

You wrote about case 2: “If I set steps per epoch = 15, then for every epoch I’ll have 1200 images to train on. This is wrong: you can’t specify values such that steps_per_epoch * batch_size > training examples.”

I’ve tried it, and we can actually set steps per epoch such that steps_per_epoch * batch_size > training examples. What I want to understand now is: will data be repeated, or will new transformed images, apart from the 800, be generated?

S18CRX0120 · April 1, 2020, 2:17pm

Hey @raunaqsingh10, image_train_generator actually returns the same images from the training data, but each time it returns an image it applies one or more of the distortions or other augmentation techniques you specified. That is, it only has those 800 images, but whenever we ask for, say, n images, it takes n images sequentially from the data, applies augmentation to each image individually, and then returns them.
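A small simulation may make this concrete. It does not touch real pixels: each "augmented image" is represented only by the index of its source image plus the random parameters that would be applied to it. The helper name, the rotation range, and the flip probability are all assumptions chosen for illustration.

```python
import random


def augmented_batch(num_images, start, n, rng):
    """Return n (source_index, rotation_degrees, flipped) tuples.

    Source images are taken sequentially starting at `start`, wrapping
    around if needed; fresh random parameters are drawn for every image.
    """
    batch = []
    for i in range(n):
        source_index = (start + i) % num_images
        angle = rng.uniform(-20.0, 20.0)  # hypothetical rotation range
        flipped = rng.random() < 0.5      # hypothetical horizontal flip
        batch.append((source_index, angle, flipped))
    return batch


rng = random.Random(0)
first_pass = augmented_batch(800, 0, 800, rng)
second_pass = augmented_batch(800, 0, 800, rng)

same_sources = [b[0] for b in first_pass] == [b[0] for b in second_pass]
same_params = first_pass == second_pass
print(same_sources, same_params)
```

Both passes draw from the same 800 source indices, but the random parameters differ, so the model sees a different variant of each image on every pass without the dataset itself growing.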

S18CRX0120 · April 5, 2020, 8:00pm

I hope I’ve cleared your doubt. Please rate your experience; your feedback is very important. It helps us improve our platform and provide you with the learning experience you deserve.

If you still have questions or do not find the answers satisfactory, you may reopen the doubt.

D17LP0007 · April 30, 2020, 9:25pm

Hi @S18CRX0120! I have the same doubt.
In case 2 you mention that steps_per_epoch * batch_size > training examples should not hold, but in the video “Training Using fit_generator, Visualizing results”, validation_steps is 4 and batch_size = 32, which gives a product of 128 > 84 (the number of images in the validation set), and the model still runs. How is this possible?
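For reference, the numbers in this question can be worked through under one assumption that the thread does not confirm: that the validation iterator emits a short final batch at the end of each pass over the data and then restarts from the beginning. The helper `batch_sizes` is hypothetical.

```python
def batch_sizes(num_images, batch_size, steps):
    """Sizes of the batches drawn over `steps` steps, assuming a short
    final batch at the end of each pass followed by a restart."""
    sizes = []
    position = 0
    for _ in range(steps):
        size = min(batch_size, num_images - position)
        sizes.append(size)
        position += size
        if position >= num_images:
            position = 0  # start a new pass over the data
    return sizes


sizes = batch_sizes(84, 32, 4)
print(sizes, sum(sizes))
```

Under that assumption, three steps already cover all 84 validation images (32 + 32 + 20), and the fourth step starts a second pass, so some images would be evaluated twice rather than the run failing.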

D17LP0007 · May 1, 2020, 1:59pm

Please reply, @S18CRX0120.