AlexNet architecture

I didn’t understand DATA AUGMENTATION in the given paper.
Please give me a brief idea of how it is done.

Data Augmentation:

Data augmentation is usually done when we have less data than required to train and generalize a model. In order to make the most of our few training examples, we “augment” them via a number of random transformations, so that our model never sees exactly the same picture twice. This helps prevent overfitting and helps the model generalize better.

Coming to your question, two types of data augmentation techniques were used in AlexNet:

  1. The first form of data augmentation consists of image translations and horizontal reflections. This can be achieved using keras.preprocessing.image.ImageDataGenerator; see the first sketch after this list.

    This class can be used to perform random image shifts as well as horizontal and vertical flips. I’d posted the code in the Coding Blocks IDE for the same.

  2. The second form of data augmentation performed was random changes to the light level or brightness of the images.
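
Here is a minimal sketch of what the first form could look like, assuming TensorFlow 2.x’s bundled Keras (the 12.5% shift range, i.e. 32/256, and the dummy random image are illustrative choices, not values from the paper):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Random translations plus horizontal flips: the first form of
# augmentation described above.
datagen = ImageDataGenerator(
    width_shift_range=0.125,   # shift by up to 12.5% of the width (32/256)
    height_shift_range=0.125,  # shift by up to 12.5% of the height
    horizontal_flip=True,      # randomly mirror left to right
)

# A dummy batch containing one 256x256 RGB image of random pixels.
x = np.random.randint(0, 256, size=(1, 256, 256, 3)).astype("float32")

# Every batch drawn from the generator is a differently transformed copy,
# so the model never sees exactly the same picture twice.
for i, batch in enumerate(datagen.flow(x, batch_size=1)):
    print("augmented batch", i, batch.shape)
    if i >= 3:
        break
```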

The second sketch below shows the random-brightness augmentation (the original code and its output image were posted in the IDE).
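
A minimal sketch of the second form, using tf.image.random_brightness (one common way to do it; the max_delta of 0.2 is an arbitrary illustrative value):

```python
import numpy as np
import tensorflow as tf

# One 256x256 RGB image with random pixel values scaled to [0, 1].
image = tf.convert_to_tensor(np.random.rand(256, 256, 3).astype("float32"))

# Each call shifts the brightness by a random amount in [-0.2, +0.2],
# producing a differently lit copy of the same image.
for i in range(4):
    variant = tf.image.random_brightness(image, max_delta=0.2)
    print("variant", i, "mean pixel value:", float(tf.reduce_mean(variant)))
```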

I hope this makes everything clear about data augmentation in AlexNet.

In the paper it is written that the "training set size is increased by a factor of 2048". How?

They are actually training on 1.2 million * 2048 training images.

For each training image of size 256x256, if you extract patches of size 224x224, you can get up to 1024 such patches from the image ((256-224) * (256-224) = 32 * 32). For each such patch you also take a horizontal reflection, giving 2048 patches in total from a single image.

I understood that the images (in the 2nd type) are formed by changing the brightness, but what is meant by horizontal reflection and patches here? I didn’t get that.

Here, in the first form of augmentation, you are going to create patches of size (224 x 224) from each image.
A patch is basically a crop of size (224 x 224) from the original image of size (256 x 256).

From a single image of size (256 x 256) you can create:
256 - 224 = 32 positions along the x axis
256 - 224 = 32 positions along the y axis

32 * 32 = 1024 patches.

For each patch you will also create a horizontally flipped copy. Therefore, for 1 image there will be 2 * 1024 = 2048 patches.
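
A tiny NumPy sketch of this counting (a dummy random image stands in for a real training example):

```python
import numpy as np

# Dummy 256x256 RGB "training image".
img = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)

crop = 224
offsets = 256 - crop  # 32, matching the counting above

patches = []
for y in range(offsets):          # 32 positions along the y axis
    for x in range(offsets):      # 32 positions along the x axis
        patch = img[y:y + crop, x:x + crop]   # the (224 x 224) crop
        patches.append(patch)
        patches.append(patch[:, ::-1])        # its horizontal reflection

print(len(patches))  # 32 * 32 * 2 = 2048
```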

Conclusion:

  1. A patch is a crop from the original image.
  2. For each unique image there will be 2048 patches.
  3. Horizontal reflection means flipping an image left to right.

If a patch is a crop of the image, then it should be of size (224, 224), so shouldn’t the number of such patches be (256 - 224 + 1) * (256 - 224 + 1)?

Yes, you are absolutely right. In fact, in the paper they said they used random crops, so leaving out one or two crops won’t affect the dataset much.
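
For completeness, a quick check of the exhaustive count from the question:

```python
# Counting every valid 224x224 crop position in a 256x256 image.
crops = (256 - 224 + 1) ** 2   # 33 * 33 = 1089 positions
with_flips = 2 * crops         # 2178 counting horizontal reflections
print(crops, with_flips)       # vs. the 32 * 32 * 2 = 2048 counted above
```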