How to Improve Accuracy

Using the Keras framework, I was able to achieve 83% accuracy.

How can I improve the model?

Also, I took 1800 of the 2250 examples for training and used the rest for validation.

Lastly, how should I think about a suitable number of layers for the MLP architecture, and how many neurons to use in each layer?
Is there any solid approach, or do we just train different models and select the best one?

hey @mananaroramail,
it's great to hear that you reached such a good accuracy.

Currently you are getting that score on the particular validation set you carved out of your overall training data.
But, as you may know, if you change that split from 80:20 to 86:14 (or anything else), your model's training changes. This means that although you got an 80%+ score on validation, it doesn't confirm how your model is going to behave on new data.
So, to get a better understanding of the predictive power of our model, we use KFold cross-validation. Search about it a bit; if you can't get it, I will explain.

This is a hyperparameter tuning task: the number of neurons and the number of layers are nothing but hyperparameters. For Keras there are libraries available that help you with this, although you can also write custom loops that train your model many times over a range of such parameters and keep a record of which parameter combination works best for your model.

The tuning libraries run these custom loops too; the way they choose the parameters and optimize them is the actual work they are needed for.
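
For instance, here is a minimal sketch using the keras-tuner library (just one of the available options; the search ranges, layer counts, and the build_model name are illustrative assumptions, not a recommended architecture):

# pip install keras-tuner
import keras_tuner as kt
from tensorflow import keras

def build_model(hp):
    # search over the number of hidden layers and the units per layer
    model = keras.Sequential()
    for i in range(hp.Int("num_layers", 1, 3)):
        model.add(keras.layers.Dense(
            units=hp.Int(f"units_{i}", min_value=32, max_value=256, step=32),
            activation="relu"))
    # num_classes: your label count; loss assumes integer class labels
    model.add(keras.layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

tuner = kt.RandomSearch(build_model, objective="val_accuracy", max_trials=10)
tuner.search(X_train, Y_train, epochs=30, validation_split=0.2)
best_model = tuner.get_best_models(num_models=1)[0]

RandomSearch is the simplest tuner; keras-tuner also offers Hyperband and Bayesian optimization, which pick the next parameter combination more cleverly, and that choosing strategy is exactly the "actual work" mentioned above.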

I hope this helped you.
Thank You :slightly_smiling_face:.

According to what I understood about KFold, suppose k=5:
the data will be split into 5 portions [say a, b, c, d, e].
First, the model will be trained on [a,b,c,d] and [e] will be used for validation.
Then the model will be trained on, say, [a,b,c,e] and [d] will be used for validation, and so on.

Then I wrote this code:

from sklearn.model_selection import KFold
kf = KFold(n_splits=5, shuffle=True, random_state=1)

I expected this line to return the indices:
a,b = kf.split(X_train)

But it gives this error:

ValueError                                Traceback (most recent call last)
<ipython-input> in <module>()
      1 # Using K Fold technique
----> 2 a,b = kf.split(X_train)

ValueError: too many values to unpack (expected 2)

How do I tackle this?

Also, what's the difference between KFold and cross_val_score?

Change this to: a, b = kf.split(X_train, Y_train)

They are essentially the same thing: cross_val_score just runs the whole KFold loop for you, while using KFold directly lets you customize the cross-validation a bit.
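
To make that concrete, here is a tiny sketch with a plain scikit-learn estimator (the toy data and LogisticRegression are just for illustration; to use a Keras model with cross_val_score you would need a scikit-learn wrapper such as scikeras):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = make_classification(n_samples=200, random_state=1)  # toy data
kf = KFold(n_splits=5, shuffle=True, random_state=1)

# cross_val_score runs the whole KFold loop for you and
# returns one validation score per fold:
print(cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=kf))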

Still giving the same error!

One more thing: does this kf.split() method return the indices?

Edit: I tried running the same thing through a for loop and it's working fine, like this:

for a, b in kf.split(X_train):
    print(X_train[a], X_train[b])

  1. Also, can you tell me how to proceed further? Shall I just change the number of epochs and track the average accuracy?
  2. Also, how do I perform the hyperparameter tuning, and which library should I refer to?

hey @mananaroramail,

refer here: https://towardsdatascience.com/hyperparameter-optimization-with-keras-b82e6364ca53

To answer your earlier question: yes, kf.split() returns indices. Each iteration yields one (train_indices, validation_indices) pair per fold, which is why unpacking it directly into just two variables raised the ValueError.

Increasing just the number of epochs isn't a good choice. You need to understand the behavior of your model with a particular number of layers and the nodes in them. It is all linked together, so you need to run various experiments: which layer to place after a particular one, and which not to.
This is all part of experimentation.
Generally, we take the number of nodes in powers of 2, but you can try other values too. Use callbacks to find the right number of epochs for better performance.
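
To tie it together, here is a minimal sketch of a KFold loop around a Keras MLP that tracks the average validation accuracy and uses an EarlyStopping callback (the layer sizes, num_classes, and epoch budget are illustrative assumptions, not a recommendation):

import numpy as np
from sklearn.model_selection import KFold
from tensorflow import keras

def build_model(input_dim, num_classes):
    # a small illustrative MLP; tune the layers/units as discussed above
    model = keras.Sequential([
        keras.layers.Dense(64, activation="relu", input_shape=(input_dim,)),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",  # assumes integer labels
                  metrics=["accuracy"])
    return model

kf = KFold(n_splits=5, shuffle=True, random_state=1)
scores = []
for train_idx, val_idx in kf.split(X_train):  # X_train, Y_train: your data
    model = build_model(X_train.shape[1], num_classes)  # num_classes: assumed known
    # EarlyStopping picks a sensible epoch count instead of guessing it
    early_stop = keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=5, restore_best_weights=True)
    model.fit(X_train[train_idx], Y_train[train_idx],
              validation_data=(X_train[val_idx], Y_train[val_idx]),
              epochs=100, callbacks=[early_stop], verbose=0)
    _, acc = model.evaluate(X_train[val_idx], Y_train[val_idx], verbose=0)
    scores.append(acc)
print("Mean CV accuracy:", np.mean(scores))

If the per-fold accuracies are close to each other, your 83% is probably a fair estimate of the model's predictive power; if they vary a lot, the single 80:20 split may simply have been lucky or unlucky.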

I hope this helps you :slightly_smiling_face:.