ERROR LIST VALUES

My error_list is showing "NaN" values.
Here is my error function:

import numpy as np

def error(X, Y, theta):
    err = 0.0
    m = X.shape[0]
    for i in range(m):
        hx = hypothesis(X[i], theta)
        try:
            err += Y[i] * np.log2(float(hx)) + (1 - Y[i]) * np.log2(float(1.0 - hx))
        except:
            err += 10  # fallback I added when the log started complaining
    return -err

Here's the hypothesis function:

def hypothesis(x, theta):
    z = np.dot(x, theta)
    return sigmoid(z)

def sigmoid(z):
    gz = 1.0 / (1.0 + np.exp(-1.0 * z))
    return gz

I added the try/except block because it was giving these warnings:

RuntimeWarning: divide by zero encountered in log2

RuntimeWarning: invalid value encountered in double_scalars

What should I do?

The root of the problem is this line:

err += Y[i]*(np.log2(float(hx))) + (1-Y[i])*(np.log2(float(1.0-hx)))

log(0) is not defined (it tends to negative infinity). Try running np.log2(0.0) and you'll get the same warning.
Whenever float(hx) or float(1.0 - hx) is numerically 0, the corresponding log term becomes -inf and your error ends up as NaN.

An easy fix is to add a very small value, called epsilon in ML terminology (e.g. 1e-10), to the term whose log is being taken.

err += Y[i]*(np.log2( float(hx) + 1e-10 )) + (1-Y[i])*(np.log2( float(1.0-hx) + 1e-10 ))
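Equivalently, np.clip keeps both hx and 1 - hx away from 0 at once. Here is a minimal sketch of the error function rewritten that way (reusing your hypothesis helper; the EPS constant is just an illustrative choice):

import numpy as np

EPS = 1e-10  # keeps the log arguments strictly inside (0, 1)

def error(X, Y, theta):
    err = 0.0
    m = X.shape[0]
    for i in range(m):
        hx = float(hypothesis(X[i], theta))
        # clip so that neither log2(hx) nor log2(1 - hx) ever sees an exact 0
        hx = np.clip(hx, EPS, 1.0 - EPS)
        err += Y[i] * np.log2(hx) + (1 - Y[i]) * np.log2(1.0 - hx)
    return -err

With the clip in place, the try/except (and the arbitrary err += 10) is no longer needed.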

One more thing: when I plotted the error curve, I didn't get a smooth curve in the beginning (it had 2 peaks and then descended), but after that it reached a constant value.
Is that OK?

Yes, that is pretty standard behavior when training ML models. One thing you should note is that the error should always decrease in the long run, even if it shows some volatility in the beginning. If the volatility (peaks) is large and frequent, then maybe the learning rate is too high. Try decreasing it.
Hope this helps!

I tried, but the graph first rises very drastically to a pointed peak, then decreases, then hits a smaller local-maximum peak, and after that it decreases steadily and approaches a constant value.
But the accuracy is 99%.

That is possible, depending on the dataset and the model. Keep an eye on whether both training and validation error are decreasing simultaneously (after some point). Also try feature scaling if you haven't already; a quick sketch follows below.
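For reference, a minimal standardization sketch, assuming the features sit in a numpy array X with one row per example (the function name is just illustrative):

import numpy as np

def standardize(X):
    # column-wise zero mean and unit variance
    mu = X.mean(axis=0)
    sigma = X.std(axis=0) + 1e-10  # guard against constant (zero-variance) columns
    return (X - mu) / sigma

Compute mu and sigma on the training set only and reuse them for the validation/test data.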

If you still think that these peaks are the result of an error and the learning process is not taking place, share your code and the plot with me. I’ll have a look.

Actually, the dataset was created in the video by sir himself, using a multivariate function in numpy.

The problem was learning-rate related. Change the default learning rate to something in the range 1e-4 to 1e-3 and you'll see much smoother plots.

Leave a like if your issue was resolved!

1e-3 worked just fine (only one peak), but with a decreased accuracy of 98%.
For 1e-4, accuracy fell to 93%.
But one thing: how do we decide the learning rate?

The thing is, with a small learning rate we take smaller steps towards the minimum (the solution). Hence, when you decrease the learning rate, increase the number of epochs (the max_epochs param) to get to the same accuracy. You need to train longer to converge when you decrease the learning rate.

Regarding how to get the ideal learning rate, there is no straightforward way. 1e-3 is a pretty good value that works for most datasets. A good rule of thumb: if training is slow, increase it by a factor of 10; if there are peaks (volatility), decrease it by a factor of 10. It's pretty much trial and error (see the sketch below).
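To make the learning-rate/epochs trade-off concrete, here is a rough batch gradient-descent sketch. The parameter names learning_rate and max_epochs follow the discussion above, but the loop itself is an illustration under those assumptions, not the exact course code:

import numpy as np

def train(X, Y, learning_rate=1e-3, max_epochs=300):
    theta = np.zeros(X.shape[1])
    error_list = []
    for epoch in range(max_epochs):
        preds = sigmoid(np.dot(X, theta))            # sigmoid as defined earlier
        grad = np.dot(X.T, preds - Y) / X.shape[0]   # batch gradient of the log loss (up to the log-base constant)
        theta -= learning_rate * grad                # smaller rate -> smaller steps per epoch
        error_list.append(error(X, Y, theta))
    return theta, error_list

With a smaller learning_rate and the same max_epochs, theta simply travels less far, which matches the accuracy drop you saw at 1e-4.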

Complex techniques to approximate the right learning rate do exist, but they are out-of-scope for now.


OK, got the point.
Thanks a lot!


I hope I've cleared your doubt. Please rate your experience here.
Your feedback is very important. It helps us improve our platform and hence provide you
the learning experience you deserve.

On the off chance you still have some questions or don't find the answers satisfactory, you may reopen
the doubt.