Regression_notebook_code

import numpy as np

def batch_gradient(X, Y, theta, batch_size=30):
    # Estimate the gradient on a random mini-batch of `batch_size` examples.
    m = Y.shape[0]
    indices = np.arange(m)
    np.random.shuffle(indices)
    indices = indices[:batch_size]
    grad = np.zeros((2,))
    for i in indices:
        h = hypothesis(X[i], theta)   # hypothesis() is defined earlier in the notebook
        grad[0] += (Y[i] - h)         # intercept component
        grad[1] += (Y[i] - h) * X[i]  # slope component

    return grad * 0.5

What is the use of the batch_gradient function?

It calculates and returns the gradient by which the parameters are to be updated.
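For reference, here is a minimal sketch of how the returned gradient could drive that update (the train function, learning_rate, and num_iters are assumed names for illustration, not from the notebook). Note that grad accumulates (Y[i] - h), the negative of the usual squared-error gradient, so adding it to theta reduces the error:

def train(X, Y, learning_rate=0.01, num_iters=100, batch_size=30):
    # Hypothetical update loop built around batch_gradient (a sketch;
    # assumes numpy as np and batch_gradient/hypothesis from above).
    theta = np.zeros((2,))
    for _ in range(num_iters):
        grad = batch_gradient(X, Y, theta, batch_size)
        # grad is (a scaled) negative loss gradient, so adding it
        # to theta performs descent on the squared error.
        theta = theta + learning_rate * grad / batch_size
    return theta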

There are two separate functions, gradient and batch_gradient. What’s the difference between the two?

def gradient(X, Y, theta):
    # Full (batch) gradient: iterates over all m training examples.
    m = Y.shape[0]
    grad = np.zeros((2,))

    for i in range(m):
        h = hypothesis(X[i], theta)
        grad[0] += (Y[i] - h)         # intercept component
        grad[1] += (Y[i] - h) * X[i]  # slope component

    return grad * 0.5

def batch_gradient(X, Y, theta, batch_size=30):
    # Mini-batch gradient: the same computation, but over a random
    # subset of batch_size examples instead of all m.
    m = Y.shape[0]
    indices = np.arange(m)
    np.random.shuffle(indices)
    indices = indices[:batch_size]
    grad = np.zeros((2,))
    for i in indices:
        h = hypothesis(X[i], theta)
        grad[0] += (Y[i] - h)
        grad[1] += (Y[i] - h) * X[i]

    return grad * 0.5

In the first one, the normal gradient descent function, we iterate over all the rows, i.e. over every training example. In batch gradient descent we iterate over only a part of the training examples instead of the whole data (the size of that part is given by batch_size). They are two variants of gradient descent, and since the squared-error loss of linear regression is convex, both will converge to the same minimum; batch gradient descent is faster per iteration because each update processes fewer examples.
This will be covered in more detail in the class on linear regression.
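One note: hypothesis is not shown in either snippet. From the way grad[0] (an intercept term) and grad[1] (a slope term) are accumulated, it is presumably a simple linear model along these lines (a sketch, assuming scalar inputs):

def hypothesis(x, theta):
    # Linear model: intercept theta[0] plus slope theta[1] times x.
    return theta[0] + theta[1] * x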

OK, thanks, I got that.
I have a few more questions on linear regression.

  1. Why do we normalize data in linear regression? What will happen if we do not normalize our data?
  2. Also, are we standardizing data or normalizing data? I read somewhere:
     - Normalization transforms your data into a range between 0 and 1.
     - Standardization transforms your data such that the resulting distribution has a mean of 0 and a standard deviation of 1.
  3. Do we have to denormalize our results to make the algorithm work for our actual data? Won’t we get wrong results if we test our actual predictions of y using non-normalized values of x?

Answer regarding normalization and standardization:

  1. We don't need to denormalize our data. Instead, we normalize the test input with the same scale:
     the input whose output is to be predicted is normalized first, using the same statistics computed from the training data.
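In code, "normalizing the test input with the same scale" might look like this sketch (fit_scaler and transform are hypothetical names, and mean/std scaling is an assumption for illustration):

import numpy as np

def fit_scaler(X_train):
    # Compute scaling statistics from the training data only.
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    return mu, sigma

def transform(X, mu, sigma):
    # Apply the training-set statistics to any input, train or test.
    return (X - mu) / sigma

# Usage: fit theta on transform(X_train, mu, sigma), then scale any new
# input with the same mu and sigma before predicting its output.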

So we are standardizing data and not normalizing, right? In the video, sir said we are normalizing it.

Standardizing and normalizing are generally used interchangeably, as both are used for feature scaling.
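To make the distinction from the two definitions you quoted concrete, a small sketch (both helper names are hypothetical; x is a NumPy array):

def normalize(x):
    # Min-max normalization: rescales values into the range [0, 1].
    return (x - x.min()) / (x.max() - x.min())

def standardize(x):
    # Standardization (z-score): zero mean, unit standard deviation.
    return (x - x.mean()) / x.std()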

Okay. Last question. What will happen if we do not standardize our data before performing a linear regression on it?

Wait for this topic to be discussed in class; you will get your answer with a complete explanation.
