Logistic Regression implementation

import numpy as np
import matplotlib.pyplot as plt

def hypothesis(X, theta):
    hx = np.dot(X, theta)
    return sigmoid(hx)

def sigmoid(hx):
    return 1.0 / (1.0 + np.exp(-1 * hx))

def Gradient(X, theta, y):
    m = X.shape[0]
    hx = hypothesis(X, theta)
    grad = np.dot(X.T, (hx - y))
    return grad / m

def error(X, theta, y):
    m, n = X.shape
    hx = hypothesis(X, theta)
    err = 0
    for i in range(m):
        err += y[i] * np.log2(hx[i]) + (1 - y[i]) * np.log2(1 - hx[i])
    return -err / m

def GradientDescent(X, y, iteration):
    alpha = 0.1
    m, n = X.shape
    theta = np.zeros((n, 1))
    err = []
    for i in range(iteration):
        e = error(X, theta, y)
        stepsize = alpha * Gradient(X, theta, y)
        theta = theta - stepsize
        err.append(e)

    return theta, err

m, n = X_train.shape
ones = np.ones((m, 1))
# add the bias column to the training data
X_train = np.hstack((ones, X_train))
iteration = 300
theta, error = GradientDescent(X_train, Y_train, iteration)
# plt.plot(error)
error

This is taking a lot of time to execute.

Please help, where am I wrong?

Also, it is not plotting the error.

Data Preparation

mean_01 = np.array([1,0.5])
cov_01 = np.array([[1,0.1],[0.1,1.2]])

mean_02 = np.array([4,5])
cov_02 = np.array([[1.21,0.1],[0.1,1.3]])


# Normal Distribution
dist_01 = np.random.multivariate_normal(mean_01,cov_01,500)
dist_02 = np.random.multivariate_normal(mean_02,cov_02,500)

print(dist_01.shape)
print(dist_02.shape)

data = np.zeros((1000,3))
print(data.shape)

# set first 500 rows to dist_01
data[:500,:2] = dist_01

# set rows 500 onwards to dist_02
data[500:,:2] = dist_02

# set labels of rows 500 onwards to 1; the first 500 stay 0
data[500:,-1] = 1.0

# now shuffle the data
np.random.shuffle(data)

split = int(0.8*data.shape[0])

X_train = data[:split,:-1]
X_test = data[split:,:-1]

Y_train = data[:split,-1]
Y_test  = data[split:,-1]

print(X_train.shape,X_test.shape)
print(Y_train.shape,Y_test.shape)

Also, tell me how I can share my Jupyter notebook when asking a doubt.

Hey @anandprakash1091971, can you please upload your notebook to Google Drive and share the link here? That will make it easier for me to debug the error.

Thanks :slight_smile:

https://drive.google.com/file/d/11mHlMNGNAUWyeACg3NbeGZUtYO8muyqu/view?usp=sharing

Hey @anandprakash1091971, I am not sure why, but this link is not opening. Please re-upload it and make sure link sharing is on.

Sharing is on.
I checked it using another Gmail ID before sending, and it opens.
Please check again.

Hey @anandprakash1091971, I checked your code in detail. There are no major errors in it. I am attaching a vectorized implementation of logistic regression for your reference. Check the dimensions of the arrays you have created, such as theta; otherwise I don't see any conceptual error. Also try plotting the error.
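
Here is a minimal sketch of what such a vectorized version could look like (the names are illustrative and theta is kept one-dimensional; treat it as a reference sketch, not the exact attachment):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(X, y, theta):
    # mean binary cross-entropy, no explicit loop
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient(X, y, theta):
    # gradient of the loss with respect to theta
    m = X.shape[0]
    return X.T @ (sigmoid(X @ theta) - y) / m

def gradient_descent(X, y, lr=0.1, iterations=300):
    # theta is one-dimensional, matching the shape of y
    theta = np.zeros(X.shape[1])
    errors = []
    for _ in range(iterations):
        errors.append(loss(X, y, theta))
        theta -= lr * gradient(X, y, theta)
    return theta, errors

With the bias column added as in your code, you would call it as theta, errs = gradient_descent(X_train, Y_train) and then plt.plot(errs).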

Hope this helps !
Happy Learning :slight_smile:

Thanks for the help.
It is now working after one change, which is

changing
theta = np.zeros((n, 1))
to
theta = np.zeros((n,))

but I did not understand this:

  1. Are they different?
  2. What is the difference?
  3. How does it affect the answer?

Hey @anandprakash1091971, I would be happy to help! Can you please create a new thread for your doubt?

Thanks and Happy Learning :slight_smile:

Hey @anandprakash1091971,

a = np.ones((10,))
b = np.ones((10,1))
The difference is that a is a one-dimensional array, like this (shown with three elements for brevity):

[1, 1, 1]

And b is a two-dimensional array, like this:

[[1],
 [1],
 [1]]

So np.ones((10,)) creates a one-dimensional array of size 10, whereas np.ones((10,1)) creates a two-dimensional array of shape 10×1. It is analogous to the difference between a single number and a length-1 array: the values are the same, but the second form carries an extra dimension.

In your gradient-descent update you should multiply the learning rate with a one-dimensional theta (the array containing the parameters) rather than a two-dimensional one, so that its shape matches the other arrays it interacts with.
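
A quick way to see the difference is to print the shapes (a small illustrative snippet):

import numpy as np

a = np.ones((10,))    # one-dimensional, shape (10,)
b = np.ones((10, 1))  # two-dimensional column, shape (10, 1)

print(a.shape)        # (10,)
print(b.shape)        # (10, 1)

# Broadcasting treats them differently: subtracting a 1-D array
# from a 2-D column produces a (10, 10) matrix, not a (10,) vector.
print((b - a).shape)  # (10, 10)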

Hope this explanation helps you clear the doubt :slight_smile:


But I wrote the logic assuming theta is an n×1 matrix, so theta=np.zeros((n,1)) should give the right answer.

At which part will it create an error in the code below?

def hypothesis(X, theta):
    hx = np.dot(X, theta)
    return sigmoid(hx)

def sigmoid(hx):
    return 1.0 / (1.0 + np.exp(-1 * hx))

def Gradient(X, theta, y):
    m = X.shape[0]
    hx = hypothesis(X, theta)
    grad = np.dot(X.T, (hx - y))
    return grad / m

def error(X, theta, y):
    m, n = X.shape
    hx = hypothesis(X, theta)
    err = 0
    for i in range(m):
        err += y[i] * np.log2(hx[i]) + (1 - y[i]) * np.log2(1 - hx[i])
    return -err / m

def GradientDescent(X, y, iteration):
    alpha = 0.3
    m, n = X.shape
    theta = np.zeros((n, 1))
    err = []
    for i in range(iteration):
        e = error(X, theta, y)
        stepsize = alpha * Gradient(X, theta, y)
        theta = theta - stepsize
        err.append(e)

    return theta, err
    

What is the final loss value you are getting after using the one-dimensional array?

def error(X, theta, y):
    m, n = X.shape
    hx = hypothesis(X, theta)
    err = 0
    for i in range(m):
        err += y[i] * np.log2(hx[i]) + (1 - y[i]) * np.log2(1 - hx[i])
    return -err / m

If theta is (n+1)×1, then hx will be m×1. We iterate over each value of that m×1 array, so err should come out as a scalar, which is what we want.
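
For example, a quick shape check with made-up sizes (m = 5 rows, n + 1 = 3 columns including the bias):

import numpy as np

X = np.ones((5, 3))        # m x (n+1)
theta = np.zeros((3, 1))   # (n+1) x 1
y = np.zeros((5,))         # labels, shape (m,)

hx = 1.0 / (1.0 + np.exp(-np.dot(X, theta)))
print(hx.shape)            # (5, 1), so hx[i] is a length-1 array

err = 0
for i in range(5):
    err += y[i] * np.log2(hx[i]) + (1 - y[i]) * np.log2(1 - hx[i])
print(-err / 5)            # a single value (wrapped in a length-1 array)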

Am I wrong?

No, you are right there. But what happens in the case of Gradient(X,theta,y)? In that function hx becomes m×1, while your y is still one-dimensional with shape (m,). NumPy does not raise an error when you subtract them; broadcasting silently turns hx - y into an m×m matrix, so the gradient and the updated theta end up with the wrong shapes and every iteration works on huge arrays. That is exactly why the run was so slow. Hence we take our theta to be one-dimensional instead of two-dimensional, so that hx and y have matching shapes.
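
Here is a tiny demonstration of what broadcasting does in that subtraction (illustrative shapes, m = 5):

import numpy as np

m = 5
hx = np.zeros((m, 1))    # what hypothesis() returns when theta has shape (n, 1)
y = np.zeros((m,))       # Y_train sliced from the data array has shape (m,)

print((hx - y).shape)    # (5, 5): broadcasting, not an error

# with a one-dimensional theta, hx also has shape (m,) and the
# subtraction gives the intended (m,) vector
hx1 = np.zeros((m,))
print((hx1 - y).shape)   # (5,)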

Hope this clears your doubt !
Happy Learning :slight_smile:
