Problem in logistic regression

abhaygarg2001 · May 31, 2020, 1:00pm

Please explain the data preparation code in logistic regression. I am not able to understand it. The code is at https://ide.codingblocks.com/s/246762 . Please also explain the covariance matrix and the np.random.multivariate_normal function.

prashant_ml · May 31, 2020, 5:34pm

Hey @abhaygarg2001,

We need to generate some artificial data to train our logistic regression model, and for this we use the code you have shared in the link.

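import numpy as np

# mean and covariance of the first cluster
mean01 = np.array([1, 0.5])
cov01 = np.array([[1, 0.1], [0.1, 1.2]])

# mean and covariance of the second cluster
mean02 = np.array([4, 5])
cov02 = np.array([[1.21, 0.1], [0.1, 1.3]])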

In the lines above we are initializing the parameters of the data to be created: a mean and a covariance matrix for each of the two clusters.

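Here, the mean array gives the centre of each cluster: the generated points are scattered around it. For example, with mean01 = [1, 0.5] the first feature is centred on 1 and the second on 0.5. How far the points spread from the centre is set by the covariance matrix, not by the mean: with a variance of 1 (standard deviation 1), roughly 68% of the first feature's values fall between 0 and 2.
The covariance matrix describes that spread. Its diagonal entries are the variances of the individual features (1 and 1.2 in cov01), and its off-diagonal entries are the covariances, which tell how the two features vary together. The small positive value 0.1 means the two features tend to increase together only slightly.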

Note: for N features, the mean array has shape (N,) and the covariance matrix has shape (N, N). Here N = 2. The covariance matrix must also be symmetric and positive semi-definite.
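
If it helps, here is a minimal sketch for inspecting these parameters yourself. It reuses mean01 and cov01 from above; the names std01 and corr01 are only illustrative and are not part of the linked code.

```python
import numpy as np

mean01 = np.array([1, 0.5])
cov01 = np.array([[1, 0.1], [0.1, 1.2]])

# Shapes: (N,) for the mean and (N, N) for the covariance, with N = 2
print(mean01.shape, cov01.shape)  # (2,) (2, 2)

# Diagonal entries are variances; their square roots are standard deviations
std01 = np.sqrt(np.diag(cov01))
print(std01)

# Off-diagonal entry divided by both standard deviations gives the correlation
corr01 = cov01[0, 1] / (std01[0] * std01[1])
print(corr01)

# A valid covariance matrix is symmetric with non-negative eigenvalues
is_symmetric = np.allclose(cov01, cov01.T)
eigenvalues = np.linalg.eigvalsh(cov01)
print(is_symmetric, eigenvalues)
```

The correlation comes out small (below 0.1), which is why each cluster looks like a nearly round blob rather than a tilted ellipse.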

dist_01 = np.random.multivariate_normal(mean01, cov01, 500)
dist_02 = np.random.multivariate_normal(mean02, cov02, 500)

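Finally, these lines draw 500 random samples from each of the two 2-D normal (Gaussian) distributions defined by the means and covariances above. Each result is an array of shape (500, 2), one row per sample, and the two clusters give us data for two classes on which our model can be trained.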
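
You can check this yourself with a short sketch like the one below. The fixed seed, the names sample_mean01, sample_cov01, X and y, and the choice of labels 0 and 1 are my assumptions for illustration; the linked code may combine the two clusters differently.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed only so the run is repeatable

mean01 = np.array([1, 0.5])
cov01 = np.array([[1, 0.1], [0.1, 1.2]])
mean02 = np.array([4, 5])
cov02 = np.array([[1.21, 0.1], [0.1, 1.3]])

# Draw 500 two-dimensional points around each mean
dist_01 = np.random.multivariate_normal(mean01, cov01, 500)
dist_02 = np.random.multivariate_normal(mean02, cov02, 500)
print(dist_01.shape, dist_02.shape)  # (500, 2) (500, 2)

# The sample statistics should be close to the parameters we passed in
sample_mean01 = dist_01.mean(axis=0)
sample_cov01 = np.cov(dist_01, rowvar=False)  # rowvar=False: columns are features
print(sample_mean01)
print(sample_cov01)

# Hypothetical next step: stack both clusters and label them 0 and 1
X = np.vstack((dist_01, dist_02))
y = np.concatenate((np.zeros(500), np.ones(500)))
print(X.shape, y.shape)  # (1000, 2) (1000,)
```

The sample mean and covariance will not match mean01 and cov01 exactly because the points are random, but with 500 samples they should be close.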

I hope this resolves your doubt.
Thank you.