Please explain the data preparation code in logistic regression. I am not able to understand it. The code is - https://ide.codingblocks.com/s/246762 . Please also explain the covariance matrix and the np.random.multivariate_normal function.
Problem in logistic regression
hey @abhaygarg2001 ,
We need to generate some artificial data to train our logistic regression model, and that is what the code at the link you provided does.
mean01= np.array([1,0.5])
cov01=np.array([[1,0.1],[0.1,1.2]])
mean02=np.array([4,5])
cov02=np.array([[1.21,0.1],[0.1,1.3]])
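In the lines above, we initialize the parameters of the data we are about to create: a mean and a covariance matrix for each of the two groups of points.
Here, the mean array gives the center of each group, i.e. the point around which the values are generated. For mean01, the first feature is centered at 1 and the second at 0.5. How far the values spread from that center is set by the covariance matrix: with a variance of 1 (standard deviation 1), about 95% of the first feature's values fall roughly between -1 and 3.
The covariance matrix describes how the variables vary. Its diagonal entries are the variances of the individual features (how spread out each one is), and the off-diagonal entries are the covariances, which tell you how two features vary together. In cov01, the variances are 1 and 1.2, and the small positive covariance 0.1 means the two features tend to increase together only slightly.
Note: for N features, the mean array has N entries and the covariance matrix is N x N. Here N = 2. The covariance matrix must also be symmetric and positive semi-definite.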
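To make the covariance matrix more concrete, here is a small sketch that reads its entries back out. It reuses mean01 and cov01 from your code; the other variable names (var_x1, corr, is_psd, etc.) are just my own choices for illustration.

```python
import numpy as np

mean01 = np.array([1, 0.5])
cov01 = np.array([[1, 0.1], [0.1, 1.2]])

# Diagonal entries: variance of each feature on its own.
var_x1, var_x2 = np.diag(cov01)

# Off-diagonal entry: covariance between feature 1 and feature 2.
cov_x1_x2 = cov01[0, 1]

# Correlation rescales the covariance to the range [-1, 1].
corr = cov_x1_x2 / np.sqrt(var_x1 * var_x2)

# A valid covariance matrix is symmetric with non-negative eigenvalues.
is_symmetric = np.allclose(cov01, cov01.T)
eigenvalues = np.linalg.eigvalsh(cov01)
is_psd = bool(np.all(eigenvalues >= 0))

print("shapes:", mean01.shape, cov01.shape)
print("variances:", var_x1, var_x2)
print("correlation:", round(float(corr), 3))
print("symmetric:", is_symmetric, "positive semi-definite:", is_psd)
```

The correlation comes out small (about 0.09), which is why the two features of the first group look almost independent when you plot them.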
dist_01=np.random.multivariate_normal(mean01,cov01,500)
dist_02=np.random.multivariate_normal(mean02,cov02,500)
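Finally, these two lines draw the data. np.random.multivariate_normal(mean, cov, size) takes a mean array and a covariance matrix and returns size random samples from the corresponding multivariate normal (Gaussian) distribution. With size 500, dist_01 and dist_02 each have shape (500, 2): 500 points with 2 features, scattered around mean01 and mean02 respectively. These points are the data on which our model is trained.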
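If you want to check this yourself, the sketch below draws the samples and compares their empirical mean and covariance with the parameters we passed in. The seed, the names X and y, and the choice of labels 0 and 1 are my own assumptions for illustration; the code at your link may combine the two groups differently.

```python
import numpy as np

np.random.seed(0)  # assumed seed, only so the run is repeatable

mean01 = np.array([1, 0.5])
cov01 = np.array([[1, 0.1], [0.1, 1.2]])
mean02 = np.array([4, 5])
cov02 = np.array([[1.21, 0.1], [0.1, 1.3]])

# 500 two-dimensional points per group.
dist_01 = np.random.multivariate_normal(mean01, cov01, 500)
dist_02 = np.random.multivariate_normal(mean02, cov02, 500)
print(dist_01.shape, dist_02.shape)

# The sample statistics should be close to the parameters, not equal to them.
print("empirical mean of dist_01:", dist_01.mean(axis=0))
print("empirical covariance of dist_01:")
print(np.cov(dist_01, rowvar=False))

# One possible way to build a labelled dataset (assumption):
# group 1 gets label 0, group 2 gets label 1.
X = np.vstack((dist_01, dist_02))
y = np.hstack((np.zeros(500), np.ones(500)))
print(X.shape, y.shape)
```

Because the two means, [1, 0.5] and [4, 5], are far apart compared with the spread, the two groups overlap very little, which makes them a convenient toy dataset for logistic regression.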
I hope this resolves your doubt.
Thank You. 