Are we referring to the bias term as x0, whose value is always 1? If so, why do I have to calculate its gradient?
I mean, the implementation of the linear regression and logistic regression algorithms will be exactly the same, except that the hypothesis function and the error function change in the case of logistic regression.
So why do I have to calculate the gradient separately for the bias term? We can simply add a column of ones to our X dataset and find our optimal parameters theta.
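To make the question concrete, here is a minimal sketch in plain Python comparing the two approaches for linear regression with a mean squared error cost. The function names (`grad_separate`, `grad_augmented`) and the toy data are made up for illustration, and the cost is assumed to be (1/2m) * sum of squared errors. With a sigmoid hypothesis and cross-entropy cost the gradient has the same form, so the same comparison should carry over to logistic regression.

```python
def grad_separate(X, y, w, b):
    """Gradient of the MSE cost with the bias b kept as its own parameter."""
    m = len(X)
    # Prediction error for each example: h(x) - y, where h(x) = w.x + b
    errors = [sum(wj * xj for wj, xj in zip(w, row)) + b - yi
              for row, yi in zip(X, y)]
    grad_w = [sum(e * row[j] for e, row in zip(errors, X)) / m
              for j in range(len(w))]
    # The bias gradient is the mean error, i.e. the error times x0 = 1
    grad_b = sum(errors) / m
    return grad_w, grad_b


def grad_augmented(X, y, theta):
    """Same gradient, but with a column of ones prepended to X (x0 = 1)."""
    X_aug = [[1.0] + list(row) for row in X]
    m = len(X_aug)
    errors = [sum(t * x for t, x in zip(theta, row)) - yi
              for row, yi in zip(X_aug, y)]
    # One formula covers every parameter, including theta[0] (the bias)
    return [sum(e * row[j] for e, row in zip(errors, X_aug)) / m
            for j in range(len(theta))]


# Toy data, chosen arbitrarily for illustration
X = [[1.0, 2.0], [2.0, 0.5], [3.0, 1.5]]
y = [3.0, 2.0, 5.0]
w, b = [0.5, -0.25], 0.1

grad_w, grad_b = grad_separate(X, y, w, b)
grad_theta = grad_augmented(X, y, [b] + w)

print(grad_b, grad_w)
print(grad_theta)
```

Up to floating-point rounding, `grad_theta[0]` matches `grad_b` and `grad_theta[1:]` matches `grad_w`, so the two formulations compute the same thing. Keeping the bias separate looks like an implementation choice rather than a mathematical necessity, for example when the bias is handled differently from the other weights.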