Sir my doubt is as we have split our data into left and right
So we have to use data_left/right.survived.mean() ? Sir, please elaborate this why we are doing x_train.survived.mean()
as this gives us the mean of survived data that is in x_train but we need a mean of survived data which is in data_ left and data_right.
hey @Royal_Yashasvi,
You might know about binary tree. So , decision tree is a binary tree and hence we need to split the data in such a way that we have two such nodes.
Hence , we take mean for that purpose , its not necessary that we need to use mean only , we can define any other formulae too , but we just need to have some method to divide.