How does splitting on the specific feature that maximizes Information Gain help us make predictions?
Decision Tree And Random Forest
Hey @Bhawna,
Choosing the feature with maximum information gain means that, at a particular depth (node) of the tree, we split on the feature that tells the model the most about the target variable for the data reaching that node.
You could split on any feature you like, or even pick one at random, but a randomly chosen feature may have little relation to the target variable, so the resulting splits would separate the classes poorly and the tree would perform badly.
That is the only reason we pick features this way: each split should let the model learn as much as possible about the data, without overfitting to it.
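To make this concrete, here is a minimal sketch (with a small hypothetical label column) showing why an informative split scores higher: information gain is the parent node's entropy minus the weighted entropy of the child groups, so a split that cleanly separates the classes gains almost all of the parent's entropy, while an unrelated split gains almost nothing.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(parent, groups):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    n = len(parent)
    weighted = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(parent) - weighted

# Hypothetical target column: 5 "yes" and 3 "no" labels.
y = ["yes"] * 5 + ["no"] * 3

# A feature that separates the classes perfectly -> pure children.
good_split = [["yes"] * 5, ["no"] * 3]
# A feature unrelated to the target -> children mirror the parent mix.
bad_split = [["yes", "yes", "no"], ["yes", "yes", "yes", "no", "no"]]

print(f"good split gain: {information_gain(y, good_split):.3f}")
print(f"bad split gain:  {information_gain(y, bad_split):.3f}")
```

The good split recovers essentially the full parent entropy (about 0.954 bits here), while the bad split gains nearly zero, which is exactly why the algorithm prefers the former at each node.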
I hope this helps you understand it.
Thank You and Happy Learning