Facing a problem in the housing price prediction challenge

abhaygarg2001 · September 8, 2020, 2:09pm

In this challenge, the dataset has 81 columns, and some of them are non-numeric. My initial plan was to label encode all the non-numeric columns and fill the NaN values with the mean of the column. I would then apply feature selection to find the best features and use an MLP to predict the prices (since this challenge is in the MLPs section). However, I am not sure whether label encoding is a good idea for some columns, such as the street- and address-related ones. Please tell me whether this is a good approach or whether I should adopt another one.
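
For reference, the plan above could be sketched roughly as follows, assuming pandas and scikit-learn. The column names and toy values are made up for illustration and are not the actual challenge data. One-hot encoding is swapped in for label encoding on the text column, and the target is standardized before training; both are assumptions of this sketch, not part of the original plan.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer, TransformedTargetRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.impute import SimpleImputer
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy stand-in for the challenge data; names and values are hypothetical.
df = pd.DataFrame({
    "LotArea": [8450, 9600, np.nan, 9550, 14260, 14115, 10084, 10382],
    "YearBuilt": [2003, 1976, 2001, 1915, 2000, 1993, 2004, 1973],
    "Street": ["Pave", "Pave", "Grvl", "Pave", np.nan, "Pave", "Grvl", "Pave"],
    "SalePrice": [208500, 181500, 223500, 140000, 250000, 143000, 307000, 200000],
})
X = df.drop(columns="SalePrice")
y = df["SalePrice"]

numeric_cols = X.select_dtypes(include="number").columns.tolist()
categorical_cols = X.select_dtypes(exclude="number").columns.tolist()

# Numeric columns: fill NaN with the column mean, then scale for the MLP.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])

# Categorical columns: mark NaN as its own category, then one-hot encode.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="constant", fill_value="Missing")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer(
    [("num", numeric_steps, numeric_cols), ("cat", categorical_steps, categorical_cols)],
    sparse_threshold=0,  # always return a dense array
)

pipeline = Pipeline([
    ("preprocess", preprocess),
    ("select", SelectKBest(score_func=f_regression, k=3)),  # keep the 3 best features
    ("mlp", MLPRegressor(hidden_layer_sizes=(16,), max_iter=500, random_state=0)),
])

# Standardize the target so the network is not trained on raw prices.
model = TransformedTargetRegressor(regressor=pipeline, transformer=StandardScaler())
model.fit(X, y)
preds = model.predict(X)
```

On the real data, `k`, the hidden layer sizes, and the imputation strategies would all need tuning with a held-out validation set.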

prashant_ml · September 10, 2020, 11:35am

Hey @abhaygarg2001,
Any approach is just an experiment. You can try anything, and whichever one works well and gives good results, move forward with that.
Coming to street and address: yes, it is not correct to label encode them, because label encoding assigns arbitrary integers that suggest an order the values do not have. Instead, you can try extracting a particular region or city from the text and using that in training.
Just try as many things as you can, as this will improve your understanding and problem-solving skills.
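
As a rough sketch of the region-extraction idea, assuming pandas and a hypothetical free-text `Address` column (the real challenge columns may look different), you could take the part after the last comma as a coarse `City` feature and one-hot encode it:

```python
import pandas as pd

# Hypothetical free-text addresses; not taken from the challenge data.
df = pd.DataFrame({
    "Address": [
        "12 Oak St, Springfield",
        "7 Elm Road, Shelbyville",
        "301 Pine Ave, Springfield",
        None,
    ]
})

# Use the text after the last comma as the city; missing addresses become "Unknown".
df["City"] = (
    df["Address"]
    .fillna("Unknown")
    .str.rsplit(",", n=1)
    .str[-1]
    .str.strip()
)

# One-hot encode the city so no artificial ordering is introduced.
city_dummies = pd.get_dummies(df["City"], prefix="City", dtype=int)
features = pd.concat([df.drop(columns=["Address", "City"]), city_dummies], axis=1)
```

If the extracted feature has many distinct values, grouping rare ones into a single "Other" category before encoding keeps the input size manageable.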

I hope this helps.