In the House price prediction challenge, we are given 80 columns of tabular data, consisting of both numeric and text columns. Some columns also contain missing data.
- How do we convert columns of text data into numbers? Should we apply one-hot encoding (e.g. OneHotEncoding) to them?
- How should missing data be handled? I've read that for numeric columns, we fill each empty cell with the mean of the values available in that column. Is this the right approach? Also, how should missing data be handled in text columns?
- Is normalization necessary for neural network datasets? Do we need to normalize the entire dataset once it has been converted into numbers?
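
To make the questions concrete, the steps described above can be sketched roughly as follows. This is only a minimal sketch assuming pandas is available: the column names `LotArea` and `Street` and the toy values are hypothetical stand-ins for the real data, and treating a missing text value as its own "Missing" category is just one possible choice, not a confirmed best practice.

```python
import pandas as pd

# Hypothetical toy data standing in for the real 80-column table.
df = pd.DataFrame({
    "LotArea": [8450.0, 9600.0, None, 11250.0],
    "Street": ["Pave", None, "Grvl", "Pave"],
})

# Split columns by type: numeric versus everything else (text).
num_cols = df.select_dtypes(include="number").columns
text_cols = df.columns.difference(num_cols)

# Numeric columns: fill each empty cell with the mean of that column.
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())

# Text columns: assumption, treat a missing value as its own category.
df[text_cols] = df[text_cols].fillna("Missing")

# One-hot encode the text columns into 0/1 indicator columns.
encoded = pd.get_dummies(df, columns=list(text_cols), dtype=float)

# Standardize the originally numeric columns (zero mean, unit variance).
encoded[num_cols] = (
    encoded[num_cols] - encoded[num_cols].mean()
) / encoded[num_cols].std()

print(encoded)
```

In this sketch only the originally numeric columns are standardized and the one-hot columns are left as 0/1. With a train/test split, the means and standard deviations would normally be computed on the training data only and then reused on the test data.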