If the training set contains a string column, what changes can we apply to the linear regression algorithm?
Linear regression multiple features
It really depends on the kind of string column in the training set. For example, if the column represents gender, it might contain only 2 strings, either ‘male’ or ‘female’. Columns like this can be converted into numeric values, for example 0 denoting ‘male’ and 1 denoting ‘female’.
If instead the column contains string values like people's names, it can be dropped (removed) from the dataset, since it is unlikely to contribute to predicting the output variable.
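A minimal sketch of both ideas in plain Python, assuming a made-up dataset: the column names (name, gender, age, price) and the rows are hypothetical and only there for illustration. The gender strings are mapped to 0/1 and the name column is simply left out of the feature list.

```python
# Hypothetical training rows; column names and values are illustrative only.
rows = [
    {"name": "Asha", "gender": "female", "age": 31, "price": 120.0},
    {"name": "Ravi", "gender": "male", "age": 45, "price": 150.0},
    {"name": "Meera", "gender": "female", "age": 27, "price": 110.0},
]

# Mapping from the two category strings to numeric codes.
GENDER_CODES = {"male": 0, "female": 1}


def encode_row(row):
    """Return the numeric features for one row.

    The 'name' column is dropped by not including it here.
    """
    return [GENDER_CODES[row["gender"]], row["age"]]


X = [encode_row(r) for r in rows]  # numeric feature matrix
y = [r["price"] for r in rows]     # output variable
```

X and y now contain only numbers, so they can be passed to whatever linear regression implementation you are using.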
In this problem a string column is given in the test cases as well, so how should I solve it?
Which column are you talking about? I think all the columns have numeric values.
The first column, gift_id. We need to match all the gift_id values in the submitted file with the values in the provided test.csv. How do we do this?
Don’t use gift_id as a feature for predicting the price. It is just like a serial number. What you can do is read the test.csv file and pass every feature of a row except the first column to your model. Then make the predictions and add them to test.csv as a new last column. Because the rows are never reordered, each prediction stays next to its own gift_id.
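The steps above can be sketched roughly like this, using only the standard csv module. Everything except the gift_id column name is an assumption: the feature columns (weight, volume), the sample rows, the name of the added price column, and the predict function with its arbitrary coefficients all stand in for the real test.csv and your trained model.

```python
import csv
import io

# Stand-in for test.csv; the feature columns and values are made up.
TEST_CSV = """gift_id,weight,volume
G101,2.0,10.0
G102,1.5,4.0
G103,3.0,8.0
"""


def predict(features):
    """Placeholder for a trained model; the coefficients are arbitrary."""
    weight, volume = features
    return 5.0 + 2.0 * weight + 0.5 * volume


def add_predictions(csv_text):
    """Return the rows of the CSV with a predicted price appended to each."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    out_rows = [header + ["price"]]
    for row in reader:
        # Skip the first column (gift_id); it is an identifier, not a feature.
        features = [float(value) for value in row[1:]]
        # Keep the original row, including gift_id, and append the prediction.
        out_rows.append(row + [f"{predict(features):.2f}"])
    return out_rows


result = add_predictions(TEST_CSV)
```

With a real file you would open test.csv instead of using the in-memory string, and write result out with csv.writer. The gift_id values come through unchanged and in their original order.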
Okay… thank you so much!!