Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's dataset proves that much more influences price negotiations than the number of bedrooms or a white-picket fence.
With 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, this competition challenges you to predict the final price of each home.
The goal is to predict the Sales Price for the test dataset with the given features.
- Introduction
- Read Data
- Explore Data
  - Data size
  - Structure of data
- Exploratory Data Analysis & Visualization
  - Check missing values
  - Correlation between missing values and Sales Price
  - Check numerical variables
  - Temporal variables and correlation with Sales Price
  - Discrete variables and correlation with Sales Price
  - Continuous variables, skewness and outliers
  - Categorical variables and correlation with Sales Price
- Missing Value Imputation
- Handle Rare Categorical Features
- Label Encoding
- Scaling
- Train models
  - Lasso Regression
  - Elastic Net Regression
  - Kernel Ridge Regression
  - Support Vector Regression
  - Gradient Boosting Regression
  - XGBoost Regression
  - LightGBM Regression
  - Random Forest Regression
- Stack models
- Visualize model scores
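
Two of the preprocessing steps in the outline above, handling rare categorical features and label encoding, are easy to picture with a small example. The following is only a minimal sketch in plain Python, not necessarily how the notebook does it: the helper names `group_rare` and `label_encode`, the `min_fraction` threshold, the `"Rare"` placeholder label, and the sample values are all assumptions made for illustration.

```python
from collections import Counter


def group_rare(values, min_fraction=0.01, rare_label="Rare"):
    """Replace categories that appear in fewer than min_fraction of rows."""
    counts = Counter(values)
    threshold = min_fraction * len(values)
    return [v if counts[v] >= threshold else rare_label for v in values]


def label_encode(values):
    """Map each distinct category to an integer code, in sorted order."""
    mapping = {cat: code for code, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


# Hypothetical column values, invented for illustration only.
roof_style = ["Gable"] * 6 + ["Hip"] * 3 + ["Shed"]

# "Shed" covers 10% of rows, below the assumed 20% cutoff, so it is grouped.
grouped = group_rare(roof_style, min_fraction=0.2)
codes, mapping = label_encode(grouped)
```

Grouping rare levels before encoding keeps categories with very few observations from each getting their own integer code, which would otherwise give a model little data to learn from for those levels.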