Binary Classification
Classic ML
Tabular
Notebook
The task is to predict customer churn for a telecom company.
Implementation includes:
- Exploratory Data Analysis
- Missing Values Imputation
- Categorical Columns Encoding
- Data Normalization
- Modeling:
  - Baseline: SimpleImputer + StandardScaler + OneHotEncoder + Logistic regression + GridSearchCV
  - Final: CatBoostClassifier + Hyperparameter Search (Optuna)
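The baseline listed above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the notebook's actual code: the column names, the toy target, and the `C` grid are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the churn table; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "tenure_months": rng.integers(1, 72, n).astype(float),
    "monthly_charge": rng.normal(60, 20, n),
    "contract": rng.choice(["monthly", "yearly"], n),
})
# Knock out a few values so the imputer has something to do.
df.loc[rng.choice(n, 10, replace=False), "monthly_charge"] = np.nan
y = (df["tenure_months"] < 24).astype(int)  # toy target, not real churn labels

numeric_cols = ["tenure_months", "monthly_charge"]
categorical_cols = ["contract"]

# Numeric columns: impute with the median, then standardize.
numeric_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: one-hot encode, ignoring categories unseen at fit time.
preprocess = ColumnTransformer([
    ("num", numeric_pipe, numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

model = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Search over the regularization strength only; the grid is an assumption.
search = GridSearchCV(model, {"clf__C": [0.1, 1.0, 10.0]}, cv=3, scoring="roc_auc")
search.fit(df, y)
print(search.best_params_)
```

Keeping the preprocessing inside the pipeline means imputation and scaling statistics are refit on each cross-validation fold, which avoids leaking validation data into the search.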
Regression
DL
NLP
Notebook
The task is to predict salary from text and categorical features.
Implementation includes:
- Exploratory Data Analysis
- Categorical Columns Encoding
- Target transformation
- Modeling:
  - Baseline: Custom PyTorch dataset + Custom Transforms + Fusion model (Title Encoder + Description Encoder + Categorical Encoder)
  - Improved model: In progress
- Explaining model predictions: In progress
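One possible shape for the fusion model named above is sketched below. The layer sizes, the mean-pooling text encoders, and the class names `TextEncoder` and `FusionModel` are assumptions for illustration; the notebook's actual architecture may differ.

```python
import torch
import torch.nn as nn


class TextEncoder(nn.Module):
    """Embeds token ids and mean-pools over non-padding positions."""

    def __init__(self, vocab_size, emb_dim, out_dim, pad_idx=0):
        super().__init__()
        self.pad_idx = pad_idx
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=pad_idx)
        self.proj = nn.Linear(emb_dim, out_dim)

    def forward(self, tokens):  # tokens: (batch, seq_len)
        mask = (tokens != self.pad_idx).unsqueeze(-1).float()
        summed = (self.emb(tokens) * mask).sum(dim=1)
        pooled = summed / mask.sum(dim=1).clamp(min=1.0)
        return torch.relu(self.proj(pooled))


class FusionModel(nn.Module):
    """Concatenates title, description and categorical encodings, then regresses."""

    def __init__(self, vocab_size, n_cat_features, hidden=32):
        super().__init__()
        self.title_enc = TextEncoder(vocab_size, 16, hidden)
        self.desc_enc = TextEncoder(vocab_size, 16, hidden)
        self.cat_enc = nn.Sequential(nn.Linear(n_cat_features, hidden), nn.ReLU())
        self.head = nn.Linear(3 * hidden, 1)

    def forward(self, title, desc, cat):
        fused = torch.cat(
            [self.title_enc(title), self.desc_enc(desc), self.cat_enc(cat)], dim=1
        )
        return self.head(fused).squeeze(-1)  # one (transformed) salary per row


# Random inputs stand in for tokenized titles, descriptions and encoded categories.
torch.manual_seed(0)
model = FusionModel(vocab_size=100, n_cat_features=5)
title = torch.randint(1, 100, (4, 6))
desc = torch.randint(1, 100, (4, 20))
cat = torch.rand(4, 5)
pred = model(title, desc, cat)
print(pred.shape)
```

Each branch produces a fixed-size vector, so the regression head sees the same input width regardless of title or description length.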
Multiclass classification
DL
CV
Transfer Learning
Notebook
The task is to build a ConvNet classifier that assigns images from The Simpsons series to one of 42 classes.
Implementation includes:
- Data preparation:
  - Label Encoding
- Modeling:
  - Baseline: Custom PyTorch dataset + Torchvision Transforms + Finetuning vgg16_bn
  - Improved model: In progress
Multiclass classification
Classic ML
Tabular
Feature Engineering
Notebook
The task is to match customers of a telecom company based on their characteristics.
Implementation includes:
- Exploratory Data Analysis
- Finding & fixing errors in features
- Memory optimization
- Checking data integrity
- Modeling:
  - Baseline: TF-IDF + LogisticRegression + GridSearch
  - Final: In progress
- Explaining model predictions: ELI5
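The TF-IDF baseline above could be wired together roughly as follows. How the notebook turns customer characteristics into text is not stated here, so this sketch assumes each customer is serialized as a string of hypothetical `feature_value` tokens; the data and the parameter grid are made up for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy data: three customer groups, each described by hypothetical tokens.
profiles = {
    0: ["plan_basic", "region_north"],
    1: ["plan_premium", "region_south"],
    2: ["plan_family", "region_east"],
}
devices = ["android", "ios", "feature_phone"]

docs, labels = [], []
for label, tokens in profiles.items():
    for i in range(6):
        docs.append(" ".join(tokens + [f"device_{devices[i % 3]}"]))
        labels.append(label)

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Small illustrative grid over n-gram range and regularization strength.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [1.0, 10.0],
}
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(docs, labels)
print(search.best_params_)
```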