The sole purpose of this repository is to understand how machine learning algorithms work under the hood.
The code is not designed for production use (it is poorly documented and has no unit tests); for that, there are Scikit-learn and other optimized ML frameworks.
- K-nearest neighbors
  - implementation
- Linear Regression
  - implementation
- Decision Trees
  - investigation
- Dimensionality reduction
- Clustering
- K-Means implementation
implementation
- MLcourse.ai
- Machine Learning in Action by Peter Harrington
- Machine Learning with PyTorch and Scikit-Learn by Sebastian Raschka
- Scikit-Learn documentation
- StatQuest videos
- ...
- Add implementations with examples for the following algorithms:
  - Linear regression using SGD with regularization
  - Linear regression using the closed-form solution via SVD
  - Logistic regression using SGD with regularization and logistic loss (labels 0/1)
  - Logistic regression using SGD with regularization (labels -1/1)
  - word2vec with SGD optimization and a choice of training method (naive softmax, negative sampling)
  - Decision Tree for regression and classification
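
As a possible starting point for the SVD-based linear regression item in the list above, here is a minimal sketch. It is not code from this repository: the function name `fit_linear_regression_svd`, the choice to prepend a bias column, and the singular-value tolerance are all assumptions made for illustration. It computes least-squares weights through the pseudoinverse built from NumPy's `np.linalg.svd`.

```python
import numpy as np


def fit_linear_regression_svd(X, y):
    """Return least-squares weights (intercept first) via the SVD pseudoinverse.

    Hypothetical helper for illustration; not part of this repository.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first weight acts as the intercept.
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    # Thin SVD: Xb = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(Xb, full_matrices=False)
    # Invert only singular values above a tolerance (assumed cutoff),
    # which keeps the solution stable for rank-deficient inputs.
    tol = np.finfo(float).eps * max(Xb.shape) * s.max()
    s_inv = np.zeros_like(s)
    mask = s > tol
    s_inv[mask] = 1.0 / s[mask]
    # w = V @ diag(1/s) @ U^T @ y
    return Vt.T @ (s_inv * (U.T @ y))


# Toy data generated from y = 2x + 1 without noise.
X_demo = np.array([[0.0], [1.0], [2.0], [3.0]])
y_demo = np.array([1.0, 3.0, 5.0, 7.0])
w_demo = fit_linear_regression_svd(X_demo, y_demo)
print(w_demo)  # intercept close to 1, slope close to 2
```

Going through the SVD instead of inverting `X.T @ X` directly avoids squaring the condition number of the design matrix, which is one common reason to prefer it for the closed-form solution.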