Published GitHub Pages site: https://khoispace.io.vn/machine-learning-learning-path/
Udacity is a for-profit educational organization that offers massive open online courses (MOOCs). The platform covers fields such as Data Science, Machine Learning, Deep Learning, and Artificial Intelligence. Its Nanodegree programs each bundle a series of courses in a specific field.
Link: https://www.udacity.com/
This repository documents my learning path on Udacity, with a focus on Machine Learning, and I will keep updating it. I hope the references collected here help you learn Machine Learning on Udacity.
# | Technology | Description |
---|---|---|
1 | Python | Python is an interpreted, high-level, general-purpose programming language. |
2 | NumPy | NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. |
3 | Pandas | Pandas is a software library written for the Python programming language for data manipulation and analysis. |
4 | Matplotlib | Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy. |
5 | Seaborn | Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. |
6 | Scikit-learn | Scikit-learn is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. |
7 | PyTorch | PyTorch is an open-source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing. It was originally developed by Facebook's AI Research lab (now Meta AI) and is released under the Modified BSD license. |
8 | Git | Git is software for tracking changes in any set of files, usually used for coordinating work among programmers collaboratively developing source code during software development. Its goals include speed, data integrity, and support for distributed, non-linear workflows. |
9 | Docker | Docker is a set of platform-as-a-service (PaaS) products that use OS-level virtualization to deliver software in packages called containers. |
10 | Kubernetes | Kubernetes is an open-source container-orchestration system for automating computer application deployment, scaling, and management. It was originally designed by Google and is now maintained by the Cloud Native Computing Foundation. |
11 | AWS | Amazon Web Services is a subsidiary of Amazon providing on-demand cloud computing platforms and APIs to individuals, companies, and governments, on a metered pay-as-you-go basis. |
A Nanodegree is a paid series of courses in a specific field. You can learn more about Nanodegree here.
Nanodegree notes: Udacity - Google Drive
- Syllabus: Syllabus
- Course notes: Google Drive
Projects:
- Learn the data analysis process of questioning, wrangling, exploring, analyzing, and communicating data. You will work with data in Python using libraries like NumPy and pandas.
- Project: Investigate a Dataset
- Source code: vnk8071/investigate_a_dataset
- Data wrangling is a set of processes for turning raw and messy data into a clean format to answer interesting questions from the data. In this course, you will learn the three phases of data wrangling: gathering, assessing, and cleaning data.
- Project: Real World Data Wrangling with Python
- Source code: vnk8071/advanced_data_wrangling
- Learn to apply sound design and data visualization principles to the data analysis process. Learn how to use analysis and visualizations to tell a story with data.
- Project: Communicate Data Findings
- Source code: vnk8071/communicate_data_findings
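The three wrangling phases named above (gathering, assessing, and cleaning) can be sketched in a few lines. This is a minimal illustration, not the course's or the project's code: it uses only the Python standard library instead of pandas, and the sample data, the column names, and the helper names `gather`, `assess`, and `clean` are all hypothetical.

```python
import csv
import io

# Hypothetical raw data: stray whitespace, inconsistent casing,
# one missing age, and one duplicated row.
RAW_CSV = """name,city,age
Alice, hanoi ,30
bob,Hue,
Alice, hanoi ,30
Carol,Da Nang,41
"""


def gather(text):
    """Gather: load raw rows from a CSV source into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def assess(rows):
    """Assess: count quality issues without modifying the data."""
    seen = set()
    issues = {"missing_age": 0, "duplicates": 0}
    for row in rows:
        key = tuple(row.values())
        if key in seen:
            issues["duplicates"] += 1
        seen.add(key)
        if not row["age"].strip():
            issues["missing_age"] += 1
    return issues


def clean(rows):
    """Clean: normalize text, drop rows without an age, remove duplicates."""
    cleaned, seen = [], set()
    for row in rows:
        if not row["age"].strip():
            continue  # skip rows with a missing age
        record = (
            row["name"].strip().title(),
            row["city"].strip().title(),
            int(row["age"]),
        )
        if record not in seen:
            seen.add(record)
            cleaned.append({"name": record[0], "city": record[1], "age": record[2]})
    return cleaned


rows = gather(RAW_CSV)
issues = assess(rows)
tidy = clean(rows)
print(issues)
print(tidy)
```

Keeping assessment separate from cleaning makes it possible to record which problems were found before any data is changed.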
- Syllabus: Syllabus
- Course notes: Google Drive
Projects:
- In this course, you will learn advanced Python skills across a wide range of modern topics.
- Project: Exploring Near-Earth Objects
- Source code: vnk8071/advanced_python_techniques
- Learn how to write, structure, and extend your code so that it can support large systems at scale. Understand how to leverage open source libraries to quickly add advanced functionality to your code, and how to package your code into libraries of your own. Apply Object Oriented Programming to ensure that your code remains modular, clear, and understandable. These skills are the foundation for building codebases that stay maintainable and efficient as they grow to tens of thousands of lines.
- Project: Motivational Meme Generator
- Source code: vnk8071/meme_generator
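As one way to picture the "modular, clear, and understandable" Object Oriented Programming goal described above, the sketch below puts several input formats behind a shared abstract interface. It is an illustration under assumptions, not the project's actual design: the class names `QuoteParser`, `DashParser`, and `CsvParser`, the `load_quotes` helper, and the input formats are all hypothetical.

```python
import csv
import io
from abc import ABC, abstractmethod


class QuoteParser(ABC):
    """Shared interface: turn raw text into a list of (body, author) pairs."""

    @abstractmethod
    def parse(self, text):
        ...


class DashParser(QuoteParser):
    """Parses lines shaped like: "body" - author"""

    def parse(self, text):
        pairs = []
        for line in text.splitlines():
            if " - " in line:
                body, author = line.rsplit(" - ", 1)
                pairs.append((body.strip().strip('"'), author.strip()))
        return pairs


class CsvParser(QuoteParser):
    """Parses CSV text with 'body' and 'author' columns."""

    def parse(self, text):
        reader = csv.DictReader(io.StringIO(text))
        return [(row["body"], row["author"]) for row in reader]


# Registry mapping a format name to the parser that handles it.
PARSERS = {"txt": DashParser(), "csv": CsvParser()}


def load_quotes(kind, text):
    """Pick a parser by format name; callers never touch concrete classes."""
    try:
        parser = PARSERS[kind]
    except KeyError:
        raise ValueError(f"unsupported format: {kind}") from None
    return parser.parse(text)


print(load_quotes("txt", '"Stay curious" - Ada\n'))
print(load_quotes("csv", "body,author\nKeep going,Bo\n"))
```

Supporting a new format then means adding one subclass and one registry entry, with no change to the calling code.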
- Syllabus: Syllabus
- Course notes: Google Drive
Projects:
- You'll learn the skills needed to traverse the stack and develop an entire database-backed web application. By the end of the course, you'll have the fundamentals you need to start building web applications, including how to do Create, Read, Update, and Delete (CRUD) operations on a database, how to apply these operations across both databases and web applications, how to set up relationships between elements of an application, and ultimately how to think about important principles and patterns in building data models for a web application.
- Project: Fyyur: Artist Booking Site
- Source code: vnk8071/sql_data_modeling_for_web
- In this project, you will use the skills you’ve developed to build a Trivia API. The goal of this project is to use APIs to control and manage a web application using existing data models. You’ll be given a set of data models and the application front end. Your task will be to implement the API in Flask to make the Trivia game functional.
- Project: Trivia API
- Source code: vnk8071/api_development_documentation
- In this part, you will build the backend for a coffee shop application. You’ll add user accounts and authentication to your application and use role-based access management strategies to control different types of user behavior in the app.
- Project: Coffee Shop Full Stack
- Source code: vnk8071/identity_access_management
- Develop an understanding of containerized environments, use Docker to share and store containers, and deploy a Docker container to AWS Elastic Kubernetes Service using a CI/CD pipeline.
- Project: Deploy Your Flask App to Kubernetes Using EKS
- Source code: vnk8071/server_deployment_containerization
- Template: vnk8071/pipeline-deploy-kubernetes-on-aws
- You will now combine all of the new skills you’ve learned and developed in this course to construct a database-backed web API with user access control. You will choose what app to build and then you’ll design and build out all of the API endpoints needed for the application and properly secure them for use in any front end application (web or mobile).
- Project: Capstone Coffee Shop Fullstack
- Source code: vnk8071/capstone_fullstack_web
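The Create, Read, Update, and Delete (CRUD) operations mentioned above can be sketched against an in-memory SQLite database using Python's built-in `sqlite3` module. This is a minimal illustration only: the `artists` table, its columns, and the helper function names are assumptions made for the example and are not taken from the course projects, which use their own data models and web framework.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be converted to dicts
conn.execute(
    "CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)


def create_artist(name, city):
    """Create: insert a row and return its generated id."""
    cur = conn.execute(
        "INSERT INTO artists (name, city) VALUES (?, ?)", (name, city)
    )
    conn.commit()
    return cur.lastrowid


def read_artist(artist_id):
    """Read: fetch one row as a dict, or None if it does not exist."""
    row = conn.execute(
        "SELECT id, name, city FROM artists WHERE id = ?", (artist_id,)
    ).fetchone()
    return dict(row) if row else None


def update_artist_city(artist_id, city):
    """Update: change one column and return the number of affected rows."""
    cur = conn.execute(
        "UPDATE artists SET city = ? WHERE id = ?", (city, artist_id)
    )
    conn.commit()
    return cur.rowcount


def delete_artist(artist_id):
    """Delete: remove the row and return the number of affected rows."""
    cur = conn.execute("DELETE FROM artists WHERE id = ?", (artist_id,))
    conn.commit()
    return cur.rowcount


artist_id = create_artist("Example Band", "Hanoi")
print(read_artist(artist_id))
update_artist_city(artist_id, "Hue")
print(read_artist(artist_id))
delete_artist(artist_id)
print(read_artist(artist_id))
```

The `?` placeholders pass values separately from the SQL text, which avoids building queries through string concatenation.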
- Syllabus: Syllabus
- Course notes: Google Drive
Projects:
- In this course, you'll start learning what machine learning is by being introduced to the high level concepts through AWS SageMaker. You'll begin by using SageMaker Studio to perform exploratory data analysis. Know how and when to apply the basic concepts of machine learning to real world scenarios. Create machine learning workflows, starting with data cleaning and feature engineering, to evaluation and hyperparameter tuning. Finally, you'll build new ML workflows with highly sophisticated models such as XGBoost and AutoGluon.
- Project: Predict Bike Sharing Demand with AutoGluon
- Source code: vnk8071/predict_bike_sharing_demand
- This course discusses how to use AWS services to train and deploy a model, and how to use AWS Lambda functions and Step Functions to compose your model and services into an event-driven application.
- Project: Build a ML Workflow For Scones Unlimited On Amazon SageMaker
- Source code: vnk8071/ml_workflow_for_scones_unlimited
- Project: Image Classification using AWS SageMaker
- Source code: vnk8071/image_classification_using_sagemaker
- This course covers advanced topics related to deploying professional machine learning projects on SageMaker. Students will learn how to maximize output while decreasing costs. They will also learn how to deploy projects that can handle high traffic, how to work with especially large datasets, and how to approach security in machine learning AWS applications.
- Project: Operationalizing an AWS ML Project
- Source code: vnk8071/operationalizing_aws_ml
- Project: Build Your Own Machine Learning Portfolio
- Source code: vnk8071/capstone_ml_engineer
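The evaluation and hyperparameter tuning steps mentioned in the first course above can be illustrated with a small grid search on a held-out validation set. This sketch deliberately swaps in a tiny pure-Python k-nearest-neighbours regressor on synthetic data; it does not use SageMaker, XGBoost, or AutoGluon, and the function names, the data, and the candidate values are all assumptions made for illustration.

```python
import random


def knn_predict(train, x, k):
    """Predict y for x as the mean y of the k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_squared_error(train, valid, k):
    """Evaluate one hyperparameter value on the validation set."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in valid]
    return sum(errors) / len(errors)


def grid_search(train, valid, candidates):
    """Score every candidate k and return the best one with all scores."""
    scores = {k: mean_squared_error(train, valid, k) for k in candidates}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Synthetic data: a noisy linear relationship, seeded for repeatability.
rng = random.Random(0)
data = [(x, 2.0 * x + rng.gauss(0, 3)) for x in range(40)]
rng.shuffle(data)
train, valid = data[:30], data[30:]

best_k, scores = grid_search(train, valid, candidates=[1, 3, 5, 7])
print("best k:", best_k)
print("validation MSE per k:", scores)
```

Scoring on data the model did not train on is what makes the comparison between candidate values meaningful.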
- Syllabus: Syllabus
- Course notes: Google Drive
Projects:
- Develop skills that are essential for deploying production machine learning models. First, you will put your coding best practices on auto-pilot by learning how to use PyLint and AutoPEP8. Then you will further expand your Git and GitHub skills to work with teams. Finally, you will learn best practices associated with testing and logging used in production settings in order to ensure your models can stand the test of time.
- Project: Predict Customer Churn with Clean Code
- Source code: vnk8071/clean_code
- This course empowers students to be more efficient, effective, and productive in modern, real-world ML projects by adopting best practices around reproducible workflows. In particular, it teaches the fundamentals of MLOps and how to: a) create a clean, organized, reproducible, end-to-end machine learning pipeline from scratch using MLflow; b) clean and validate the data using pytest; c) track experiments, code, and results using GitHub and Weights & Biases; d) select the best-performing model for production; and e) deploy a model using MLflow. Along the way, it also touches on other technologies like Kubernetes, Kubeflow, and Great Expectations and how they relate to the content of the class.
- Project: Build an ML Pipeline for Short-term Rental Prices in NYC
- Source code: vnk8071/reproducible_model_workflow
- This course teaches students how to robustly deploy a machine learning model into production. En route to that goal, students will learn how to put the finishing touches on a model by taking a fine-grained approach to model performance, checking bias, and ultimately writing a model card. Students will also learn how to version control their data and models using Data Version Control (DVC). The last piece in preparation for deployment will be learning Continuous Integration and Continuous Deployment, which will be accomplished using GitHub Actions and Heroku, respectively. Finally, students will learn how to write a fast, type-checked, and auto-documented API using FastAPI.
- Project: Deploy a Machine Learning Model to Cloud Application Platform
- Source code: vnk8071/deploy_ml_pipeline_in_production
- This course will help students automate the DevOps processes required to score and re-deploy ML models. Students will automate model training and deployment. They will set up regular scoring processes to be performed after model deployment, and also learn to reason carefully about model drift and whether models need to be retrained and re-deployed. Students will learn to diagnose operational issues with models, including data integrity and stability problems, timing problems, and dependency issues. Finally, students will learn to set up automated reporting with APIs.
- Project: A Dynamic Risk Assessment System
- Source code: vnk8071/ml_model_scoring_and_monitoring
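The model drift reasoning described in the last course above (deciding whether a deployed model needs retraining) can be sketched as a comparison between a new score and the scores recorded earlier. This is a minimal sketch under assumptions, not the course's or the project's implementation: the function names, the two-standard-deviation threshold, and the example F1 scores are hypothetical, and higher scores are assumed to be better.

```python
import statistics


def raw_drift(new_score, past_scores):
    """Flag drift when the new score is worse than every previous score."""
    return new_score < min(past_scores)


def parametric_drift(new_score, past_scores, num_std=2.0):
    """Flag drift when the new score falls more than num_std sample
    standard deviations below the mean of previous scores."""
    mean = statistics.mean(past_scores)
    std = statistics.stdev(past_scores)
    return new_score < mean - num_std * std


def needs_retraining(new_score, past_scores):
    """Hypothetical policy: retrain if either check reports drift."""
    return raw_drift(new_score, past_scores) or parametric_drift(
        new_score, past_scores
    )


# Hypothetical F1 scores from earlier scoring runs.
past_f1 = [0.81, 0.79, 0.80, 0.82]

for new_f1 in (0.80, 0.70):
    print(new_f1, "retrain" if needs_retraining(new_f1, past_f1) else "keep")
```

A scheduled job could run a check like this after each scoring pass and trigger retraining and re-deployment only when drift is reported.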
- Gmail: nguyenkhoi8071@gmail.com
- Website: khoispace.io.vn
- LinkedIn: linkedin.com/in/khoivn8071
- GitHub: github.com/vnk8071