This project is part of the Human Data Interaction project at CSAIL, MIT. The main repository is under active development and will be publicly available soon. This is a fork of the first version.
References:
- Alec Anderson, Sebastien Dubois, Alfredo Cuesta-Infante, Kalyan Veeramachaneni. "Sample, Estimate, Tune: Scaling Bayesian Auto-Tuning of Data Science Pipelines." IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2017.
- Sebastien Dubois. "Deep Mining: Copula-based Hyper-Parameter Optimization for Machine Learning Pipelines." Research thesis.
The Deep Mining project aims to find the best hyperparameter set for a machine learning pipeline. A pipeline example for the handwritten digit recognition problem is presented below. Some hyperparameters need to be set carefully, such as the degree of the SVM's polynomial kernel. Choosing the value of such hyperparameters can be very difficult, and this project's goal is to make it much easier.
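As a rough illustration (not the project's exact pipeline), the sketch below builds a scikit-learn pipeline for the digits dataset in which the number of PCA components and the degree of the SVM's polynomial kernel are hyperparameters one might want to tune; the specific steps and values are only assumptions for the example.

```python
# Illustrative digit-recognition pipeline with two tunable hyperparameters:
# the number of PCA components and the degree of the SVM's polynomial kernel.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

digits = load_digits()

pipeline = Pipeline([
    ("pca", PCA(n_components=30)),          # hyperparameter: n_components
    ("svm", SVC(kernel="poly", degree=3)),  # hyperparameter: polynomial degree
])

score = cross_val_score(pipeline, digits.data, digits.target, cv=5).mean()
print("Mean cross-validation accuracy: %.3f" % score)
```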
This software iteratively and intelligently tests hyperparameter sets in order to find, as quickly as possible, the ones that yield the best classification accuracy the pipeline can achieve.
The folder GCP-HPO contains all the code implementing the Gaussian Copula Process (GCP) and a hyperparameter optimization (HPO) technique based on it. The Gaussian Copula Process can be seen as an improved version of the Gaussian Process: it does not assume a Gaussian prior for the marginal distributions but instead relies on a more flexible prior. This technique has been shown to outperform GP-based hyperparameter optimization, which itself is already far better than randomized search.
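For intuition only, here is a minimal sketch of the GP-based search that GCP improves upon: a plain Gaussian Process surrogate combined with an expected-improvement criterion picks the next polynomial degree to try. This is not the GCP-HPO API; the objective, candidate grid, and helper names are assumptions made for the example.

```python
# Hypothetical GP-based hyperparameter search over one hyperparameter
# (the SVM polynomial degree), for illustration; GCP-HPO replaces the
# plain Gaussian Process surrogate with a Gaussian Copula Process.
import numpy as np
from scipy.stats import norm
from sklearn.datasets import load_digits
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

digits = load_digits()

def score_pipeline(degree):
    # Objective: cross-validation accuracy of a polynomial-kernel SVM
    # with the given degree (illustrative objective, not the project's).
    clf = SVC(kernel="poly", degree=int(degree))
    return cross_val_score(clf, digits.data, digits.target, cv=3).mean()

def expected_improvement(mu, sigma, best):
    # Expected improvement acquisition function (maximization).
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

candidates = np.arange(1, 11, dtype=float).reshape(-1, 1)  # degrees 1..10
tried = [1.0, 10.0]                                        # initial observations
scores = [score_pipeline(d) for d in tried]

for _ in range(5):
    # Fit the surrogate on the hyperparameter sets tried so far.
    gp = GaussianProcessRegressor(normalize_y=True).fit(
        np.array(tried).reshape(-1, 1), scores)
    mu, sigma = gp.predict(candidates, return_std=True)
    # Pick the most promising candidate according to expected improvement.
    nxt = float(candidates[np.argmax(expected_improvement(mu, sigma, max(scores)))][0])
    if nxt in tried:
        break
    tried.append(nxt)
    scores.append(score_pipeline(nxt))

print("Best degree: %d, accuracy: %.3f" % (tried[int(np.argmax(scores))], max(scores)))
```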
A paper explaining the GCP approach as well as the hyperparameter optimization process is currently being written and will be linked here as soon as possible. Please consider citing it if you use this work.