PILCO — Probabilistic Inference for Learning COntrol

[Figure: PILCO overview (image source)]

This is our implementation of PILCO from Deisenroth et al.
The implementation is largely based on the MATLAB code and on Deisenroth's PhD thesis. Other cool implementations can be found here and here.

Code structure

  • controller: Controller/policy models.
  • cost_functions: Cost functions for computing a trajectory's performance.
  • gaussian_process: (Sparse) Gaussian Process models for learning dynamics and RBF policy.
  • kernel: Kernel functions for Gaussian Process models.
  • test: Test cases to ensure the implementation is working as intended.
  • util: Helper methods to make main code more readable.
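As a rough illustration of what the kernel and cost_functions modules are responsible for, here is a minimal sketch of a squared-exponential kernel with per-dimension length scales and a saturating cost, both as described in Deisenroth's work. The function names, signatures, and default values are assumptions made for this example and are not taken from this repository; the sketch also evaluates the cost at a single point, whereas PILCO computes the expected cost under a Gaussian state distribution.

```python
import math


def se_ard_kernel(x, y, length_scales, signal_variance=1.0):
    """Squared-exponential kernel with one length scale per input dimension.

    Hypothetical helper for illustration; not this repository's API.
    """
    scaled_sq_dist = sum(
        ((a - b) / ls) ** 2 for a, b, ls in zip(x, y, length_scales)
    )
    return signal_variance * math.exp(-0.5 * scaled_sq_dist)


def saturating_cost(state, target, width=0.25):
    """Saturating cost: 0 at the target, approaching 1 far away from it.

    The width parameter (assumed default) controls how quickly the cost saturates.
    """
    sq_dist = sum((s - t) ** 2 for s, t in zip(state, target))
    return 1.0 - math.exp(-0.5 * sq_dist / width ** 2)


if __name__ == "__main__":
    # Identical inputs give the maximum kernel value (the signal variance).
    print(se_ard_kernel([0.0, 1.0], [0.0, 1.0], length_scales=[1.0, 2.0]))
    # A state far from the target yields a cost close to 1.
    print(saturating_cost([3.0, 0.0], [0.0, 0.0]))
```

The saturating shape keeps the cost bounded, so states far from the target all look similarly bad, which is the property the PILCO literature relies on for exploration.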

Executing experiments

  1. Activate the anaconda environment
source activate my_env
  2. Execute the pilco_runner script (the default environment is CartpoleStabShort-v0)

Training run from scratch:

python3 my/path/to/pilco_runner.py

Training run from an existing policy:

python3 my/path/to/pilco_runner.py --weight-dir my_model_directory

Further console arguments (e.g. hyperparameter changes) can be passed to the run. For details, see:

python3 my/path/to/pilco_runner.py --help

Executing an evaluation run for an existing policy

  1. Activate the anaconda environment
source activate my_env
  2. Execute the pilco_runner script
python3 my/path/to/pilco_runner.py --weight-dir my_model_directory --test

For example, to load the pretrained models in test mode:

CartpoleStabShort-v0 (500Hz)

python3 pilco_runner.py --env-name CartpoleStabShort-v0 --test --max-action 5 --weight-dir experiments/best_models/pilco/stabilization/sparse_gp_50hz/

CartpoleSwingShort-v0 (500Hz)

python3 pilco_runner.py --env-name CartpoleSwingShort-v0 --test --max-action 10 --weight-dir experiments/best_models/pilco/swing_up/sparse_gp_100hz/

Qube-v0 (50Hz)

python3 pilco_runner.py --env-name Qube-v0 --test --weight-dir experiments/best_models/pilco/qube/sparse_gp_100hz/
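
For readers who want to see how the flags used in the commands above fit together, the following is a minimal argparse sketch. It only covers the four flags that appear in this README (--env-name, --weight-dir, --max-action, --test); the types, defaults, and help texts are assumptions, and the actual parser in pilco_runner.py may differ and defines further arguments (see --help).

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in this README."""
    parser = argparse.ArgumentParser(description="PILCO runner (illustrative sketch)")
    # The README states that CartpoleStabShort-v0 is the default environment.
    parser.add_argument("--env-name", default="CartpoleStabShort-v0",
                        help="environment to run")
    # Assumed: no weight directory means training from scratch.
    parser.add_argument("--weight-dir", default=None,
                        help="directory of an existing policy to load")
    # Assumed type and default; the real default is not stated in the README.
    parser.add_argument("--max-action", type=float, default=None,
                        help="action magnitude limit")
    parser.add_argument("--test", action="store_true",
                        help="evaluate the loaded policy instead of training")
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(
    ["--env-name", "Qube-v0", "--test", "--weight-dir", "my_model_directory"]
)
mode = "evaluation" if args.test else "training"
print(mode, args.env_name, args.weight_dir)
```

Note that argparse converts dashes in flag names to underscores, so --weight-dir is read as args.weight_dir.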