Skip to content

Latest commit

 

History

History
86 lines (63 loc) · 4.79 KB

README.md

File metadata and controls

86 lines (63 loc) · 4.79 KB

SkiREIL Algorithm for Reinforcement Learning with Skill-based Systems using Behavior Trees

SkiREIL is approach to learn (single- and) multi-objective tasks by utilizing existing skills that expose learnable parameters. Targeted at industrial environments where tasks often have several objectives (KPIs) such as task performance and safety, it provides a way to retrieve a large variety of solutions from which the robot operator can choose from. By integrating planning and a knowledge framework it can offer a pipeline from high-level goal definition to the execution of learned policies on a real system while conducting learning itself in highly parallelized simulations.

What you should expect

This algorithm is a model-based policy search algorithm (the ICML 2015 tutorial on policy search methods for robotics is a good source for reading) with the following main properties:

  • it integrates with a policy representation that utilizes SkiROS2 for skill representation and a motion generator implementation
  • it provides a json format to quickly define and adapt learning scenarios --> see the config folder
  • functionalities to implement reward functions, specific scenes and robot setups that can be specified in the scenario configuration file
  • uses a simulator to model the dynamics of the robot/system
  • is data-efficient or sample-efficient by utilizing Bayesian optimization, but can also use other optimization strategies
  • parallelized evalution in simulation to profit from multi-core systems,
  • it imposes no constraints on the type of the reward function (more specifically the reward function can also be learned from data)
  • in principle it imposes no constraints on the type of the policy representation (any parameterized policy can be used --- for example, dynamic movement primitives or neural networks)

Documentation

The documentation can be found in the docs folder. The entry point Readme has a table of contents.

Cloning and Compilation

More extensive documentation on how to clone, install and compile the code is in docs/cloning_and_compiling.md.

In short it would work like this:

git clone https://github.com/matthias-mayr/SkiREIL /tmp/skireil_installation
cd ~
/tmp/skireil_installation/scripts/installation/install_skireil.sh

Start learning with SkiROS support

Open a terminal and enter:

tmux
scripts/experiments/launch_experiment_in_tmux.sh
# Specify in the lower right terminal which configuration file to use, so e.g.:
scripts/experiments/run_learning.sh 1 config/peg_insertion.json

If it is a multi-objective task, a postprocessing step needs to be run with:

roscd skireil && python3 scripts/experiments/create_param_config_and_pareto.py /tmp/skireil/learning_skills

More thorough documentation on how to start learning and what to expect is in docs/learning_and_replaying.md.

Code developers/maintainers

Black-DROPS is orginally written under the ResiBots ERC Project (http://www.resibots.eu).

Citing this Algorithm

Used in:

  • IROS 2021 paper: Learning of Parameters in Behavior Trees for Movement Skills
  • IROS 2022 submission: Skill-based Multi-objective Reinforcement Learning of Industrial Robot Tasks with Planning and Knowledge Integration

Based on code for the:

  • IROS 2017 paper: "Black-Box Data-efficient Policy Search for Robotics"
  • ICRA 2018 paper: "Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics"

If you use our code for a scientific paper, please cite:

Mayr, M., Ahmad, F., Chatzilygeroudis, K., Nardi, L., Krueger, V. (2022) Skill-based Multi-objective Reinforcement Learning of Industrial Robot Tasks with Planning and Knowledge Integration. Submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

In BibTex:

@misc{https://doi.org/10.48550/arxiv.2203.10033,
  doi = {10.48550/ARXIV.2203.10033},
  url = {https://arxiv.org/abs/2203.10033},
  author = {Mayr, Matthias and Ahmad, Faseeh and Chatzilygeroudis, Konstantinos and Nardi, Luigi and Krueger, Volker},
  keywords = {Robotics (cs.RO), Machine Learning (cs.LG), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {Skill-based Multi-objective Reinforcement Learning of Industrial Robot Tasks with Planning and Knowledge Integration},
  publisher = {arXiv},
  year = {2022},  
  copyright = {Creative Commons Attribution Share Alike 4.0 International}
    }