
OSRL (Optimal Representation Learning in Multi-Task Bandits) comprises an algorithm that addresses the problem of sample complexity with fixed confidence in Multi-Task Bandit problems. Published at the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI23)


rssalessio/OSRL-SC


Code for On the Sample Complexity of Representation Learning in Multi-Task Bandits with Global and Local Structure

OSRL (Optimal Representation Learning in Multi-Task Bandits) comprises an algorithm that addresses the problem of sample complexity with fixed confidence in Multi-Task Bandit problems. Accepted at the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI23)

Author: Alessio Russo

In addition to the algorithm above, the code includes KL-UCB [1], D-Track and Stop, and D-Track and Stop with the challenger modification [2].
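For readers unfamiliar with KL-UCB [1], the following is a minimal sketch of the index computation for Bernoulli rewards. It is not the implementation found in this repository: the function names `bernoulli_kl` and `kl_ucb_index` are hypothetical, the exploration budget is taken as log(t) with no log-log correction term, and the index is found by plain bisection.

```python
import math


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def kl_ucb_index(mean, pulls, t, tol=1e-6):
    """Largest q in [mean, 1] with pulls * kl(mean, q) <= log(t).

    Assumption: the exploration budget is log(t) only (illustrative choice).
    """
    if pulls == 0:
        # An arm that was never pulled gets the most optimistic index.
        return 1.0
    threshold = math.log(t) / pulls
    lo, hi = mean, 1.0
    # kl(mean, q) is increasing in q on [mean, 1], so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bernoulli_kl(mean, mid) <= threshold:
            lo = mid
        else:
            hi = mid
    return lo
```

At each round, a KL-UCB style policy would pull the arm with the largest index. The index shrinks toward the empirical mean as the number of pulls grows.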

All the code has been written in Python or C.

Hardware and Software setup

All experiments were executed on a stationary desktop computer with an Intel Xeon Silver 4110 CPU and 48 GB of RAM, running Ubuntu 18.04. Ubuntu is an open-source operating system that uses the Linux kernel and is based on Debian. For more information, please check https://ubuntu.com/.

Code and libraries

We set up our experiments using the following software and libraries:

  • Python 3.7.7
  • Cython version 0.29.15
  • NumPy version 1.18.1
  • SciPy version 1.4.1
  • PyTorch version 1.4.0
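To check whether a local environment matches the versions listed above, a small script along the following lines can help. This is only a sketch and is not part of the repository: the helper names `installed_version` and `compare_versions` are hypothetical, and it assumes each package exposes a `__version__` attribute on its top-level module.

```python
import importlib

# Versions listed in this README (module import names, not pip names).
EXPECTED = {
    "Cython": "0.29.15",
    "numpy": "1.18.1",
    "scipy": "1.4.1",
    "torch": "1.4.0",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def compare_versions(expected, lookup=installed_version):
    """Map each module name to a (wanted, found, status) tuple."""
    report = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found is None:
            status = "missing"
        elif found.split("+")[0] == wanted:
            # Ignore local build suffixes such as "+cpu".
            status = "ok"
        else:
            status = "mismatch"
        report[name] = (wanted, found, status)
    return report
```

Calling `compare_versions(EXPECTED)` returns a dictionary that can be printed or inspected. A "mismatch" does not necessarily mean the code will fail, only that the environment differs from the one used for the experiments.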

All the code can be found in the folder src.

Usage

You can run sample simulations by running the Jupyter notebooks located in the folder notebooks.

To run the notebooks you need to install Jupyter first. After that, you can open a shell in the notebooks directory and run

jupyter notebook

This opens the Jupyter interface, where you can select which notebook to run.

License

MIT license.

References

[1] Garivier, Aurélien, and Olivier Cappé. "The KL-UCB algorithm for bounded stochastic bandits and beyond." Proceedings of the 24th Annual Conference on Learning Theory. 2011.

[2] Garivier, Aurélien, and Emilie Kaufmann. "Optimal best arm identification with fixed confidence." Conference on Learning Theory. 2016.
