This is a collection of research and review papers on Offline-to-Online Reinforcement Learning (RL), also called Offline Online RL. Feel free to star and fork. This site is mainly inspired by the Awesome Offline RL list; please see it if you are also interested in the newest research papers on Offline RL.
Maintainers:
- Linh LPV (A2I2, Deakin University)
Please feel free to open a pull request following the instructions provided in Contributing.
Format:
- [title](paper link) [links]
- author 1, author 2, et al. arXiv/conferences/journals, month/year.
For any questions, feel free to contact: l.le@deakin.edu.au
Credit:
- The Contributing instructions are based on the workflow of awesome-exploration-rl.
- Pretraining in Deep Reinforcement Learning: A Survey
- Zhihui Xie, Zichuan Lin, Junyou Li, Shuai Li, Deheng Ye. arXiv. 11/2022.
-
MOORe: Model-based Offline-to-Online Reinforcement Learning
- Yihuan Mao, Chao Wang, Bin Wang, Chongjie Zhang. arXiv. 01/2022.
-
Launchpad: Learning to Schedule Using Offline and Online RL Methods
- Vanamala Venkataswamy, Jake Grigsby, Andrew Grimshaw, Yanjun Qi. arXiv. 12/2022.
-
Improving Offline-to-Online Reinforcement Learning with Q Conditioned State Entropy Exploration
- Ziqi Zhang, Xiao Xiong, Zifeng Zhuang, Jinxin Liu, Donglin Wang. arXiv. 10/2023.
-
Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration
- Jinning Li, Xinyi Liu, Banghua Zhu, Jiantao Jiao, Masayoshi Tomizuka, Chen Tang, Wei Zhan. arXiv. 09/2023.
-
Towards Robust Offline-to-Online Reinforcement Learning via Uncertainty and Smoothness
- Xiaoyu Wen, Xudong Yu, Rui Yang, Chenjia Bai, Zhen Wang. arXiv. 09/2023.
-
Sample Efficient Reward Augmentation in offline-to-online Reinforcement Learning
- Ziqi Zhang, Xiao Xiong, Zifeng Zhuang, Jinxin Liu, Donglin Wang. arXiv. 10/2023.
-
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets page
- Ashvin Nair, Abhishek Gupta, Murtaza Dalal, Sergey Levine. arXiv. 06/2020.
-
Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets
- Seunghyun Lee, Younggyo Seo, Kimin Lee, Pieter Abbeel, Jinwoo Shin. Offline RL workshop. 12/2020.
-
Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble
- Seunghyun Lee, Younggyo Seo, Kimin Lee, Pieter Abbeel, Jinwoo Shin. CoRL 2021. 07/2021.
-
Jump-Start Reinforcement Learning
- Ikechukwu Uchendu, Ted Xiao, Yao Lu, Banghua Zhu, Mengyuan Yan, Joséphine Simon, Matthew Bennice, Chuyuan Fu, Cong Ma, Jiantao Jiao, Sergey Levine, Karol Hausman. ICML 2023. 04/2022.
-
Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress page
- Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare. NeurIPS 2022. 06/2022.
-
Don’t Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning page
- Homer Walke, Jonathan Yang, Albert Yu, Aviral Kumar, Jedrzej Orbik, Avi Singh, Sergey Levine. CoRL 2022. 07/2022.
-
Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient
- Yuda Song, Yifei Zhou, Ayush Sekhari, J. Andrew Bagnell, Akshay Krishnamurthy, Wen Sun. ICLR 2023. 10/2022.
-
Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials
- Aviral Kumar, Anikait Singh, Frederik Ebert, Mitsuhiko Nakamoto, Yanlai Yang, Chelsea Finn, Sergey Levine. RSS. 10/2022.
-
MOTO: Offline to Online Fine-tuning for Model-Based Reinforcement Learning
- Rafael Rafailov, Kyle Beltran Hatch, Victor Kolev, John D Martin, Mariano Phielipp, Chelsea Finn. ICLR 2023 Workshop: Reincarnating Reinforcement Learning.
-
Guiding Online Reinforcement Learning with Action-Free Offline Pretraining
- Deyao Zhu, Yuhui Wang, Jürgen Schmidhuber, Mohamed Elhoseiny. arXiv. 01/2023.
-
- Ashvin Nair, Brian Zhu, Gokul Narayanan, Eugen Solowjow, Sergey Levine. ICRA 2023.
-
Efficient Online Reinforcement Learning with Offline Data
- Philip J. Ball, Laura Smith, Ilya Kostrikov, Sergey Levine. ICML 2023. 02/2023.
-
Policy Expansion for Bridging Offline-to-Online Reinforcement Learning
- Haichao Zhang, Wei Xu, Haonan Yu. ICLR 2023. 02/2023.
-
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
- Mitsuhiko Nakamoto, Yuexiang Zhai, Anikait Singh, Max Sobol Mark, Yi Ma, Chelsea Finn, Aviral Kumar, Sergey Levine. NeurIPS 2023. 03/2023.
-
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
- Han Zheng, Xufang Luo, Pengfei Wei, Xuan Song, Dongsheng Li, Jing Jiang. AAAI 2023. 03/2023.
-
Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions
- Yicheng Luo, Jackie Kay, Edward Grefenstette, Marc Peter Deisenroth. arXiv. 03/2023.
-
PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning
- Jianxiong Li, Xiao Hu, Haoran Xu, Jingjing Liu, Xianyuan Zhan, Ya-Qin Zhang. arXiv. 05/2023.
-
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
- Tianying Ji, Yu Luo, Fuchun Sun, Xianyuan Zhan, Jianwei Zhang, Huazhe Xu. arXiv. 06/2023.
-
A Simple Unified Uncertainty-Guided Framework for Offline-to-Online Reinforcement Learning
- Siyuan Guo, Yanchao Sun, Jifeng Hu, Sili Huang, Hechang Chen, Haiyin Piao, Lichao Sun, Yi Chang. arXiv. 06/2023.
-
Sample Efficient Offline-to-Online Reinforcement Learning
- Siyuan Guo, Lixin Zou, et al. IEEE Transactions on Knowledge and Data Engineering (Early Access). 08/2023.
-
Adaptive Offline Data Replay in Offline-to-Online Reinforcement Learning
- . reviewed at ICLR 2024.
-
Bayesian Offline-to-Online Reinforcement Learning: A Realist Approach
- Hao Hu, Yiqin Yang, Jianing Ye, Ziqing Mai, Yujing Hu, Tangjie Lv, Changjie Fan, Qianchuan Zhao, Chongjie Zhang. reviewed at ICLR 2024 -> Accepted at ICML 2024.
-
SERA: Sample Efficient Reward Augmentation in offline-to-online Reinforcement Learning
- Ziqi Zhang, Xiao Xiong, Zifeng Zhuang, Jinxin Liu, Donglin Wang. reviewed at ICLR 2024.
-
Planning to Go Out-of-Distribution in Offline-to-Online Reinforcement Learning
- Trevor McInroe, Stefano V. Albrecht, Amos Storkey. arXiv, reviewed at ICLR 2024. 10/2023.
-
Offline RL for Online RL: Decoupled Policy Learning for Mitigating Exploration Bias
- Max Sobol Mark, Archit Sharma, Fahim Tajwar, Rafael Rafailov, Sergey Levine, Chelsea Finn. arXiv, reviewed at ICLR 2024. 10/2023.
-
A Simple Unified Uncertainty-Guided Framework for Offline-to-Online Reinforcement Learning
- Siyuan Guo, Yanchao Sun, Jifeng Hu, Sili Huang, Hechang Chen, Haiyin Piao, Lichao Sun, Yi Chang. reviewed at ICLR 2024.
-
- Kun Lei, Zhengmao He, Chenhao Lu, Kaizhe Hu, Yang Gao, Huazhe Xu. reviewed at ICLR 2024.
-
- . reviewed at ICLR 2024.
-
Guided Decoupled Exploration for Offline Reinforcement Learning Fine-tuning
- . reviewed at ICLR 2024.
-
Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees
- Yifei Zhou, Ayush Sekhari, Yuda Song, Wen Sun. reviewed at ICLR 2024.
-
Collaborative World Models: An Online-Offline Transfer RL Approach
- Qi Wang, Junming Yang, Yunbo Wang, Xin Jin, Wenjun Zeng, Xiaokang Yang. reviewed at ICLR 2024.
-
SUF: Stabilized Unconstrained Fine-Tuning for Offline-to-Online Reinforcement Learning
- Jiaheng Feng, Mingxiao Feng, Haolin Song, Wengang Zhou, Houqiang Li. AAAI 2024.
-
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
- Yinmin Zhang, Jie Liu, Chuming Li, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang. AAAI 2024.
-
Efficient and Stable Offline-to-online Reinforcement Learning via Continual Policy Revitalization
- Rui Kong, Chenyang Wu, Chen-Xiao Gao, Zongzhang Zhang, Ming Li. IJCAI 2024.
-
ENOTO: Improving Offline-to-Online Reinforcement Learning with Q-Ensembles
- Kai Zhao, Yi Ma, Jianye Hao, Jinyi Liu, Yan Zheng, Zhaopeng Meng. IJCAI 2024. arXiv. 06/2023.
-
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
- Hao Hu, Yiqin Yang, Jianing Ye, Chengjie Wu, Ziqing Mai, Yujing Hu, Tangjie Lv, Changjie Fan, Qianchuan Zhao, Chongjie Zhang. ICML 2024.
-
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
- Xu-Hui Liu, Tian-Shuo Liu, Shengyi Jiang, Ruifeng Chen, Zhilong Zhang, Xinwei Chen, Yang Yu. ICML 2024.
-
OLLIE: Imitation Learning from Offline Pretraining to Online Finetuning
- Sheng Yue, Xingyuan Hua, Ju Ren, Sen Lin, Junshan Zhang, Yaoxue Zhang. ICML 2024.
-
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL
- Yu Luo, Tianying Ji, Fuchun Sun, Jianwei Zhang, Huazhe Xu, Xianyuan Zhan. ICML 2024.
- Planning to Go Out-of-Distribution in Offline-to-Online Reinforcement Learning
- Trevor McInroe, Stefano V. Albrecht, Amos Storkey. arXiv, reviewed at ICLR 2024. RLC 2024.