Simulations of multi-modal distributions can be very costly and often lead to unreliable predictions. To accelerate the computations, we propose to sample from a flattened distribution and to estimate the importance weights between the original distribution and the flattened distribution, so that the reweighted samples still target the original distribution.
We refer interested readers to the blog post here. Chinese readers may also find this blog post on 知乎 (Zhihu) interesting.
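
The reweighting idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from this repository: the double-well `energy`, the grid, the bin width `DU`, the number of subregions `M`, and the choice to compute `theta` exactly from the grid (in CSGLD it is estimated on the fly). Samples are drawn from a flattened distribution proportional to `pi(x) / theta[J(x)]`, where `J(x)` is the energy subregion of `x`, and are then reweighted by `theta[J(x)]` to recover an expectation under the original distribution.

```python
import math
import random

def energy(x):
    # Hypothetical double-well energy with modes near x = -2 and x = +2.
    return 0.5 * (x * x - 4.0) ** 2

DU, M = 0.5, 40  # assumed energy bin width and number of energy subregions

def energy_index(x):
    # Index of the energy subregion containing U(x); the last bin is open-ended.
    return min(int(energy(x) / DU), M - 1)

# Discretize the state space so both distributions can be sampled exactly.
grid = [-4.0 + 0.01 * i for i in range(801)]
target = [math.exp(-energy(x)) for x in grid]  # unnormalized pi(x)

# theta[j]: probability mass that pi assigns to energy subregion j.
theta = [0.0] * M
for x, p in zip(grid, target):
    theta[energy_index(x)] += p
total = sum(theta)
theta = [t / total for t in theta]

# Flattened distribution: pi(x) / theta[J(x)], which puts equal mass
# on every non-empty energy subregion.
flattened = [p / theta[energy_index(x)] for x, p in zip(grid, target)]

rng = random.Random(0)
samples = rng.choices(grid, weights=flattened, k=20000)

# The importance weight of a flattened sample relative to pi is theta[J(x)].
weights = [theta[energy_index(x)] for x in samples]

def f(x):
    return x * x

naive = sum(f(x) for x in samples) / len(samples)  # biased: ignores the weights
estimate = sum(w * f(x) for w, x in zip(weights, samples)) / sum(weights)
truth = sum(p * f(x) for p, x in zip(target, grid)) / sum(target)
print(naive, estimate, truth)
```

The self-normalized `estimate` should land close to `truth`, while `naive` does not, because the flattened samples spend much more time in high-energy regions than the original distribution does.
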
Methods | Speed | Special features | Cost |
---|---|---|---|
SGLD (ICML'11) | Extremely slow | None | None |
Cyclic SGLD (ICLR'20) | Medium | Cyclic learning rates | More cycles |
Replica exchange SGLD (ICML'20) | Fast | Swaps/Jumps | Parallel chains |
Contour SGLD (NeurIPS'20) | Fast | Bouncy moves | Latent vector |
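
The "bouncy moves" and "latent vector" entries for Contour SGLD can be sketched on a toy problem. The code below is a rough illustration that loosely follows the update rules as described in the CSGLD paper; it is not this repository's implementation. The 1D double-well energy, the clamp to `[-4, 4]`, the step-size schedule `omega`, and all hyperparameter values are assumptions chosen for the example. The latent vector `theta` lives on the probability simplex and is updated by stochastic approximation, and the gradient multiplier can turn negative, which pushes the sampler uphill out of over-visited low-energy regions.

```python
import math
import random

def energy(x):
    # Hypothetical double-well energy (an assumption for this sketch).
    return 0.5 * (x * x - 4.0) ** 2

def grad_energy(x):
    return 2.0 * x * (x * x - 4.0)

def csgld_sketch(steps=5000, m=20, du=0.5, zeta=0.75, tau=1.0, lr=1e-3, seed=0):
    rng = random.Random(seed)
    theta = [1.0 / m] * m  # latent vector, initialized uniformly on the simplex
    x = 2.0

    def index(u):
        # Energy subregion of width du; the last subregion is open-ended.
        return min(int(u / du), m - 1)

    for k in range(1, steps + 1):
        # Sampling step: Langevin update with an adaptive gradient multiplier.
        j = index(energy(x))
        prev = max(j - 1, 0)
        multiplier = 1.0 + zeta * tau * (math.log(theta[j]) - math.log(theta[prev])) / du
        noise = math.sqrt(2.0 * tau * lr) * rng.gauss(0.0, 1.0)
        x = x - lr * multiplier * grad_energy(x) + noise
        x = max(-4.0, min(4.0, x))  # clamp to a bounded domain (assumption)

        # Stochastic approximation step: move theta toward the visited subregion.
        j = index(energy(x))
        omega = 10.0 / (k ** 0.75 + 100.0)  # assumed decaying step size, below 1
        scale = omega * theta[j] ** zeta
        theta = [t + scale * ((1.0 if i == j else 0.0) - t) for i, t in enumerate(theta)]
    return theta, x

theta, x_final = csgld_sketch()
print([round(t, 4) for t in theta])
```

Because each update is a convex combination of the current `theta` and an indicator vector, `theta` stays positive and keeps summing to one, so the logarithms in the multiplier are always defined.
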
The following is a demo to show how the latent vector is gradually estimated
Although this version of CSGLD has a global stability condition, it does not handle high-loss problems appropriately. For a more scalable version (called ICSGLD), please check the paper [Link] and the code [Link]. Because importance sampling suffers from large variance, the conditional independence of ICSGLD gives the algorithm an effective variance reduction when estimating the density of states.
@inproceedings{CSGLD,
title={A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions},
author={Wei Deng and Guang Lin and Faming Liang},
booktitle={Advances in Neural Information Processing Systems},
year={2020}
}
- M. Welling, Y. W. Teh. Bayesian Learning via Stochastic Gradient Langevin Dynamics. ICML'11.
- R. Zhang, C. Li, J. Zhang, C. Chen, A. Wilson. Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning. ICLR'20.
- W. Deng, Q. Feng, L. Gao, F. Liang, G. Lin. Non-convex Learning via Replica Exchange Stochastic Gradient MCMC. ICML'20.
- W. Deng, G. Lin, F. Liang. A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions. NeurIPS'20.