RT Journal Article
T1 Equilibrium and non-equilibrium regimes in the learning of restricted Boltzmann machines
A1 Decelle, Aurelien Fabrice
A1 Furtlehner, Cyril
A1 Seoane Bartolomé, Beatriz
AB Training Restricted Boltzmann Machines (RBMs) has long been challenging because of the difficulty of computing the log-likelihood gradient precisely. Over the past decades, many works have proposed more or less successful training recipes, but without studying the crucial quantity of the problem: the mixing time, i.e. the number of Monte Carlo iterations needed to sample new configurations from a model. In this work, we show that this mixing time plays a crucial role in the dynamics and stability of the trained model, and that RBMs operate in two well-defined regimes, namely equilibrium and out-of-equilibrium, depending on the interplay between the mixing time of the model and the number of steps, k, used to approximate the gradient. We further show empirically that this mixing time increases as learning proceeds, which often implies a transition from one regime to the other as soon as k becomes smaller than this time. In particular, we show that with the popular k-step (persistent) contrastive divergence approaches, with k small, the dynamics of the learned model are extremely slow and often dominated by strong out-of-equilibrium effects. In contrast, RBMs trained in equilibrium display faster dynamics and a smooth convergence to dataset-like configurations during sampling. We then discuss how to exploit both regimes in practice, depending on the task one aims to fulfill: (i) small k can be used to generate convincing samples in short learning times; (ii) large (or increasingly large) k is needed to learn the correct equilibrium distribution of the RBM. Finally, the existence of these two operational regimes seems to be a general property of energy-based models trained via likelihood maximization.
PB IOP Publishing
YR 2022
FD 2022
LK https://hdl.handle.net/20.500.14352/117467
UL https://hdl.handle.net/20.500.14352/117467
LA eng
NO Decelle, A., Furtlehner, C., & Seoane, B. (2021). Equilibrium and non-equilibrium regimes in the learning of restricted Boltzmann machines. Advances in Neural Information Processing Systems, 34, 5345-5359.
NO "Part of Advances in Neural Information Processing Systems 34 (NeurIPS 2021)". NIPS'21: 35th International Conference on Neural Information Processing Systems, December 6-14, 2021. Edited by: M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan. ISBN: 9781713845393. PR44/21-29937; 2019-T1/TIC-13298; 2019-T1/TIC-12776
NO Comunidad de Madrid (España)
NO Universidad Complutense de Madrid
NO Ministerio de Ciencia e Innovación (España)
NO Agencia Estatal de Investigación (España)
NO European Commission
NO Banco Santander (España)
DS Docta Complutense
RD 19 jun 2026