Pseudo-likelihood produces associative memories able to generalize, even for asymmetric couplings / D'Amico, Francesco; Bocchi, Dario; Del Bono, Luca Maria; Rossi, Saverio; Negri, Matteo. - In: PHYSICA. A. - ISSN 0378-4371. - 692:(2026), pp. 1-16. [10.1016/j.physa.2026.131497]
Pseudo-likelihood produces associative memories able to generalize, even for asymmetric couplings
Francesco D'Amico; Dario Bocchi; Luca Maria Del Bono; Saverio Rossi; Matteo Negri
2026
Abstract
Energy-based probabilistic models learned by maximizing the likelihood of the data are limited by the intractability of the partition function. A widely used workaround is to maximize the pseudo-likelihood, which replaces the global normalization with tractable local normalizations. Here we show that, in the zero-temperature limit, a network trained to maximize the pseudo-likelihood naturally implements an associative memory: if the training set is small, the patterns become fixed-point attractors whose basins of attraction exceed those of any classical Hopfield rule. We explain this effect quantitatively for uncorrelated random patterns. Moreover, we show that, for structured datasets from computer science (random feature model, MNIST), physics (spin glasses) and biology (proteins), as the number of training examples increases the learned network goes beyond memorization, developing attractors strongly correlated with test examples and thus showing the ability to generalize. Our results therefore reveal that pseudo-likelihood works both as an efficient inference tool and as a principled mechanism for memory and generalization.
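The abstract condenses a concrete pipeline: fit the couplings of a pairwise spin model by maximizing the pseudo-likelihood, then run zero-temperature dynamics and check that training patterns act as fixed-point attractors. Below is a minimal sketch of that pipeline, assuming Ising (±1) variables with logistic local conditionals; the sizes, learning rate, and noise level are illustrative assumptions, not the paper's settings.

```python
import numpy as np

# Minimal sketch (our illustration, not the authors' code): train couplings J
# by pseudo-likelihood maximization on binary patterns, then verify that a
# corrupted pattern flows back to the stored one under zero-temperature dynamics.

rng = np.random.default_rng(0)
N, P = 100, 5                          # number of spins, number of patterns
patterns = rng.choice([-1, 1], size=(P, N)).astype(float)

def pseudo_ll_grad(J, X):
    """Gradient of the mean log-pseudo-likelihood sum_i log p(s_i | s_{-i}).

    For Ising spins, p(s_i | s_{-i}) = exp(s_i h_i) / (2 cosh h_i) with local
    field h_i = sum_{j != i} J_ij s_j, so dPL/dJ_ij = <(s_i - tanh h_i) s_j>.
    """
    H = X @ J.T                        # H[m, i] = local field h_i on sample m
    G = (X - np.tanh(H)).T @ X / len(X)
    np.fill_diagonal(G, 0.0)           # no self-couplings
    return G

# Train the couplings by plain gradient ascent; note that J is NOT
# constrained to be symmetric.
J = np.zeros((N, N))
for _ in range(1000):
    J += 0.05 * pseudo_ll_grad(J, patterns)

def zero_temperature_dynamics(J, s, sweeps=50):
    """Sequential zero-temperature updates s_i <- sign(sum_j J_ij s_j)."""
    s = s.copy()
    for _ in range(sweeps):
        for i in rng.permutation(len(s)):
            h = J[i] @ s
            if h != 0.0:
                s[i] = np.sign(h)
    return s

# Associative-memory check: a pattern with 10% of spins flipped should be
# attracted back to the stored pattern (overlap close to 1).
noisy = patterns[0] * rng.choice([1.0, -1.0], size=N, p=[0.9, 0.1])
recovered = zero_temperature_dynamics(J, noisy)
print("overlap with stored pattern:", recovered @ patterns[0] / N)
```

Because each conditional p(s_i | s_{-i}) is fit independently, nothing in this procedure enforces J_ij = J_ji, which is consistent with the title's point that the resulting associative memories work even for asymmetric couplings.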
| File | Size | Format |
|---|---|---|
| DAmico_Pseudo-likelihood_2026.pdf (open access). Note: journal article. Type: publisher's version (published with the publisher's layout). License: Creative Commons. | 2.02 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


