Gene expression microarrays characterise the transcriptional activity of a biological sample. Their high dimensionality hampers pattern recognition and extraction. Several approaches have been proposed for gleaning information about the hidden structure of the data. Among these, deep generative models provide a powerful way of approximating the manifold on which the data reside. Here, we develop GEESE, a deep learning-based framework that provides novel insight into manifold learning for gene expression data by employing a metabolic model to constrain the learned representation. We evaluated the proposed framework, showing its ability to capture biologically relevant features and to encode them in a much simpler latent space. We showed that using a metabolic model to drive the autoencoder learning process helps achieve better generalisation to unseen data. GEESE provides a novel perspective on the problem of unsupervised learning for biological data.
Metabolically driven latent space learning for gene expression data / Barsacchi, M.; Andres-Terre, H.; Lio, P. (2022), pp. 131-155. [10.1142/9781800610941_0005].
Metabolically driven latent space learning for gene expression data
Lio P.
2022