Single-cell RNA sequencing (scRNA-seq) resolves cellular heterogeneity, but expression alone captures only a slice of cell identity. Regulatory and metabolic activities, read out as transcription factor (TF) activity and metabolic flux, can be inferred from transcripts, yet they are usually analysed separately. We present a functional multi-view integration framework that unifies transcriptional, regulatory, and metabolic information from one scRNA-seq dataset into a single latent representation. From each dataset, we derived four layers: (i) gene expression, (ii) TF regulon activity estimated with pySCENIC, (iii) metabolite features, and (iv) predicted reaction fluxes from scFEA. We integrated these using MOFA+, a probabilistic factor model that decomposes shared and view-specific sources of variation and yields an interpretable embedding for visualisation, clustering, and downstream interpretation. Applied to HER2-enriched breast cancer cell lines, the integrated embedding revealed structure beyond RNA alone. As a baseline, we ran a conventional scRNA-only pipeline (dimensionality reduction, clustering, enrichment), whose results were dominated by cell-cycle programs. In contrast, the multi-view factors exposed coordinated regulatory and metabolic axes that separate proliferative, oxidative, and stress-adaptive states. Clustering in the shared latent space delineated discrete populations that were coherent across layers; within each cluster we linked transcriptional modules to upstream TFs and to salient metabolic activities, giving a mechanistically consistent picture. Functional enrichment reflected this added resolution: the integrated model recovered 770 GO Biological Process 2025 pathways versus 507 for scRNA-only (424 shared; 346 unique to integration; 83 unique to scRNA-only), indicating that core transcriptional programs are preserved while additional regulatory and metabolic dimensions of heterogeneity are revealed.
Together, these results establish a rigorous and generalisable approach for extracting mechanistic signals from inferred functional layers without additional assays. By coupling scRNA-seq with MOFA+-based integration of regulon activity and metabolic flux, we obtain a compact, interpretable representation that sharpens cell-state definitions, refines cluster boundaries, and nominates candidate TFs and pathways underpinning phenotypic diversity in HER2-enriched models. The framework is readily extensible to other cancers and tissues and provides a principled route to deeper insight from routine single-cell transcriptomes.
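The enrichment counts reported above can be cross-checked with simple set arithmetic. The short Python sketch below uses only the three totals given in the abstract (770, 507, and 424 shared); the union size, the Jaccard index, and the retained fraction are derived quantities that the abstract does not report, and the variable names are illustrative.

```python
# Counts reported in the abstract (GO Biological Process 2025 terms).
integrated_total = 770   # terms recovered by the integrated model
rna_only_total = 507     # terms recovered by the scRNA-only baseline
shared = 424             # terms recovered by both

# Terms unique to each analysis follow by subtraction.
unique_integrated = integrated_total - shared
unique_rna_only = rna_only_total - shared

# Derived quantities (not stated in the abstract).
union = integrated_total + rna_only_total - shared   # inclusion-exclusion
jaccard = shared / union                             # overlap of the two term sets
retained = shared / rna_only_total                   # share of baseline terms kept

print(unique_integrated, unique_rna_only, union)   # 346 83 853
print(round(jaccard, 3), round(retained, 3))       # 0.497 0.836
```

The subtraction reproduces the 346 and 83 unique terms quoted in the text, and about 84% of the scRNA-only terms are also recovered by the integrated model.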
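Any multi-view factor model needs its views to describe the same cells before integration. The following Python sketch illustrates that alignment step under stated assumptions: the function names (`align_views`, `center_columns`), the dictionary layout, and the toy values are hypothetical and are not taken from the authors' pipeline or from the MOFA+, pySCENIC, or scFEA APIs. Centring each feature is shown only as one common preprocessing choice.

```python
def align_views(views):
    """Restrict every view to the cells present in all views.

    views: dict mapping view name -> dict mapping cell ID -> feature list.
    Returns the sorted shared cell IDs and one cells x features matrix
    (list of rows) per view, all in the same cell order.
    """
    common = set.intersection(*(set(cells) for cells in views.values()))
    cell_ids = sorted(common)
    aligned = {name: [cells[c] for c in cell_ids] for name, cells in views.items()}
    return cell_ids, aligned


def center_columns(matrix):
    """Subtract the per-feature (column) mean from a cells x features matrix."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]


# Toy values for illustration only; they are not data from the study.
views = {
    "expression": {"cell_a": [1.0, 0.0], "cell_b": [3.0, 2.0], "cell_c": [5.0, 4.0]},
    "regulon_activity": {"cell_a": [0.2], "cell_b": [0.4], "cell_c": [0.6]},
    "metabolite_features": {"cell_a": [1.0], "cell_c": [3.0]},  # cell_b missing
    "reaction_flux": {"cell_a": [0.0], "cell_b": [1.0], "cell_c": [2.0]},
}

cell_ids, aligned = align_views(views)
centered = {name: center_columns(m) for name, m in aligned.items()}

print(cell_ids)                 # ['cell_a', 'cell_c']
print(centered["expression"])   # [[-2.0, -2.0], [2.0, 2.0]]
```

Dropping cells that lack any layer is the simplest policy; a factor model that tolerates missing values could instead keep such cells and leave the missing entries unobserved.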

From expression to function: multi-layer integration of inferred regulatory and metabolic states in scRNA-seq data / Napoli, Chiara; Bardozzo, Francesco; Verma, Suraj; Minh Thao Doan, Le; Fiore, Pierpaolo; Faggiano, Carmen; Angione, Claudio; Occhipinti, Annalisa; Tagliaferri, Roberto. - (2025). (BBCC2025, Naples, Italy).

From expression to function: multi-layer integration of inferred regulatory and metabolic states in scRNA-seq data

Chiara Napoli (first author); 2025

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1756911