Statistical physics approach to gene regulatory networks / Fiorentino, Jonathan. - (2019 Feb 20).

Statistical physics approach to gene regulatory networks

FIORENTINO, JONATHAN
20/02/2019

Abstract

Gene expression is a complex process that must be regulated in every cell of every organism to ensure proper functioning throughout the organism's life. It is unavoidably subject to several sources of noise, arising from the low copy number of molecules or from their diffusive motion inside the cell, which impose physical limits on the reliability of regulation. Regulation is generally carried out by networks of interacting molecules (proteins or RNAs), often called gene regulatory networks (GRNs). In this thesis we adopt a statistical physics approach to study the regulation of gene expression at several levels and in different biological contexts. We start with a theoretical treatment of the limits of information flow in simple regulatory networks in which one controller regulates the levels of multiple targets. We treat network parameters as quenched random variables and, using tools from statistical field theory, we characterize the average of the maximum mutual information and its probability distribution as induced by this randomness. The relevance of this study lies in the development of an analytical approach to ensembles of simple GRNs; it shows, among other things, that optimizing the network parameters could be less significant than optimizing the input variable (i.e., the concentration of the controller). This conclusion is strengthened in large networks, where kinetic heterogeneities can be exploited for reliable information transmission. Since the experimental quantification of gene expression is essential for the detailed study of regulatory processes, we then leave the purely theoretical approach in favor of a data-driven one. Specifically, we extend to single-cell RNA-sequencing (scRNA-seq) data an information-theoretic framework that characterizes the information content of samples of complex systems. We use it to evaluate different clusterings of the cells in an scRNA-seq dataset and we show how it can be exploited to identify maximally informative partitions of the data. Next, we study the influence of cell-to-cell variability in gene expression (partly caused by its stochastic nature) on the pathogenesis of multiple sclerosis. We work on scRNA-seq data from monozygotic twins discordant for the disease. Single-cell data allow us to study the variability of gene expression across cells of the same type (in our case B cells), while the monozygotic twins allow us to highlight the non-genetic factors that putatively cause the disease. Relying again on the framework mentioned above, we extract lists of “critical genes” from the dataset for each subject. We refine the method by exploiting previously identified markers of the disease, thereby also obtaining lists of critical genes that are linked to them. These results are then cross-referenced with differential expression and differential noise analyses, leading to a list of candidate genes likely linked to multiple sclerosis, which is currently under experimental validation. Finally, the last part of the thesis concerns the mathematical modeling of an RNA network based on microRNAs that controls the early stage of myogenesis. Our goal is to analyze the regulatory role played by the two modes of microRNA biosynthesis, one controlled by a miRNA-decoy system and the other consisting of production from an independent genomic locus, as well as the role played by competition between coding and non-coding RNAs to bind microRNAs. We show that the miRNA-decoy system is sufficient to tune molecular levels at steady state, while the alternative locus for miRNA transcription serves a dynamical purpose, allowing the quick concentration shifts required by the differentiation process. This study suggests that such joint regulatory mechanisms could be a common feature of other biological processes.
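
As an illustration of what "maximum mutual information" over the input variable means, the sketch below maximizes I(X;Y) over the input distribution of a small discrete noisy channel using the standard Blahut-Arimoto iteration. This is not the statistical field theory calculation of the thesis: the function name `channel_capacity`, the discretized channel matrix, and the example channel are assumptions made purely for illustration.

```python
import math


def channel_capacity(p_y_given_x, tol=1e-9, max_iter=10000):
    """Blahut-Arimoto iteration (illustrative sketch, not the thesis method).

    p_y_given_x[i][j] = P(output j | input i); each row must sum to 1.
    Returns (mutual information in bits at the final input distribution,
    that input distribution).
    """
    n_in = len(p_y_given_x)
    n_out = len(p_y_given_x[0])
    p_x = [1.0 / n_in] * n_in  # start from a uniform input distribution
    lower = 0.0
    for _ in range(max_iter):
        # Output marginal induced by the current input distribution.
        p_y = [sum(p_x[i] * p_y_given_x[i][j] for i in range(n_in))
               for j in range(n_out)]
        # d[i]: KL divergence (in nats) between row i and the output marginal.
        d = [sum(p * math.log(p / p_y[j]) for j, p in enumerate(row) if p > 0)
             for row in p_y_given_x]
        lower = sum(p_x[i] * d[i] for i in range(n_in))  # I(X;Y) at current p_x
        upper = max(d)                                   # upper bound on capacity
        if upper - lower < tol:
            break
        # Multiplicative update: favor inputs whose outputs are more distinctive.
        weights = [p_x[i] * math.exp(d[i]) for i in range(n_in)]
        z = sum(weights)
        p_x = [w / z for w in weights]
    return lower / math.log(2), p_x


# Hypothetical example: a binary symmetric channel with a 10% error rate,
# whose capacity is known in closed form as 1 - H2(0.1), about 0.531 bits.
capacity, p_opt = channel_capacity([[0.9, 0.1], [0.1, 0.9]])
```

Here the channel matrix plays the role of fixed network parameters, and the optimization runs only over the distribution of the input, which corresponds to the controller concentration in the abstract's terminology.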
20-feb-2019
De Martino, Andrea

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1731344