Complex systems abound in the natural and social world, in systems as apparently diverse as the human brain, the immune system, economics, and the World Wide Web. Yet despite three decades of intense research on complexity, many major issues remain only partially resolved, including a good quantitative definition of a complex system. Many approaches from time series analysis and stochastic modeling have been proposed to model the behavior of complex systems from observed time series by separating the system's behavior into observed macroscopic and hidden microscopic scales. One of the most flexible and easiest to implement, in the context of linear signal processing, is the Vector Autoregressive (VAR) model, whose identification is the basis of the most widely used estimators for analyzing the statistical dependencies between time series representing the activity of the entire dynamical system. However, for specific combinations of the number of processes and the number of observations in the time series, the identification procedure can lead to severe correlation between the regressors, resulting in high bias and variance in the estimator; this can be counteracted with penalized regression techniques. The first part of this thesis focuses on introducing and testing multivariate convex regression methodologies as a tool for estimating the statistical dependencies among different dynamical systems. Since no extensive studies are available that assess the performance of different penalized regression techniques under different experimental conditions, in Chapter 1 I report a comparative analysis of penalized regression techniques in the context of convex optimization, which guarantees the existence of a solution to the VAR identification problem.
Another important tool for investigating and quantifying information processing is information theory, which has already proved to be a useful framework for the design and analysis of complex self-organized systems. In this context, a tool has recently been introduced that can compute any measure of information dynamics from the parameters of a VAR model used to characterize an observed multivariate Gaussian process, even in combination with state-space modeling. Motivated by the fact that penalized regression techniques had not yet been introduced and tested for the decomposition of information processing, Chapter 2 investigates the possibility of integrating the so-called LASSO regression into a framework for the computation of these measures. The results of Chapters 1 and 2 clearly demonstrated that these techniques can be computationally very onerous, especially when combined with state-space modeling and applied to very long time series and dynamical systems with a very high number of processes. For this reason, in Chapter 3 we tried to overcome this computational limitation by introducing an Artificial Neural Network equivalent to a VAR model. In particular, thanks to a new training algorithm based on stochastic gradient descent, it was possible to induce sparsity in the weight matrix of the network during the training phase, at a lower computational cost than a traditional LASSO implementation. This new tool was then combined with a state-space model, used for Granger causality (GC) estimation, and tested on different real complex systems. Given the results of Chapters 1, 2, and 3, Chapter 4 presents an extensive analysis of the performance of different methods in estimating GC. In particular, due to the high dimension of the observed data, the “curse of dimensionality” may arise, leading to unreliable estimation of direct causality.
With the aim of carrying out an extensive comparative study, the performance of the different methodologies for GC estimation, both those available in the current literature and those explored in this thesis, was compared. Furthermore, we provided an implementation in combination with state-space models for the methods that had not previously been tested with this strategy. The performance of all the GC estimation methods, with and without state-space models, was tested in two different simulation studies. A conclusion summarizing the main contributions of this Ph.D. project, together with their impact and limitations, closes this dissertation.
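As a rough illustration of the penalized VAR identification described in the abstract, the following minimal sketch fits a VAR model by ridge regression, one convex penalized technique with a closed-form solution. It is not the thesis's implementation: the function names `build_var_regression` and `fit_var_ridge`, the penalty weight `lam`, and the simulated two-process system are all assumptions made for this example.

```python
import numpy as np


def build_var_regression(Y, p):
    """Arrange an (N, M) series into targets (N-p, M) and lagged regressors (N-p, M*p)."""
    N, _ = Y.shape
    # Block k holds the series delayed by k samples, for k = 1..p.
    X = np.hstack([Y[p - k : N - k] for k in range(1, p + 1)])
    return Y[p:], X


def fit_var_ridge(Y, p, lam):
    """Ridge-penalized least-squares estimate of VAR(p) coefficients (hypothetical helper).

    Returns an (M, M*p) matrix; row i holds the coefficients that predict
    process i from the lagged values of all processes.
    """
    target, X = build_var_regression(Y, p)
    # The penalty lam * I keeps the normal equations well conditioned
    # even when the lagged regressors are strongly correlated.
    gram = X.T @ X + lam * np.eye(X.shape[1])
    B = np.linalg.solve(gram, X.T @ target)
    return B.T


# Simulated VAR(1) system (illustrative values): process 0 drives process 1,
# but process 1 does not drive process 0.
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.0],
                   [0.4, 0.3]])
N = 2000
Y = np.zeros((N, 2))
for t in range(1, N):
    Y[t] = A_true @ Y[t - 1] + rng.standard_normal(2)

A_hat = fit_var_ridge(Y, p=1, lam=1.0)
print(np.round(A_hat, 2))
```

Ridge shrinks coefficients without setting them exactly to zero. A sparsity-inducing penalty such as LASSO, as investigated in the thesis, would need an iterative solver instead of the single linear solve used here.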
Estimating functional networks in dynamical biological systems through convex optimization theory / Antonacci, Yuri. - (2021 May 19).
Estimating functional networks in dynamical biological systems through convex optimization theory
ANTONACCI, YURI
19/05/2021
File | Size | Format |
---|---|---|
Tesi_dottorato_Antonacci.pdf (open access from 20/05/2022; type: doctoral thesis; license: all rights reserved) | 9.59 MB | Adobe PDF |