Distributed Learning for Multiple Source Data / Altilio, Rosa. - (2018 Feb 22).

Distributed Learning for Multiple Source Data

ALTILIO, ROSA
22/02/2018

Abstract

Distributed learning is the problem of inferring a function when the data to be analyzed are distributed across a network of agents. Different application domains may impose very different constraints on the solution, including low computational power at each location, limited underlying connectivity (e.g., no broadcasting capability), or transferability constraints due to the enormous bandwidth required. As a result, it is no longer possible to send the data to a central node, where learning algorithms are traditionally run; new techniques able to model and exploit the information in big data locally are needed. Motivated by these observations, this thesis proposes new techniques that efficiently overcome the need for a fully centralized implementation, without requiring a coordinating node and using only in-network communication. The focus is on both supervised and unsupervised distributed learning procedures that, so far, have been addressed only in very specific settings. For instance, some of them are not actually distributed, because they merely split the computation among different subsystems; others require a fusion center that collects data from all the agents at each iteration; still others can be implemented only on specific network topologies, such as fully connected graphs. In the first part of this thesis, these limits are overcome by using spectral clustering, ensemble clustering, or density-based approaches to realize a purely distributed architecture in which there is no hierarchy and all agents are peers. Each agent learns only from its own dataset, while information about the other agents is unknown to it and is obtained in a decentralized way through communication and collaboration among the agents. Experimental results and theoretical convergence properties demonstrate the effectiveness of these proposals.
In the subsequent part of the thesis, the proposed contributions are tested in several real-world distributed applications. Telemedicine and e-health are found to be among the most fruitful areas for this purpose. The mapping of learning algorithms onto low-power hardware resources is also found to be an interesting application area in the context of distributed wireless networks. Finally, a study on the generation and control of renewable energy sources is presented. Overall, the algorithms presented throughout the thesis cover a wide range of possible practical applications and pave the way for many future extensions, in both scientific research and technology transfer.
Files attached to this record
File: Tesi dottorato Altilio (open access)
Type: Doctoral thesis
License: Creative Commons
Size: 13.48 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1079281