Distributed learning is the problem of inferring a function when data to be analyzed is distributed across a network of agents. Separate domains of application may largely impose different constraints on the solution, including low computational power at every location, limited underlying connectivity (e.g. no broadcasting capability) or transferability constraints related to the enormous bandwidth requirement. Thus, it is no longer possible to send data in a central node where traditionally learning algorithms are used, while new techniques able to model and exploit locally the information on big data are necessary. Motivated by these observations, this thesis proposes new techniques able to efficiently overcome a fully centralized implementation, without requiring the presence of a coordinating node, while using only in-network communication. The focus is given on both supervised and unsupervised distributed learning procedures that, so far, have been addressed only in very specific settings only. For instance, some of them are not actually distributed because they just split the calculation between different subsystems, others call for the presence of a fusion center collecting at each iteration data from all the agents; some others are implementable only on specific network topologies such as fully connected graphs. In the first part of this thesis, these limits have been overcome by using spectral clustering, ensemble clustering or density-based approaches for realizing a pure distributed architecture where there is no hierarchy and all agents are peer. Each agent learns only from its own dataset, while the information about the others is unknown and obtained in a decentralized way through a process of communication and collaboration among the agents. Experimental results, and theoretical properties of convergence, prove the effectiveness of these proposals. In the successive part of the thesis, the proposed contributions have been tested in several real-word distributed applications. Telemedicine and e-health applications are found to be one of the most prolific area to this end. Moreover, also the mapping of learning algorithms onto low-power hardware resources is found as an interesting area of applications in the distributed wireless networks context. Finally, a study on the generation and control of renewable energy sources is also analyzed. Overall, the algorithms presented throughout the thesis cover a wide range of possible practical applications, and trace the path to many future extensions, either as scientific research or technological transfer results.
Distributed Learning for Multiple Source Data / Altilio, Rosa. - (2018 Feb 22).
Distributed Learning for Multiple Source Data
ALTILIO, ROSA
22/02/2018
Abstract
Distributed learning is the problem of inferring a function when data to be analyzed is distributed across a network of agents. Separate domains of application may largely impose different constraints on the solution, including low computational power at every location, limited underlying connectivity (e.g. no broadcasting capability) or transferability constraints related to the enormous bandwidth requirement. Thus, it is no longer possible to send data in a central node where traditionally learning algorithms are used, while new techniques able to model and exploit locally the information on big data are necessary. Motivated by these observations, this thesis proposes new techniques able to efficiently overcome a fully centralized implementation, without requiring the presence of a coordinating node, while using only in-network communication. The focus is given on both supervised and unsupervised distributed learning procedures that, so far, have been addressed only in very specific settings only. For instance, some of them are not actually distributed because they just split the calculation between different subsystems, others call for the presence of a fusion center collecting at each iteration data from all the agents; some others are implementable only on specific network topologies such as fully connected graphs. In the first part of this thesis, these limits have been overcome by using spectral clustering, ensemble clustering or density-based approaches for realizing a pure distributed architecture where there is no hierarchy and all agents are peer. Each agent learns only from its own dataset, while the information about the others is unknown and obtained in a decentralized way through a process of communication and collaboration among the agents. Experimental results, and theoretical properties of convergence, prove the effectiveness of these proposals. In the successive part of the thesis, the proposed contributions have been tested in several real-word distributed applications. Telemedicine and e-health applications are found to be one of the most prolific area to this end. Moreover, also the mapping of learning algorithms onto low-power hardware resources is found as an interesting area of applications in the distributed wireless networks context. Finally, a study on the generation and control of renewable energy sources is also analyzed. Overall, the algorithms presented throughout the thesis cover a wide range of possible practical applications, and trace the path to many future extensions, either as scientific research or technological transfer results.File | Dimensione | Formato | |
---|---|---|---|
Tesi dottorato Altilio
accesso aperto
Tipologia:
Tesi di dottorato
Licenza:
Creative commons
Dimensione
13.48 MB
Formato
Adobe PDF
|
13.48 MB | Adobe PDF |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.