
A framework for parallel and distributed training of neural networks

Scardapane, Simone; Di Lorenzo, Paolo
2017

Abstract

The aim of this paper is to develop a general framework for training neural networks (NNs) in a distributed environment, where the training data is partitioned over a set of agents that communicate with each other through a sparse, possibly time-varying, connectivity pattern. In such a distributed scenario, the training problem can be formulated as the (regularized) optimization of a non-convex social cost function, given by the sum of local (non-convex) costs, where each agent contributes a single error term defined with respect to its local dataset. To devise a flexible and efficient solution, we customize a recently proposed framework for non-convex optimization over networks, which hinges on a (primal) convexification-decomposition technique to handle non-convexity and on a dynamic consensus procedure to diffuse information among the agents. Several typical choices for the training criterion (e.g., squared loss, cross-entropy) and regularization (e.g., ℓ2 norm, sparsity-inducing penalties) are included in the framework and explored throughout the paper. Convergence to a stationary solution of the social non-convex problem is guaranteed under mild assumptions. Additionally, we show a principled way for each agent to exploit an available multi-core architecture (e.g., a local cloud) to parallelize its local optimization step, resulting in strategies that are both distributed (across the agents) and parallel (inside each agent) in nature. A comprehensive set of experimental results validates the proposed approach.
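Concretely, the social training problem sketched in the abstract can be written as a regularized sum of local costs (the notation below is chosen for illustration; see the paper for the authors' exact formulation):

\[
\min_{\mathbf{w}} \; U(\mathbf{w}) \;=\; \sum_{i=1}^{N} f_i(\mathbf{w}) \;+\; \lambda\, r(\mathbf{w}),
\]

where \(f_i\) is the (non-convex) training error of agent \(i\) on its local dataset, \(r\) is a regularizer such as the ℓ2 norm or a sparsity-inducing penalty, and \(\lambda \geq 0\) weighs the two terms.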
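To make the algorithmic pattern concrete, below is a minimal NumPy sketch of one plausible instantiation of such a scheme: a linearized strongly convex surrogate per agent, plus a dynamic-consensus (gradient-tracking) step over a fixed ring network. This is not the authors' implementation; the toy loss, the mixing matrix, and the step sizes are all assumptions made for illustration, and the regularizer is omitted for brevity.

import numpy as np

# Toy setup: N agents, each holding a local dataset and a non-convex local loss.
# Everything below (losses, topology, step sizes) is an illustrative assumption.
rng = np.random.default_rng(0)
N, d = 4, 5                                   # number of agents, parameter dimension

# Hypothetical local data: agent i holds (A[i], b[i]).
A = [rng.standard_normal((20, d)) for _ in range(N)]
b = [rng.standard_normal(20) for _ in range(N)]

def grad_f(i, w):
    """Gradient of the non-convex local loss f_i(w) = 0.5 * ||tanh(A_i w) - b_i||^2."""
    z = np.tanh(A[i] @ w)
    return A[i].T @ ((z - b[i]) * (1.0 - z**2))

# Doubly stochastic mixing matrix for a fixed ring topology (illustrative choice).
W = np.zeros((N, N))
for i in range(N):
    W[i, i] = 0.5
    W[i, (i + 1) % N] = 0.25
    W[i, (i - 1) % N] = 0.25

tau, alpha = 1.0, 0.1                         # surrogate weight and step size (assumed)
x = np.zeros((N, d))                          # local estimates of the shared weights
g_old = np.array([grad_f(i, x[i]) for i in range(N)])
y = g_old.copy()                              # dynamic-consensus gradient trackers

for t in range(200):
    # 1) Local convexification: each agent minimizes the strongly convex surrogate
    #    y_i^T w + (tau/2) * ||w - x_i||^2, whose closed-form minimizer is below.
    x_tilde = x - y / tau
    x_half = x + alpha * (x_tilde - x)        # relaxed local update

    # 2) Consensus step: diffuse the iterates among neighboring agents.
    x = W @ x_half

    # 3) Dynamic consensus: track the network-average gradient.
    g_new = np.array([grad_f(i, x[i]) for i in range(N)])
    y = W @ y + (g_new - g_old)
    g_old = g_new

print("agent disagreement:", np.linalg.norm(x - x.mean(axis=0)))

Richer convex surrogates (e.g., ones keeping any convex part of the local cost intact) fit the same template, and their per-agent minimization is the step that the abstract's multi-core parallelization would target.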
Keywords: distributed learning; networks; neural network; parallel computing; cognitive neuroscience; artificial intelligence
01 Journal publication::01a Journal article
Scardapane, Simone; Di Lorenzo, Paolo. A framework for parallel and distributed training of neural networks. In: Neural Networks (ISSN 0893-6080), vol. 91 (2017), pp. 42-54. DOI: 10.1016/j.neunet.2017.04.004
Files attached to this product
File: Scardapane_Framework_2017.pdf (access restricted to authorized users)
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 1.06 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/966754
Citations
  • PMC: 1
  • Scopus: 20
  • Web of Science: 16