Load-Aware Shedding in Stream Processing Systems / Rivetti, Nicolò; Busnel, Yann; Querzoni, Leonardo. - (2020), pp. 121-153. - LECTURE NOTES IN COMPUTER SCIENCE. [10.1007/978-3-662-62386-2_5].

Load-Aware Shedding in Stream Processing Systems

Querzoni, Leonardo
2020

Abstract

Distributed stream processing systems are gaining momentum as a tool for performing analytics on continuous data streams. Load shedding is a technique used to handle unpredictable spikes in the input load whenever the available computing resources are not adequately provisioned. In this paper, we propose Load-Aware Shedding (LAS), a novel load shedding solution that, unlike previous works, relies neither on a pre-defined cost model nor on any assumption about the tuple execution duration. Leveraging sketches, LAS efficiently estimates the execution duration of each tuple with small error bounds and uses this knowledge to proactively shed input streams at any operator, limiting queuing latencies while dropping as few tuples as possible. We provide a theoretical analysis proving that LAS is an (ε, δ)-approximation of the optimal online load shedder. Furthermore, through an extensive practical evaluation based on simulations and a prototype, we evaluate its impact on stream processing applications.
Published in: Transactions on Large-Scale Data- and Knowledge-Centered Systems XLVI
ISBN: 978-3-662-62385-5; 978-3-662-62386-2
Keywords: Data streaming; Distributed systems; Load-shedding; Stream processing
Type: 02 Publication in volume::02a Chapter or article
Files attached to this item
Rivetti_Load-Aware_2020.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 2.73 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1477119