Modern stream processing engines (SPEs) process large volumes of events propagated at high velocity through multiple queries. To improve performance, existing SPEs generally aim to minimize query output latency by minimizing, in turn, the propagation delay of events in query pipelines. However, for queries containing commonly used blocking operators such as windows, this scheduling approach can be inefficient. Watermarks are events popularly utilized by SPEs to correctly process window operators. Watermarks are injected into the stream to signify that no events preceding their timestamp should be further expected. Through the design and development of Klink, we leverage these watermarks to robustly infer stream progress based on window deadlines and network delay, and to schedule query pipeline execution that reflects stream progress. Klink aims to unblock window operators and to rapidly propagate events to output operators while performing judicious memory management. We integrate Klink into the popular open source SPE Apache Flink and demonstrate that Klink delivers significant performance gains over existing scheduling policies on benchmark workloads for both scale-up and scale-out deployments.

Klink: progress-aware scheduling for streaming data systems / Farhat, Omar; Daudjee, Khuzaima; Querzoni, Leonardo. - (2021), pp. 485-498. (Paper presented at the ACM Special Interest Group on Management of Data Conference, held in Xi'an, Shaanxi, China) [10.1145/3448016.3452794].

Klink: progress-aware scheduling for streaming data systems

Querzoni, Leonardo
2021

ACM Special Interest Group on Management of Data Conference
stream processing; scheduling; distributed systems
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item

Farhat_Klink_2021.pdf
Access: archive administrators only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 1.79 MB
Format: Adobe PDF

Farhat_postprint_Klink_2021.pdf.pdf
Access: open access
Note: https://doi.org/10.1145/3448016.3452794
Type: Post-print (version after peer review, accepted for publication)
License: All rights reserved
Size: 986.64 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1555552