“Two-Stagification”: Job Dispatching in Large-Scale Clusters via a Two-Stage Architecture / Yildiz, Mert; Rolich, Alexey; Baiocchi, Andrea. - (2025). (23rd Mediterranean Communication and Computer Networking Conference (MedComNet 2025), Cagliari, Italy) [10.1109/MedComNet65822.2025.11103543].

“Two-Stagification”: Job Dispatching in Large-Scale Clusters via a Two-Stage Architecture

Mert Yildiz (first author); Alexey Rolich; Andrea Baiocchi
2025

Abstract

A continuing effort is devoted to devising effective dispatching policies for clusters of First Come First Served servers. Although the optimal solution for dispatchers aware of both job size and server state remains elusive, lower bounds and strong heuristics are known. In this paper, we introduce a two-stage cluster architecture that applies classical Round Robin, Join Idle Queue, and Least Work Left dispatching schemes, coupled with an optimized service-time threshold to separate large jobs from shorter ones. Using both synthetic (Weibull) workloads and real Google data center traces, we demonstrate that our two-stage approach greatly improves upon the corresponding single-stage policies and closely approaches the performance of advanced size- and state-aware methods. Our results highlight that careful architectural design, rather than increased complexity at the dispatcher, can yield significantly better mean response times in large-scale computing environments.
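
The idea of separating short and large jobs with a service-time threshold can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how jobs are separated, so the sketch simply assumes that each job's service time is known when it is dispatched, and it uses Least Work Left inside each stage. The function names, the server counts, and the toy trace are all hypothetical.

```python
from typing import List, Sequence, Tuple

Job = Tuple[float, float]  # (arrival_time, service_time)


def lwl_response_times(jobs: Sequence[Job], n_servers: int) -> List[float]:
    """Response times of FCFS servers under Least Work Left dispatching.

    `jobs` must be sorted by arrival time.
    """
    free_at = [0.0] * n_servers  # time at which each server clears its backlog
    responses = []
    for arrival, service in jobs:
        # Least Work Left: choose the server with the smallest unfinished work.
        k = min(range(n_servers), key=lambda i: max(free_at[i] - arrival, 0.0))
        start = max(free_at[k], arrival)  # FCFS: wait for the backlog to drain
        free_at[k] = start + service
        responses.append(free_at[k] - arrival)
    return responses


def single_stage_mean(jobs: Sequence[Job], n_servers: int) -> float:
    """Mean response time when all servers form one pool."""
    r = lwl_response_times(jobs, n_servers)
    return sum(r) / len(r)


def two_stage_mean(jobs: Sequence[Job], n_short: int, n_long: int,
                   threshold: float) -> float:
    """Mean response time when jobs are split by a service-time threshold.

    Assumption (not from the source): sizes are known at dispatch time, so
    jobs at or below `threshold` go to one group of servers and larger jobs
    go to a separate group.
    """
    short = [j for j in jobs if j[1] <= threshold]
    large = [j for j in jobs if j[1] > threshold]
    r = []
    if short:
        r += lwl_response_times(short, n_short)
    if large:
        r += lwl_response_times(large, n_long)
    return sum(r) / len(r)


if __name__ == "__main__":
    # Hypothetical toy trace: two large jobs arrive first, then two short ones.
    toy = [(0.0, 10.0), (0.0, 10.0), (1.0, 1.0), (2.0, 1.0)]
    print("single stage:", single_stage_mean(toy, n_servers=2))
    print("two stage:   ", two_stage_mean(toy, n_short=1, n_long=1, threshold=5.0))
```

In this toy trace the short jobs no longer queue behind the large ones once the servers are split. The trace is only an illustration and says nothing about the gains reported in the paper, which depend on the workload and on the optimized threshold.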
23rd Mediterranean Communication and Computer Networking Conference (MedComNet 2025)
Data centers; Dispatching; Scheduling; Multiple parallel servers; Real-world workload; Large-Scale multiserver system; Workload traffic measurements
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
Yildiz_Two-Stagification_2025.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 382.01 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1741139