Dispatching policies shape delay and throughput in multi-server data centers, yet the fidelity of classical queueing models under production workloads remains unclear. We combine analytical modeling with trace-driven simulation to reassess Round Robin (RR), Join-Idle-Queue (JIQ), and Least-Work-Left (LWL) using job-level and task-level views of Google ClusterData v3 and Alibaba Cluster Trace v2018. Under controlled Poisson arrivals with Weibull service times, the analytical models match the simulation closely. We then examine model-trace discrepancies through controlled manipulations: shuffling inter-arrival times, replacing arrivals with a Poisson process, shuffling task Central Processing Unit (CPU) times, and trimming the top 0.1% of service demands. Hidden dependence and rare very large jobs explain most gaps; when both sequences are randomized and outliers removed, job-level predictions align with simulation. At the task level, where jobs decompose into independently dispatched tasks, policy ordering may change: in a production trace case, JIQ often matches or surpasses LWL, while RR remains weakest. We also introduce a simple analytical approximation for JIQ that is easy to evaluate and accurate in the controlled setting. Overall, the study clarifies when analytical models hold, identifies workload features that break them, and informs dispatcher choice under production conditions.
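The controlled setting and the trace manipulations described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' simulator: it assumes immediate dispatch to first-come-first-served servers, a JIQ variant that falls back to a uniformly random server when none is idle, and illustrative parameter values (`rate`, `shape`, `scale`, `n_servers`) that are not taken from the paper. All function names are hypothetical.

```python
import random

def make_workload(n_jobs, rate, shape, scale, seed=0):
    """Poisson arrivals (exponential gaps) with Weibull service times."""
    rng = random.Random(seed)
    t, arrivals, services = 0.0, [], []
    for _ in range(n_jobs):
        t += rng.expovariate(rate)
        arrivals.append(t)
        # random.weibullvariate takes (scale, shape) in that order
        services.append(rng.weibullvariate(scale, shape))
    return arrivals, services

def simulate(arrivals, services, n_servers, policy, seed=1):
    """Mean response time under RR, JIQ, or LWL with FCFS servers (assumed)."""
    rng = random.Random(seed)
    free_at = [0.0] * n_servers  # time at which each server clears its backlog
    total = 0.0
    for k, (t, s) in enumerate(zip(arrivals, services)):
        if policy == "RR":
            i = k % n_servers
        elif policy == "LWL":
            # least unfinished work is the earliest backlog-clearing time
            i = min(range(n_servers), key=lambda j: free_at[j])
        elif policy == "JIQ":
            idle = [j for j in range(n_servers) if free_at[j] <= t]
            # assumed fallback: uniformly random server when none is idle
            i = rng.choice(idle) if idle else rng.randrange(n_servers)
        else:
            raise ValueError(f"unknown policy: {policy}")
        start = max(t, free_at[i])
        free_at[i] = start + s
        total += free_at[i] - t  # waiting time plus service time
    return total / len(arrivals)

def shuffle_interarrivals(arrivals, seed=2):
    """Permute inter-arrival gaps to destroy temporal dependence."""
    rng = random.Random(seed)
    gaps = [b - a for a, b in zip([0.0] + arrivals[:-1], arrivals)]
    rng.shuffle(gaps)
    out, t = [], 0.0
    for g in gaps:
        t += g
        out.append(t)
    return out

def trim_top(arrivals, services, frac=0.001):
    """Drop the jobs with the largest service demands (top `frac` share)."""
    k = round(len(services) * frac)
    if k == 0:
        return list(arrivals), list(services)
    cutoff = sorted(services, reverse=True)[k]
    kept = [(t, s) for t, s in zip(arrivals, services) if s <= cutoff]
    return [t for t, _ in kept], [s for _, s in kept]

if __name__ == "__main__":
    # Weibull with shape 0.5 and scale 1 has mean 2, so the load is 1.4 * 2 / 4 = 0.7
    arr, svc = make_workload(20000, rate=1.4, shape=0.5, scale=1.0)
    arr2, svc2 = trim_top(shuffle_interarrivals(arr), svc)
    for policy in ("RR", "JIQ", "LWL"):
        print(policy, simulate(arr, svc, 4, policy), simulate(arr2, svc2, 4, policy))
```

For a real trace, the synthetic workload would be replaced by the recorded arrival times and service demands; shuffling and trimming then show how much of a model-trace gap is attributable to dependence and to rare very large jobs.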

Dispatching policies in data center clusters: Insights from Google and Alibaba workloads / Yildiz, Mert; Rolich, Alexey; Baiocchi, Andrea. - In: PERFORMANCE EVALUATION. - ISSN 0166-5316. - 172:(2026), pp. 1-18. [10.1016/j.peva.2026.102551]

Dispatching policies in data center clusters: Insights from Google and Alibaba workloads

Yildiz, Mert; Rolich, Alexey; Baiocchi, Andrea
2026

dispatching; scheduling; data center; realistic workload; cloud computing; parallel scheduling; Join-Idle-Queue
01 Journal publication::01a Journal article
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1760538