Uno: A One-Stop Solution for Inter- and Intra-Data Center Congestion Control and Reliable Connectivity / Bonato, Tommaso; Abdous, Sepehr; Kabbani, Abdul; Ahmad Ghalayini, And; Gebara, Nadeen; Lam, Terry; Agarwal, Anup; Chen, Tiancheng; Yu, Zhuolong; Taranov, Konstantin; Elhaddad, Mahmoud; De Sensi, Daniele; Ghorbani, Soudeh; Hoefler, Torsten. - (2025). (Paper presented at the International Conference for High Performance Computing, Networking, Storage and Analysis (was Supercomputing Conference), held in St. Louis, USA).
Uno: A One-Stop Solution for Inter- and Intra-Data Center Congestion Control and Reliable Connectivity
Daniele De Sensi
2025
Abstract
Cloud computing and AI workloads are driving unprecedented demand for efficient communication within and across datacenters. However, the coexistence of intra- and inter-datacenter traffic within datacenters, together with the disparity between the RTTs of intra- and inter-datacenter networks, complicates congestion management and traffic routing. In particular, the faster congestion response of intra-datacenter traffic causes rate unfairness when it competes with slower inter-datacenter flows. Additionally, inter-datacenter messages suffer from slow loss recovery and thus require reliability support. Existing solutions overlook these challenges and handle inter- and intra-datacenter congestion with separate control loops or at different granularities. We propose Uno, a unified system for both inter- and intra-DC environments that integrates a transport protocol for rapid congestion reaction and fair rate control with a load balancing scheme that combines erasure coding and adaptive routing. Our findings show that Uno significantly improves the completion times of both inter- and intra-DC flows compared to state-of-the-art methods such as Gemini.


