A recurring task in security monitoring/anomaly detection applications consists in finding the so-called top “spreaders” (“scanners”), for instance hosts which connect to a large number of distinct destinations or hit different ports. Estimating the top k scanners, and their cardinality, using the least amount of memory meanwhile running at multi-Gbps speed, is a non trivial task, as it requires to “remember” the destinations or ports already contacted in the past by each specific host. This paper proposes and assesses an innovative design, called FlowFight. As the name implies, our approach revolves on the idea of deploying a relatively small number of per-flow HyperLogLog approximate counters — only slightly superior to the target k — and involve the potentially huge number of concurrent flows in a sort of dynamic randomized “competition” for entering such set. The algorithm has been tested and integrated in a full-fledged software router such as Vector Packet Processor. Using either synthetic or real traffic traces, we show that FlowFight is able to estimate the top-k cardinality flows with an accuracy of more than 95%, while retaining a processing throughput of around 8 Mpps on a single core. We further show that FlowFight achieves same accuracy of the state of the art competitor SpreadSketch using 10x times less memory with 1.2x times higher throughput.

FlowFight: High performance–low memory top-k spreader detection / Bruschi, V.; Pontarelli, S.; Tollet, J.; Barach, D.; Bianchi, G.. - In: COMPUTER NETWORKS. - ISSN 1389-1286. - 196:(2021), p. 108239. [10.1016/j.comnet.2021.108239]

FlowFight: High performance–low memory top-k spreader detection

Pontarelli S.;
2021

Abstract

A recurring task in security monitoring/anomaly detection applications consists in finding the so-called top “spreaders” (“scanners”), for instance hosts which connect to a large number of distinct destinations or hit different ports. Estimating the top k scanners, and their cardinality, using the least amount of memory meanwhile running at multi-Gbps speed, is a non trivial task, as it requires to “remember” the destinations or ports already contacted in the past by each specific host. This paper proposes and assesses an innovative design, called FlowFight. As the name implies, our approach revolves on the idea of deploying a relatively small number of per-flow HyperLogLog approximate counters — only slightly superior to the target k — and involve the potentially huge number of concurrent flows in a sort of dynamic randomized “competition” for entering such set. The algorithm has been tested and integrated in a full-fledged software router such as Vector Packet Processor. Using either synthetic or real traffic traces, we show that FlowFight is able to estimate the top-k cardinality flows with an accuracy of more than 95%, while retaining a processing throughput of around 8 Mpps on a single core. We further show that FlowFight achieves same accuracy of the state of the art competitor SpreadSketch using 10x times less memory with 1.2x times higher throughput.
2021
Network monitoring; Software router; Superspreaders detection
01 Pubblicazione su rivista::01a Articolo in rivista
FlowFight: High performance–low memory top-k spreader detection / Bruschi, V.; Pontarelli, S.; Tollet, J.; Barach, D.; Bianchi, G.. - In: COMPUTER NETWORKS. - ISSN 1389-1286. - 196:(2021), p. 108239. [10.1016/j.comnet.2021.108239]
File allegati a questo prodotto
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1591641
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 6
  • ???jsp.display-item.citation.isi??? 5
social impact