
The evolving nature of malware poses significant challenges for machine learning-based detectors, which need frequent updates to handle new threats. Because keeping all historical data is impractical under storage constraints, Continual Learning (CL) algorithms help by incrementally updating the detectors without retraining on all previously collected data. Unfortunately, updating the model can cause inconsistencies: the new model may raise false positives on goodware that was previously classified correctly, and malware detected by the previous model may go undetected by the new one. This issue, referred to as security regression, is often overlooked in concurrent work but can undermine user trust even when overall detection performance improves. In this work, we address this issue by proposing a learning strategy that combines a replay-based CL method with a regression-aware penalty to preserve the correct decisions of earlier models. Specifically, we adapt the Positive Congruent Training (PCT) strategy to a CL setting, presenting the first regression-aware CL algorithm. Experiments conducted on the ELSA Android dataset show that this approach significantly reduces security regression while keeping up with data drift, maintaining high detection performance over time.
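To make the notion of security regression concrete, the following is a minimal sketch of how regressions between two model versions could be counted. It is not the paper's evaluation code: the function name `negative_flip_rate`, the label convention (1 = malware, 0 = goodware), and the toy predictions are all assumptions made for illustration. A sample counts as a regression when the old model classified it correctly and the new model does not.

```python
def negative_flip_rate(y_true, old_pred, new_pred):
    """Fraction of samples the old model got right and the new model gets wrong.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not (len(y_true) == len(old_pred) == len(new_pred)):
        raise ValueError("all inputs must have the same length")
    if not y_true:
        return 0.0
    flips = sum(
        1
        for y, old, new in zip(y_true, old_pred, new_pred)
        if old == y and new != y
    )
    return flips / len(y_true)


def accuracy(y_true, pred):
    """Plain accuracy, used only to compare against the flip rate."""
    return sum(1 for y, p in zip(y_true, pred) if y == p) / len(y_true)


# Toy data (assumed): 1 = malware, 0 = goodware.
y_true = [1, 1, 0, 0, 1, 0]
old_pred = [1, 1, 0, 0, 0, 1]  # old model: correct on the first four samples
new_pred = [1, 0, 0, 1, 1, 0]  # new model: fixes the last two, breaks two others

rate = negative_flip_rate(y_true, old_pred, new_pred)
old_acc = accuracy(y_true, old_pred)
new_acc = accuracy(y_true, new_pred)
```

In this toy case both models have the same accuracy, yet a third of the samples regress, which is the kind of inconsistency that an accuracy figure alone would hide.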

Understanding Regression in Continual Learning for Malware Detection / Ghiani, D.; Angioni, D.; Sotgiu, A.; Pintor, M.; Biggio, B. - 3962:(2025). (ITASEC 25, Bologna).

Understanding Regression in Continual Learning for Malware Detection

Ghiani D. (first author; Formal Analysis); Angioni D. (Conceptualization); Sotgiu A. (Supervision); Pintor M. (Supervision)

2025


Year of publication: 2025
Conference name: ITASEC 25
Keywords: Android Malware, Continual Learning, Negative Flips, Regression Testing
Type: 04 Publication in conference proceedings::04b Conference paper in volume


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1754996
