Research Products Catalogue

MASS: MoErging through Adaptive Subspace Selection / Crisostomi, Donato; Zirilli, Alessandro; Gargiulo, Antonio Andrea; Bucarelli, Maria Sofia; Scardapane, Simone; Silvestri, Fabrizio; Masi, Iacopo; Rodola', Emanuele. - (2026). (International Conference on Learning Representations (ICLR), Rio de Janeiro, Brazil).

MASS: MoErging through Adaptive Subspace Selection

Donato Crisostomi; Alessandro Zirilli; Antonio Andrea Gargiulo; Maria Sofia Bucarelli; Simone Scardapane; Fabrizio Silvestri; Iacopo Masi; Emanuele Rodola'

2026

Abstract

Model merging has recently emerged as a lightweight alternative to ensembling, combining multiple fine-tuned models into a single set of parameters with no additional training overhead. Yet existing merging methods fall short of matching the full accuracy of separately fine-tuned endpoints. We present MASS (MoErging through Adaptive Subspace Selection), a new approach that closes this gap by unifying multiple fine-tuned models while retaining near state-of-the-art performance across tasks. Building on the low-rank decomposition of per-task updates, MASS stores only the most salient singular components for each task and merges them into a shared model. At inference time, a non-parametric, data-free router identifies which subspace (or combination thereof) best explains an input's intermediate features and activates the corresponding task-specific block. This procedure is fully training-free and introduces only a two-pass inference overhead plus a ~2× storage factor compared to a single pretrained model, irrespective of the number of tasks. We evaluate MASS on CLIP-based image classification using ViT-B-16, ViT-B-32 and ViT-L-14 for benchmarks of 8, 14 and 20 tasks respectively, establishing a new state of the art. Most notably, MASS recovers up to ~98% of the average accuracy of individual fine-tuned models, making it a practical alternative to ensembling at a fraction of the storage cost.
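The abstract describes the mechanism only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes a single linear layer, a router that scores each task by the norm of the input's projection onto that task's top right singular vectors, and synthetic task updates. The names `truncated_svd`, `route` and `forward` are hypothetical, and the actual scoring rule, layer choice and merging step used by MASS are not given in this record.

```python
import numpy as np

def truncated_svd(delta, k):
    """Keep the k largest singular components of a task update (d_out x d_in)."""
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    return u[:, :k], s[:k], vt[:k]

def route(features, input_bases):
    """Return the index of the subspace that retains the most feature norm.

    Each basis has orthonormal rows, so ||basis @ x|| is the norm of the
    projection of x onto that subspace; maximizing it minimizes the residual.
    """
    scores = [float(np.linalg.norm(basis @ features)) for basis in input_bases]
    return int(np.argmax(scores)), scores

def forward(w_pre, components, features):
    """Apply the pretrained weights plus the routed low-rank task block."""
    task, _ = route(features, [vt for _, _, vt in components])
    u, s, vt = components[task]
    return w_pre @ features + u @ (s * (vt @ features)), task

# Synthetic demo (assumption: two tasks with disjoint input directions).
rng = np.random.default_rng(0)
d_out, d_in, k = 6, 8, 2
w_pre = rng.normal(size=(d_out, d_in))
delta_a = rng.normal(size=(d_out, k)) @ np.eye(d_in)[[0, 1]]  # uses axes 0, 1
delta_b = rng.normal(size=(d_out, k)) @ np.eye(d_in)[[2, 3]]  # uses axes 2, 3
components = [truncated_svd(delta_a, k), truncated_svd(delta_b, k)]

x_a = np.array([3.0, 1.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0])  # mostly in task A's subspace
x_b = np.array([0.1, 0.0, 2.0, 2.0, 0.0, 0.0, 0.0, 0.0])  # mostly in task B's subspace
out_a, task_a = forward(w_pre, components, x_a)
out_b, task_b = forward(w_pre, components, x_b)
print(task_a, task_b)
```

In this toy setting each input is routed to the task whose subspace explains it best, and because the synthetic updates are exactly rank 2, the routed output matches what the corresponding fully fine-tuned layer would produce. In a real network the features would come from a first forward pass, which is presumably where the two-pass inference overhead mentioned in the abstract arises.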

Full record

Year of publication	2026
Conference name	International Conference on Learning Representations (ICLR)
Keywords	transfer learning, meta learning, and lifelong learning
Type	04 Publication in conference proceedings::04b Conference paper in a volume
Belongs to type	04b Conference paper in a volume

Files attached to this product

File	Access	Type	License	Size	Format
Crisostomi_Crisostomi_2026.pdf	Archive managers only	Publisher's version (published with the publisher's layout)	All rights reserved	5.87 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1763252
