
Bandits with Replenishable Knapsacks: the Best of both Worlds

Bernasconi M.; Castiglioni M.; Celli A.; Fusco F.

2024

Abstract

The bandits with knapsacks (BwK) framework models online decision-making problems in which an agent makes a sequence of decisions subject to resource consumption constraints. The traditional model assumes that each action consumes a non-negative amount of resources and the process ends when the initial budgets are fully depleted. We study a natural generalization of the BwK framework which allows non-monotonic resource utilization, i.e., resources can be replenished by a positive amount. We propose a best-of-both-worlds primal-dual template that can handle any online learning problem with replenishment for which a suitable primal regret minimizer exists. In particular, we provide the first positive results for the case of adversarial inputs by showing that our framework guarantees a constant competitive ratio α when B = Ω(T) or when the possible per-round replenishment is a positive constant. Moreover, under a stochastic input model, our algorithm yields an instance-independent Õ(T^{1/2}) regret bound which complements existing instance-dependent bounds for the same setting. Finally, we provide applications of our framework to some economic problems of practical relevance.
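As a reading aid, the following minimal sketch illustrates only what "non-monotonic resource utilization" means for the budget: a positive cost consumes the resource and a negative cost replenishes it. It is not the primal-dual algorithm from the paper. The function name `run_with_replenishment`, the fixed sequence of (reward, cost) pairs, and the rule of skipping unaffordable rounds at zero reward and zero cost are all assumptions made for illustration.

```python
from typing import List, Tuple


def run_with_replenishment(
    budget: float,
    rounds: List[Tuple[float, float]],
) -> Tuple[float, List[float]]:
    """Play a fixed sequence of (reward, cost) pairs under a budget.

    cost > 0 consumes the resource, cost < 0 replenishes it.
    A round whose cost exceeds the remaining budget is skipped
    (hypothetical "do nothing" action: zero reward, zero cost).
    """
    total_reward = 0.0
    trajectory = [budget]  # budget level before the first round
    for reward, cost in rounds:
        if cost <= budget:      # affordable; replenishment always is
            budget -= cost      # a negative cost raises the budget
            total_reward += reward
        trajectory.append(budget)
    return total_reward, trajectory


# Toy sequence: spend, try to spend again (skipped), replenish, spend.
total, path = run_with_replenishment(
    budget=1.0,
    rounds=[(1.0, 1.0), (1.0, 1.0), (0.0, -1.0), (1.0, 1.0)],
)
print(total, path)
```

In this toy run the budget reaches zero after the first round, yet play continues: the replenishing round restores one unit, which lets the last round collect a reward. Under the traditional model described above, the process would have ended once the budget was depleted.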

Year of publication: 2024
Conference name: 12th International Conference on Learning Representations, ICLR 2024
Keywords: Bandits with Knapsack; online learning
Type: 04 Publication in conference proceedings::04b Conference paper in volume
Citation: Bandits with Replenishable Knapsacks: the Best of both Worlds / Bernasconi, M.; Castiglioni, M.; Celli, A.; Fusco, F. - (2024). (12th International Conference on Learning Representations, ICLR 2024, Vienna).
Belongs to type: 04b Conference paper in volume

Files attached to this product

File	Access	Type	License	Size	Format
Bernasconi_Bandits-with_Replenishabl_2024.pdf	Open access	Publisher's version (published version with the publisher's layout)	All rights reserved	379.49 kB	Adobe PDF

Note: https://openreview.net/forum?id=yBIJRIYTqa&nesting=2&sort=date-desc

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1717196
