Research products catalogue

Learning on the Edge: Online Learning with Stochastic Feedback Graphs

Esposito E.; van der Hoeven D.; Fusco F.; Cesa-Bianchi N.

2022

Abstract

The framework of feedback graphs is a generalization of sequential decision-making with bandit or full-information feedback. In this work, we study an extension where the directed feedback graph is stochastic, following a distribution similar to the classical Erdős-Rényi model. Specifically, in each round every edge in the graph is either realized or not, with a distinct probability for each edge. We prove nearly optimal regret bounds of order (Equation presented) (ignoring logarithmic factors), where α_ε and δ_ε are graph-theoretic quantities measured on the support of the stochastic feedback graph G with edge probabilities thresholded at ε. Our result, which holds without any preliminary knowledge about G, requires the learner to observe only the realized out-neighborhood of the chosen action. When the learner is allowed to observe the realization of the entire graph (but only the losses in the out-neighborhood of the chosen action), we derive a more efficient algorithm featuring a dependence on weighted versions of the independence and weak domination numbers that exhibits improved bounds for some special cases.
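The feedback model described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: the function names, the matrix representation of the edge probabilities, and the convention that thresholding keeps edges with probability at least ε are all assumptions made for illustration. It draws one realization of the graph, returns the losses visible to a learner who sees only the realized out-neighborhood of the chosen action, and builds the thresholded support.

```python
import random


def sample_feedback_graph(probs, rng):
    """Draw one realization of the graph (hypothetical helper).

    probs[i][j] is the probability that edge (i, j) is realized in a round;
    each edge is drawn independently of the others.
    """
    k = len(probs)
    return [[rng.random() < probs[i][j] for j in range(k)] for i in range(k)]


def observed_losses(graph, losses, action):
    """Losses revealed to the learner: those of the realized
    out-neighbors of the chosen action."""
    return {j: losses[j] for j, present in enumerate(graph[action]) if present}


def thresholded_support(probs, eps):
    """Support graph keeping only edges with probability >= eps.

    The non-strict inequality is an assumed convention.
    """
    return [[p >= eps for p in row] for row in probs]


# Example with two actions: action 0 always observes itself and never
# observes action 1; action 1 observes action 0 half of the time.
probs = [[1.0, 0.0], [0.5, 1.0]]
graph = sample_feedback_graph(probs, random.Random(0))
feedback = observed_losses(graph, [0.3, 0.7], action=0)
support = thresholded_support(probs, eps=0.6)
```

In this example, playing action 0 always reveals only its own loss, and thresholding at ε = 0.6 drops the edge from action 1 to action 0 because its probability is 0.5.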

	Year of publication: 2022
	Conference name: Advances in Neural Information Processing Systems (was NIPS)
	Keywords: feedback graph; online learning
	Type: 04 Publication in conference proceedings::04b Conference paper in volume
	Citation: Learning on the Edge: Online Learning with Stochastic Feedback Graphs / Esposito, E.; van der Hoeven, D.; Fusco, F.; Cesa-Bianchi, N. - 35 (2022). (Paper presented at the conference Advances in Neural Information Processing Systems (was NIPS), held in New Orleans, USA).
	Belongs to type: 04b Conference paper in volume

Files attached to this product

Esposito_preprint_Learning_2022.pdf
	Access: open access
	Note: https://proceedings.neurips.cc/paper_files/paper/2022/file/e0e956681b04ac126679e8c7dd706b2e-Paper-Conference.pdf
	Type: Pre-print (manuscript submitted to the publisher, before peer review)
	License: All rights reserved
	Size: 574.14 kB
	Format: Adobe PDF

Esposito_Learning_2022.pdf
	Access: open access
	Type: Publisher's version (published version with the publisher's layout)
	License: All rights reserved
	Size: 490.32 kB
	Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1684842
