Data analysis in high energy physics often deals with data samples consisting of a mixture of signal and background events. The sPlot technique is a common method for subtracting the background contribution by assigning weights to events. Some of the weights are negative by design. Negative weights can cause the training of some machine learning algorithms to diverge, because the loss function is no longer bounded from below. In this paper we propose a mathematically rigorous way to train machine learning algorithms on data samples with background described by sPlot to obtain signal probabilities conditioned on observables, without encountering negative event weights at all. This allows the use of any out-of-the-box machine learning method on such data.
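
The unboundedness mentioned in the abstract can be illustrated with a minimal sketch. It is not the method proposed in the paper; it only assumes a weighted binary cross-entropy loss, and the function name `weighted_log_loss` and the toy values are hypothetical choices made for this illustration.

```python
import numpy as np


def weighted_log_loss(p, y, w):
    """Weighted binary cross-entropy summed over events (illustrative helper)."""
    p = np.asarray(p, dtype=float)
    return float(-np.sum(w * (y * np.log(p) + (1.0 - y) * np.log(1.0 - p))))


# Assumption: a single event labelled as signal (y = 1) with a negative weight,
# as can happen with sPlot-style weights.
y = np.array([1.0])
w = np.array([-0.5])

# Pushing the predicted signal probability toward 0 keeps lowering the loss,
# so a minimiser has no finite optimum for this event.
predictions = [1e-1, 1e-3, 1e-6, 1e-12]
losses = [weighted_log_loss(np.array([p]), y, w) for p in predictions]

# For comparison, the same event with a positive weight has a loss bounded below by 0.
positive_losses = [weighted_log_loss(np.array([p]), y, -w) for p in predictions]
```

With the negative weight, `losses` decreases without limit as the prediction approaches 0, while `positive_losses` never drops below zero.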

Machine Learning on data with sPlot background subtraction / Borisyak, M.; Kazeev, N. - In: JOURNAL OF INSTRUMENTATION. - ISSN 1748-0221. - 14:8(2019). [10.1088/1748-0221/14/08/P08020]

Machine Learning on data with sPlot background subtraction

Kazeev N.
2019

analysis and statistical methods; data processing methods; pattern recognition; cluster finding; calibration and fitting methods
01 Journal publication::01a Journal article
Files attached to this item

File: Borisyak_preprint_Machine-learning_2019.pdf
Access: open access
Type: Pre-print (manuscript submitted to the publisher, prior to peer review)
License: All rights reserved
Size: 331.03 kB
Format: Adobe PDF

File: Borisyak_Machine-learning_2019.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 743.35 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1344909
Citations
  • PMC: ND
  • Scopus: 6
  • ISI: 7