Adaptive Positive-Unlabelled Learning via Markov Diffusion / Stolfi, Paola; Mastropietro, Andrea; Pasculli, Giuseppe; Tieri, Paolo; Vergni, Davide. - (2021).

Adaptive Positive-Unlabelled Learning via Markov Diffusion

Andrea Mastropietro (second author); Giuseppe Pasculli
2021

Abstract

Positive-Unlabelled (PU) learning is the machine learning setting in which only a set of positive instances is labelled, while the rest of the data set is unlabelled. The unlabelled instances may be either unidentified positive samples or true negative samples. Over the years, many solutions have been proposed to deal with PU learning. Some techniques treat the unlabelled samples as negative, reducing the problem to binary classification with a noisy negative set, while others first detect sets of likely negative examples and then apply a supervised machine learning strategy (two-step techniques). The approach proposed in this work falls in the latter category and works in a semi-supervised fashion: motivated and inspired by previous works, a Markov diffusion process with restart is used to assign pseudo-labels to unlabelled instances. Afterward, a machine learning model is trained on the newly assigned classes. The principal aim of the algorithm is to identify a set of instances that is likely to contain positive instances that were originally unlabelled.
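
The two-step idea summarised above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the restart probability of 0.3, the rule of taking the lowest-scoring unlabelled nodes as pseudo-negatives, and the function names `random_walk_with_restart` and `pseudo_label` are all assumptions made for illustration. The final step, training a classifier on the pseudo-labelled data, is left out.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, iters=200, tol=1e-10):
    """Visiting probabilities of a walker that restarts at the labelled positives."""
    n = len(adj)
    # Row-normalise the (assumed) similarity graph into transition probabilities.
    trans = []
    for row in adj:
        total = sum(row)
        trans.append([w / total if total else 0.0 for w in row])
    # Restart distribution: uniform over the labelled positives (the seeds).
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(iters):
        # One diffusion step: mass at node j moves to i with probability trans[j][i],
        # and a fraction `restart` of the mass jumps back to the seeds.
        nxt = [
            (1 - restart) * sum(p[j] * trans[j][i] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        converged = max(abs(a - b) for a, b in zip(nxt, p)) < tol
        p = nxt
        if converged:
            break
    return p


def pseudo_label(scores, seeds, n_negatives):
    """Label seeds 1 and the lowest-scoring unlabelled nodes 0 (illustrative rule)."""
    unlabelled = sorted(
        (i for i in range(len(scores)) if i not in seeds), key=lambda i: scores[i]
    )
    labels = {i: 1 for i in seeds}
    labels.update({i: 0 for i in unlabelled[:n_negatives]})
    return labels


# Toy graph: nodes 0-2 form one cluster, nodes 3-5 another, joined by a weak edge 2-3.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
seeds = {0}  # the only labelled positive
scores = random_walk_with_restart(adj, seeds)
labels = pseudo_label(scores, seeds, n_negatives=2)
```

In this toy example the nodes that share a cluster with the seed receive high diffusion scores and stay unlabelled as candidate positives, while the two nodes farthest from the seed become pseudo-negatives. A supervised model could then be trained on `labels`, which is the second step of a two-step technique.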
Computer Science and Machine Learning
Computer Science - Learning; Bioinformatics
02 Publication in volume::02a Chapter or Article

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1565282