Concept Matching for Low-Resource Classification / Errica, Federico; Silvestri, Fabrizio; Edizel, Bora; Denoyer, Ludovic; Petroni, Fabio; Plachouras, Vassilis; Riedel, Sebastian. - (2021), pp. 1-8. (Paper presented at the International Joint Conference on Neural Networks, IJCNN 2021, held in Shenzhen, China) [10.1109/IJCNN52387.2021.9533640].

Concept Matching for Low-Resource Classification

Fabrizio Silvestri
2021

Abstract

In many applications that rely on machine learning, the availability of labelled data is a matter of primary importance. However, when tackling new tasks, labels are usually missing and must be collected from scratch by the users. In this work, we address the problem of learning classifiers when the amount of labels is very scarce. We do so by learning multiple vectors, called prototypes, that represent relevant semantic concepts for the task at hand. We propose a theoretically inspired mechanism that computes probabilities of matching between the prototypes and the input elements, and we combine these probabilities to increase the expressiveness of the classifier. Moreover, by leveraging low-cost extra annotations in the training data, a simple error-boosting technique guides the learning process and provides substantial performance improvements. Empirical results confirm the benefits of the proposed approach in both balanced and unbalanced datasets. Our methodology is thus of practical use when gathering and labelling new examples is more expensive than annotating what we already have.
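The abstract describes learning prototype vectors, computing matching probabilities between prototypes and inputs, and combining those probabilities into a class decision. A minimal illustrative sketch of this idea follows; the distance measure (softmax over negative squared Euclidean distances), the toy prototypes, and the per-class summation are assumptions for illustration, not the paper's exact mechanism.

```python
import math

def matching_probabilities(x, prototypes):
    """Probability that input x matches each prototype: softmax over
    negative squared Euclidean distances (an illustrative choice)."""
    neg_d = [-sum((pi - xi) ** 2 for pi, xi in zip(p, x)) for p in prototypes]
    m = max(neg_d)                      # subtract max for numerical stability
    exps = [math.exp(v - m) for v in neg_d]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical toy setup: two prototypes per class in 2-D.
PROTOTYPES = [(0.0, 0.0), (0.2, 0.1),   # prototypes for class 0
              (3.0, 3.0), (2.8, 3.1)]   # prototypes for class 1
LABELS = [0, 0, 1, 1]                   # class owning each prototype

def classify(x):
    """Combine per-prototype matching probabilities by summing them
    within each class, then predict the highest-scoring class."""
    probs = matching_probabilities(x, PROTOTYPES)
    scores = {}
    for p, c in zip(probs, LABELS):
        scores[c] = scores.get(c, 0.0) + p
    return max(scores, key=scores.get)
```

For example, an input close to the class-0 prototypes, such as `(0.1, 0.0)`, is assigned class 0, while `(2.9, 3.0)` is assigned class 1.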
2021
International Joint Conference on Neural Networks, IJCNN 2021
NN, Low Resource
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
No files are associated with this item.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1573127
Warning: the displayed data have not been validated by the university.

Citations
  • PMC: ND
  • Scopus: 1
  • Web of Science (ISI): 0