Optimal coding of high-cardinality categorical data in machine learning / DI CIACCIO, Agostino. - (2023), pp. 39-51. - STUDIES IN CLASSIFICATION, DATA ANALYSIS, AND KNOWLEDGE ORGANIZATION. [10.1007/978-3-031-30164-3].

Optimal coding of high-cardinality categorical data in machine learning

Agostino Di Ciaccio
First author
2023

Abstract

Analyzing categorical data in machine learning generally requires a coding strategy. This problem is common to multivariate statistical techniques, and several approaches have been suggested in the literature. This article proposes a method for analyzing categorical variables with neural networks. Both a supervised and an unsupervised approach are considered, in which the variables can have high cardinality. Applications to simulated data illustrate the value of the proposal.
Statistical Models and Methods for Data Science
978-3-031-30163-6
encoding categorical data; neural networks; high cardinality attributes; optimal scaling
02 Publication in a volume::02a Chapter or Article

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1685751