Supervised machine learning techniques and genetic optimization for occupational diseases risk prediction / Di Noia, Antonio; Martino, Alessio; Montanari, Paolo; Rizzi, Antonello. - In: SOFT COMPUTING. - ISSN 1432-7643. - 24:6(2020), pp. 4393-4406. [10.1007/s00500-019-04200-2]

Supervised machine learning techniques and genetic optimization for occupational diseases risk prediction

Di Noia, Antonio; Martino, Alessio; Montanari, Paolo; Rizzi, Antonello
2020

Abstract

Workers' healthcare has gained considerable attention in recent years, as many countries are increasingly concerned about welfare. This paper addresses the problem of predicting occupational disease risks by means of computational intelligence and pattern recognition techniques. Specifically, three different machine learning approaches are compared. The first is based on the k-means algorithm, which is in charge of determining a set of meaningful labelled clusters as the final model. The other two are based on fully supervised techniques, namely Support Vector Machines and K-Nearest Neighbours. Real data describing both the worker and the workplace, mixing numerical and categorical attributes, have been used for testing. The three approaches are automatically tuned by means of genetic algorithms, which simultaneously search for the optimal hyperparameters of the classification systems and the optimal weights of an ad-hoc dissimilarity measure, so as to maximize classification performance. Computational results show that the three approaches are rather comparable in terms of performance, but the clustering-based approach allows a deeper knowledge discovery phase, helpful for further risk assessment and forecasting.
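The abstract does not give the form of the dissimilarity measure or the genetic operators, so the following is only a minimal sketch of the general idea, under stated assumptions: numerical attributes are scaled to [0, 1] and compared by absolute difference, categorical attributes by a 0/1 mismatch, and a small genetic algorithm (truncation selection, uniform crossover, Gaussian mutation) searches jointly for the attribute weights and the k of a K-Nearest Neighbours classifier. All function names, the toy records and the operator choices are hypothetical and are not taken from the paper; the k-means and Support Vector Machine variants are not shown.

```python
import random

def dissimilarity(a, b, weights, numeric_idx):
    """Weighted dissimilarity between two mixed-type records (assumed form)."""
    total = 0.0
    for i, w in enumerate(weights):
        if i in numeric_idx:
            total += w * abs(a[i] - b[i])               # numeric, assumed in [0, 1]
        else:
            total += w * (0.0 if a[i] == b[i] else 1.0)  # categorical mismatch
    return total

def knn_predict(x, train, labels, weights, numeric_idx, k=3):
    """Majority vote among the k nearest training records (k odd, two classes)."""
    order = sorted(range(len(train)),
                   key=lambda j: dissimilarity(x, train[j], weights, numeric_idx))
    votes = [labels[j] for j in order[:k]]
    return max(set(votes), key=votes.count)

def accuracy(weights, k, train, y_train, val, y_val, numeric_idx):
    """Fraction of validation records classified correctly."""
    hits = sum(knn_predict(x, train, y_train, weights, numeric_idx, k) == y
               for x, y in zip(val, y_val))
    return hits / len(val)

def tune(train, y_train, val, y_val, numeric_idx, n_attrs,
         pop_size=20, generations=15, seed=0):
    """Toy genetic search over (attribute weights, k); returns the best pair found."""
    rng = random.Random(seed)

    def fitness(ind):
        return accuracy(ind[0], ind[1], train, y_train, val, y_val, numeric_idx)

    population = [([rng.random() for _ in range(n_attrs)], rng.choice([1, 3, 5]))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]            # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            (w1, k1), (w2, k2) = rng.sample(parents, 2)
            w = [rng.choice(pair) for pair in zip(w1, w2)]  # uniform crossover
            w = [min(1.0, max(0.0, wi + rng.gauss(0, 0.1))) for wi in w]  # mutation
            children.append((w, rng.choice([k1, k2])))
        population = parents + children
    return max(population, key=fitness)

# Hypothetical toy records: (scaled age, sector, shift); not real data.
train = [(0.1, "mining", "a"), (0.9, "mining", "b"), (0.5, "mining", "a"),
         (0.2, "office", "b"), (0.8, "office", "a"), (0.4, "office", "b")]
y_train = ["high", "high", "high", "low", "low", "low"]
val = [(0.3, "mining", "b"), (0.7, "office", "a"),
       (0.15, "office", "a"), (0.85, "mining", "b")]
y_val = ["high", "low", "low", "high"]

best_weights, best_k = tune(train, y_train, val, y_val, numeric_idx={0}, n_attrs=3)
print(best_weights, best_k, accuracy(best_weights, best_k, train, y_train,
                                     val, y_val, {0}))
```

In this toy setting the learned weights indicate which attributes drive the prediction, which is one way to read the knowledge discovery aspect mentioned in the abstract. A real experiment would score candidates on a validation set kept separate from the final test set.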
cluster analysis; computational intelligence; occupational diseases risk prediction; pattern recognition; predictive medicine; support vector machine
01 Journal publication::01a Journal article
Files attached to this record

Di Noia_Supervised-machine_2020.pdf
Access: archive administrators only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 773.63 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1301429