Workers' healthcare has gained considerable attention recently, as many countries are increasingly concerned about welfare. This paper addresses the problem of predicting occupational disease risks by means of computational intelligence and pattern recognition techniques. Specifically, three machine learning approaches are compared: the first is based on the k-means algorithm, which determines a set of meaningful labelled clusters as the final model; the other two are based on fully supervised techniques, namely Support Vector Machines and K-Nearest Neighbours. Real data describing both the worker and the workplace, mixing numerical and categorical attributes, have been used for testing. The three approaches are automatically tuned by means of genetic algorithms in order to simultaneously find the optimal hyperparameters of the classification systems and the optimal weights of an ad hoc dissimilarity measure, so as to maximize classification performance. Computational results show that the three approaches are rather comparable in terms of performance, but the clustering-based approach allows a deeper knowledge discovery phase, helpful for further risk assessment and forecasting.
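As a rough illustration of the idea summarised in the abstract, the sketch below shows how a weighted dissimilarity over mixed numerical and categorical attributes might feed a K-Nearest Neighbours classifier, and what kind of fitness value a genetic algorithm could maximize when tuning `k` and the weights. It is not the authors' implementation: the form of the dissimilarity, the function names (`dissimilarity`, `knn_predict`, `fitness`), the leave-one-out fitness, and the toy records are all assumptions made for illustration.

```python
from collections import Counter


def dissimilarity(a, b, weights, numeric_ranges):
    """Weighted dissimilarity between two mixed-type records (assumed form).

    Numeric attributes contribute a range-normalised absolute difference;
    categorical attributes contribute 0 if equal and 1 otherwise.
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_ranges:
            d = abs(x - y) / numeric_ranges[i]
        else:
            d = 0.0 if x == y else 1.0
        total += weights[i] * d
    return total


def knn_predict(query, records, labels, k, weights, numeric_ranges):
    """Majority vote among the k records closest to `query`."""
    ranked = sorted(
        range(len(records)),
        key=lambda j: dissimilarity(query, records[j], weights, numeric_ranges),
    )
    votes = Counter(labels[j] for j in ranked[:k])
    return votes.most_common(1)[0][0]


def fitness(k, weights, records, labels, numeric_ranges):
    """Leave-one-out accuracy: a value a genetic algorithm could maximize."""
    hits = 0
    for i, rec in enumerate(records):
        rest = records[:i] + records[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        predicted = knn_predict(rec, rest, rest_labels, k, weights, numeric_ranges)
        hits += predicted == labels[i]
    return hits / len(records)


# Toy, made-up records (age, sector); NOT data from the paper.
records = [(25, "office"), (30, "office"), (52, "mining"), (58, "mining")]
labels = ["low", "low", "high", "high"]
numeric_ranges = {0: 33.0}  # attribute 0 is numeric; its range is 58 - 25
```

In such a setup, each individual of the genetic algorithm would encode a candidate `k` together with a weight vector, and `fitness` would score it; the same weighted dissimilarity could also drive k-means-style clustering or a kernel for a Support Vector Machine, although how the paper does this is not stated in the abstract.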
Supervised machine learning techniques and genetic optimization for occupational diseases risk prediction / Di Noia, Antonio; Martino, Alessio; Montanari, Paolo; Rizzi, Antonello. - In: SOFT COMPUTING. - ISSN 1432-7643. - 24:6(2020), pp. 4393-4406. [10.1007/s00500-019-04200-2]
Supervised machine learning techniques and genetic optimization for occupational diseases risk prediction
Di Noia, Antonio; Martino, Alessio; Montanari, Paolo; Rizzi, Antonello
2020
File | Access | Type | License | Size | Format |
---|---|---|---|---|---|
Di Noia_Supervised-machine_2020.pdf | Archive managers only | Publisher's version (published version with the publisher's layout) | All rights reserved | 773.63 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.