The analysis of road traffic accidents is increasingly important because of the cost of accidents and concerns for public road safety. The availability of large data sets makes it viable to study the factors that affect the frequency and severity of accidents. However, the data are often highly unbalanced and overlapping. We analyse the data set of road traffic accidents recorded in Christchurch, New Zealand, from 2000 to 2009, comprising 26440 accidents. The data are binary and cover 50 road traffic accident factors with four levels of severity. We used a genetic algorithm for the analysis because the data set is large and unbalanced, and standard clustering methods such as the k-means algorithm may not be suitable for the task. The genetic algorithm-based clustering for unknown K (GCUK) was used to identify the factors associated with accidents of different levels of severity. The results provide insight into the relationship between factors and accident severity level and suggest that the two main factors that contribute to fatal accidents are “Speed greater than 60 km/h” and “Did not see other people until it was too late”. A comparison with the k-means algorithm and independent component analysis is performed to validate the results.
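As a rough illustration of the clustering idea named in the abstract, the sketch below evolves a variable number of cluster centres for binary data. It is not the authors' implementation. Every name in it (`gcuk_sketch`, `k_max`, the empty-slot encoding with `None`) is hypothetical, and the use of Hamming distance, a Davies-Bouldin style fitness, truncation selection, and slot-toggling mutation are simplifying assumptions made for this example.

```python
import random

def hamming(a, b):
    """Number of positions where two binary rows differ."""
    return sum(x != y for x, y in zip(a, b))

def assign(data, centres):
    """Put each row in the cluster of its nearest centre (first centre wins ties)."""
    clusters = [[] for _ in centres]
    for row in data:
        j = min(range(len(centres)), key=lambda i: hamming(row, centres[i]))
        clusters[j].append(row)
    return clusters

def db_index(data, centres):
    """Davies-Bouldin style index (lower is better); inf with < 2 non-empty clusters."""
    live = [(c, m) for c, m in zip(centres, assign(data, centres)) if m]
    if len(live) < 2:
        return float("inf")
    scatter = [sum(hamming(r, c) for r in m) / len(m) for c, m in live]
    total = 0.0
    for i, (ci, _) in enumerate(live):
        total += max(
            (scatter[i] + scatter[j]) / max(hamming(ci, cj), 1e-9)
            for j, (cj, _) in enumerate(live) if j != i
        )
    return total / len(live)

def fitness(chrom, data):
    """Higher is better; chromosomes with fewer than two centres score 0."""
    centres = [c for c in chrom if c is not None]
    if len(centres) < 2:
        return 0.0
    return 1.0 / (db_index(data, centres) + 1e-9)

def random_chromosome(data, k_max, rng):
    """Fixed-length chromosome: each slot is a centre or None (an unused slot)."""
    slots = [None] * k_max
    for pos in rng.sample(range(k_max), rng.randint(2, k_max)):
        slots[pos] = rng.choice(data)
    return slots

def crossover(a, b, rng):
    """Single-point crossover on the slot list."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(chrom, data, rng, rate=0.1):
    """Assumed mutation: toggle a slot between 'unused' and a random data row."""
    out = list(chrom)
    for i in range(len(out)):
        if rng.random() < rate:
            out[i] = None if out[i] is not None else rng.choice(data)
    return out

def gcuk_sketch(data, k_max=6, pop_size=30, generations=40, seed=0):
    """Evolve centres for an unknown number of clusters (2..k_max); returns the centres."""
    rng = random.Random(seed)
    pop = [random_chromosome(data, k_max, rng) for _ in range(pop_size)]
    best = max(pop, key=lambda c: fitness(c, data))
    for _ in range(generations):
        ranked = sorted(pop, key=lambda c: fitness(c, data), reverse=True)
        parents = ranked[: max(2, pop_size // 2)]  # truncation selection (assumption)
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), data, rng)
            for _ in range(pop_size - 1)
        ]
        pop = [best] + children  # elitism: the best chromosome always survives
        best = max(pop, key=lambda c: fitness(c, data))
    return [c for c in best if c is not None]
```

Calling `gcuk_sketch(rows)` on a list of equal-length 0/1 tuples returns the centres of the fittest chromosome, so the number of clusters is an output of the search and is not fixed in advance, which is the property that distinguishes this family of methods from k-means.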
A cluster analysis on road traffic accidents using genetic algorithms / Sabariah, Saharan; Baragona, Roberto. - PRINT. - 1830, issue 1:(2017), pp. 1695-1702. (Paper presented at The 4th International Conference on Mathematical Sciences, held in Langkawi, Malaysia, 4-5 May 2017) [doi: http://dx.doi.org/10.1063/1.4980927].
A cluster analysis on road traffic accidents using genetic algorithms
BARAGONA, Roberto
2017