We investigate the different roles played by nodes’ network and non-network attributes in explaining the formation of European university collaborations from 2011 to 2016, in three European Research Council (ERC) domains: Social Sciences and Humanities (SSH), Physical and Engineering Sciences (PE), Life Sciences (LS), as well as multidisciplinary collaborations. On link formation in collaboration networks, existing research has not yet compared and simultaneously examined both network and non-network attributes. Using four machine learning predictive algorithms (LASSO, Neural Network, Gradient Boosting, and Random Forest) our results show that, over various model specifications: (i) best model link formation accuracy is larger than 80%, (ii) among the non-network attributes, public funding plays an important role in PE and LS, (iii) network attributes count more than non-network attributes for the formation, sensibly increasing accuracy, (iv) feature-importance scores show a different ordering in the four domains, thus signalling different modes of knowledge production and transmission taking place within these different scientific communities.
Machine learning prediction of academic collaboration networks / Resce, G.; Zinilli, A.; Cerulli, G.. - In: SCIENTIFIC REPORTS. - ISSN 2045-2322. - 12:1(2022). [10.1038/s41598-022-26531-1]
Machine learning prediction of academic collaboration networks
Zinilli A.Co-primo
Membro del Collaboration Group
;Cerulli G.Co-primo
Membro del Collaboration Group
2022
Abstract
We investigate the different roles played by nodes’ network and non-network attributes in explaining the formation of European university collaborations from 2011 to 2016, in three European Research Council (ERC) domains: Social Sciences and Humanities (SSH), Physical and Engineering Sciences (PE), Life Sciences (LS), as well as multidisciplinary collaborations. On link formation in collaboration networks, existing research has not yet compared and simultaneously examined both network and non-network attributes. Using four machine learning predictive algorithms (LASSO, Neural Network, Gradient Boosting, and Random Forest) our results show that, over various model specifications: (i) best model link formation accuracy is larger than 80%, (ii) among the non-network attributes, public funding plays an important role in PE and LS, (iii) network attributes count more than non-network attributes for the formation, sensibly increasing accuracy, (iv) feature-importance scores show a different ordering in the four domains, thus signalling different modes of knowledge production and transmission taking place within these different scientific communities.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.