Using Natural Language Processing to uncover main topics in defect recognition literature / Bernabei, M.; Colabianchi, S.; Costantino, F.; Patriarca, R. - In: ...SUMMER SCHOOL FRANCESCO TURCO. PROCEEDINGS. - ISSN 2283-8996. - (2021). (Paper presented at the 26th Summer School Francesco Turco, 2021, held in Virtual, Online).

Using Natural Language Processing to uncover main topics in defect recognition literature

Bernabei M.; Colabianchi S.; Costantino F.; Patriarca R.
2021

Abstract

Defect detection is particularly important in plant engineering, where high-quality production depends on minimizing the number of defective parts. In recent years, interest in the subject has grown considerably, and many methods and approaches have been proposed for defect recognition. As a result, researchers working on defect recognition face a growing number of articles, which slows them down in identifying those relevant to their interests. This work aims to provide a baseline classification of articles based on emerging issues such as the investigated material, the production typology in which the material is included, and the type of analysis to be performed. To this end, the paper proposes an automatic solution based on text mining techniques. Specifically, the study applies Natural Language Processing (NLP) to the articles' titles, abstracts, and keywords using two different approaches: the K-Means clustering algorithm and Latent Dirichlet Allocation (LDA). K-Means is used to cluster the collection of documents into related groups based on the content of each document. LDA, instead, is used to classify the papers through topic modeling. Articles were collected from the Scopus database. The scope of the research is limited to journal and conference articles published in English, excluding articles classified as reviews as well as book chapters, books, notes, and errata.
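As a rough illustration of the clustering step described in the abstract, the following minimal sketch groups a few short text records with K-Means. It is not the authors' pipeline: the four records are invented for illustration, the TF-IDF weighting is an assumption (the abstract does not state which text representation was used), and the centroids are initialized from the first k records to keep the example deterministic, whereas practical implementations usually use random or k-means++ initialization. The LDA step is not sketched here.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens longer than two characters."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return [t for t in cleaned.split() if len(t) > 2]


def tfidf_vectors(docs):
    """Return L2-normalized TF-IDF vectors (assumed representation) and the vocabulary."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(docs)
    doc_freq = Counter(t for doc in tokenized for t in set(doc))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vec = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors, vocab


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain K-Means; centroids start at the first k vectors (a simplification)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each document goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical records standing in for title/abstract/keyword text.
docs = [
    "weld defect detection in steel pipes using ultrasonic inspection",
    "fabric defect detection in textile production using image analysis",
    "ultrasonic inspection of steel weld cracks",
    "textile fabric image classification for weaving production",
]

vectors, vocab = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)
print(labels)
```

With real data, the number of clusters and the text representation would need to be chosen and validated against the corpus; the values above are placeholders for the toy records only.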
2021
26th Summer School Francesco Turco, 2021
Clustering; Information retrieval (IR); K-Means; LDA; Manufacturing
04 Publication in conference proceedings::04c Conference paper in journal
Files attached to this item
File: Barnabei_Using-Natural_2021.pdf
Access: repository administrators only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 277.78 kB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1616776