Una rivista in digitale / DE GASPERIS, Paolo. - In: STORIA DELL'ARTE. - ISSN 0392-4513. - 161(2024), pp. 161-175.
Una rivista in digitale
Paolo De Gasperis
First author
2024
Abstract
The article examines the digital transformation of the journal "Storia dell’Arte," with particular attention to the impact of advanced language models and artificial intelligence. The transformation involved digitizing archival material, improving the journal’s online presence, and integrating its content more closely. Project milestones include the launch of a new website and the creation of a comprehensive digital archive. The archive consists of digitized issues processed through OCR and offers both PDF and plain-text formats for computational analysis with neural network models. A significant part of the project was the creation of a detailed dataset of 1,050 articles from 160 issues, categorized by descriptive, quantitative, and qualitative metadata. This dataset facilitates interdisciplinary analysis and improves accessibility through advanced NLP techniques. The article also discusses the technical challenges of creating embeddings for the articles, and the solutions adopted, using OpenAI models such as text-embedding-3-large and text-embedding-3-small. The dataset promotes interoperability with other digital resources and supports a range of scholarly and research applications. De Gasperis highlights the use of AI foundation models with the dataset, demonstrating its potential for semantic analysis and interdisciplinary research. One goal of the proposed solution is to define an effective strategy for the digital transformation of historical journals.