Synthetic data generation in genomic cancer medicine: a review of global research trends in the last ten years / De Nicoló, Valentina; Frasca, Maria; Graziosi, Agnese; Gazzaniga, Gianluca; La Torre, Davide; Pani, Arianna. - In: DISCOVER ARTIFICIAL INTELLIGENCE. - ISSN 2731-0809. - 5:1(2025). [10.1007/s44163-025-00384-9]

Synthetic data generation in genomic cancer medicine: a review of global research trends in the last ten years

Gazzaniga, Gianluca
2025

Abstract

This paper reviews the global evolution of synthetic data (SD) generation in the field of genomic cancer medicine, with an analysis of research trends from the past decade. The use of artificial intelligence, particularly machine learning and deep learning techniques, has transformed this area, providing solutions to overcome the limited availability of real clinical data. Through a bibliometric analysis of a wide sample of scientific articles from Scopus, this study highlights the adoption of SD generation techniques in oncological applications, focusing on major methodologies and challenges. Key application areas, such as multi-omics integration (genomics, transcriptomics, and proteomics) and tumor genomic heterogeneity, emerge as fields of growing interest. Despite challenges in noise management and performance optimization, advanced machine learning techniques prove essential for generating high-quality SD that reflects biological complexity. The study also identifies key open challenges, such as simulation accuracy and noise control, offering insights into future applications of SD in personalized medicine and cancer therapy.
Cancer research; Data privacy; Genomic medicine; Machine learning; Synthetic data
01 Journal publication::01g Review article


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1743835