Background: Artificial intelligence is helping to improve several areas of medicine, including clinical trial design. One field with great potential is the use of digital data as an alternative to real data. Generating a virtual cohort of patients may be advantageous because an artificial group of patients resembles the real dataset in all key features but does not include any identifiable real-patient data. From a privacy standpoint, this addresses the “burden” of collecting data subjects’ consent as well as the shortcomings of common anonymization techniques.

Aims: To test the feasibility of this approach and evaluate its potential in clinical trial design, we built an in-silico cohort based on the large dataset of patients enrolled in the GIMEMA AML1310 study (Venditti et al. 2019), which entailed a “3 + 7”-like induction and a risk-adapted, MRD-directed post-remission transplant allocation.

Methods: To create the synthetic cohort of patients, a machine learning generative model was constructed from the real individual-level data of the AML1310 study, capturing its patterns and statistical properties. AML1310 enrolled 500 patients (median age 49 years) in 55 GIMEMA institutions. All patients were risk classified according to NCCN2009 and analyzed by morphology, cytogenetics, molecular biology and multiparametric flow cytometry. The subset of 445 patients with an available ELN2017 risk classification was used. For this purpose, the R package “synthpop” was used with a parametric method: logistic regression for binary data, polytomous logistic regression for a factor with > 2 levels, and ordered polytomous logistic regression for an ordered factor with > 2 levels. For time-to-event variables, the classification and regression trees method was used. Next, we verified the adherence of the virtual cohort to the original one in terms of age, gender, performance status (PS), WBC count, FLT3 and NPM1 mutations, risk category, CR achievement, MRD and transplant rate. The virtual and real cohorts were also compared in terms of survival outcomes.

Results: Using the real-patient dataset from the AML1310 trial, a virtual cohort of 850 patients, named synthAML1310, was generated. Comparing the two cohorts, we observed that the clinico-biological characteristics and response evaluations (CR and MRD rates) did not differ significantly. Moreover, as depicted in Figure 1, the OS and DFS curves were superimposable. At 2 years, OS was 57% (52.5%-61.9%) in the original cohort and 59.1% (55.9%-62.6%) in the synthAML1310 cohort. DFS was 55.1% (49.8%-60.9%) in the original cohort and 55.1% (51.3%-59.2%) in the synthetic cohort.

Summary/Conclusion: These results demonstrate the success of this approach in producing a virtual dataset that closely mimics the original and that, from a “privacy by design” perspective, minimizes the risk of re-identification of patients. Because it mirrors an AML population treated with a conventional chemotherapeutic approach, synthAML1310 is suitable to represent the control group when testing novel treatments, most likely in an in-silico randomized trial, but also in other settings such as propensity score matching analyses in observational studies. Shifting to an in-silico trial would overcome the challenges of randomized trials and would also benefit patients, since they would receive only the experimental treatment without being exposed to the “less active” therapy, thus limiting treatment failures and toxicity. Enrolment and the attainment of final results would also be faster.
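The comparison of survival outcomes between the real and synthetic cohorts can be illustrated with a minimal sketch. The abstract does not name the estimator behind the OS and DFS curves; the code below assumes a standard Kaplan-Meier estimate, and the function names `kaplan_meier` and `survival_at`, as well as the toy follow-up data, are hypothetical and not taken from AML1310.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return the Kaplan-Meier curve as (time, survival probability) steps.

    events[i] is 1 if the event was observed at times[i] and 0 if the
    subject was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def survival_at(curve: List[Tuple[float, float]], t: float) -> float:
    """Read the estimated survival probability at time t from the step curve."""
    prob = 1.0
    for step_time, step_prob in curve:
        if step_time > t:
            break
        prob = step_prob
    return prob


# Hypothetical toy data (months of follow-up, event indicator); not study data.
real_times, real_events = [6, 12, 18, 30, 36], [1, 0, 1, 0, 0]
synth_times, synth_events = [5, 14, 20, 28, 40], [1, 0, 1, 0, 0]

real_os_24 = survival_at(kaplan_meier(real_times, real_events), 24)
synth_os_24 = survival_at(kaplan_meier(synth_times, synth_events), 24)
print(f"2-year survival, real: {real_os_24:.3f}, synthetic: {synth_os_24:.3f}")
```

Reading both step curves at the same landmark (here 24 months) mirrors the kind of 2-year comparison reported in the Results; interval estimates would need an additional variance calculation that this sketch omits.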

P495: UNLOCKING THE POTENTIAL OF SYNTHETIC PATIENTS FOR ACCELERATING CLINICAL TRIALS: RESULTS OF THE FIRST GIMEMA EXPERIENCE / Piciocchi, Alfonso; Cipriani, Marta; Messina, Monica; Marconi, Giovanni; Arena, Valentina; Soddu, Stefano; Crea, Enrico; Valeria Feraco, Maria; Ferrante, Marco; la Sala, Edoardo; Fazi, Paola; Buccisano, Francesco; Martinelli, Giovanni; Venditti, Adriano; Vignetti, Marco. - In: HEMASPHERE. - ISSN 2572-9241. - 7:S3(2023), pp. 1-3. (Paper presented at the European Hematology Association (EHA) 2023 Congress, held in Frankfurt, Germany) [10.1097/01.HS9.0000968888.45365.36].

P495: UNLOCKING THE POTENTIAL OF SYNTHETIC PATIENTS FOR ACCELERATING CLINICAL TRIALS: RESULTS OF THE FIRST GIMEMA EXPERIENCE

Piciocchi, Alfonso;Cipriani, Marta;Messina, Monica;Marconi, Giovanni;Arena, Valentina;Fazi, Paola;Vignetti, Marco
2023

European Hematology Association (EHA) 2023 Congress
clinical data; artificial intelligence; acute myeloid leukemia; clinical trial
04 Publication in conference proceedings::04c Conference paper in journal
Files attached to this item
Piciocchi_P495-unlocking_2023.pdf (open access)
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 2.32 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1689338