StackPR is a new computational approach for large-scale identification of progesterone receptor antagonists using the stacking strategy / Schaduangrat, N.; Anuwongcharoen, N.; Moni, M. A.; Lio, P.; Charoenkwan, P.; Shoombuatong, W.. - In: SCIENTIFIC REPORTS. - ISSN 2045-2322. - 12:1(2022). [10.1038/s41598-022-20143-5]

StackPR is a new computational approach for large-scale identification of progesterone receptor antagonists using the stacking strategy

Lio P.;
2022

Abstract

Progesterone receptors (PRs) are implicated in various cancers because their presence or absence can determine clinical outcomes. Overstimulation by progesterone can facilitate oncogenesis, so its modulation through PR inhibition is urgently needed. To address this issue, a novel stacked ensemble learning approach (termed StackPR) is presented for fast, accurate, and large-scale identification of PR antagonists using only SMILES notation, without the need for 3D structural information. We employed six popular machine learning (ML) algorithms (i.e., logistic regression, partial least squares, k-nearest neighbor, support vector machine, extremely randomized trees, and random forest) coupled with twelve conventional molecular descriptors to create 72 baseline models. Then, a genetic algorithm in conjunction with the self-assessment-report approach was used to select m of the 72 baseline models as a means of developing the final meta-predictor using the stacking strategy and a tenfold cross-validation test. Experimental results on the independent test dataset show that StackPR achieved an accuracy of 0.966 and a Matthews correlation coefficient of 0.925. In addition, analysis based on the SHapley Additive exPlanation algorithm and molecular docking indicates that aliphatic hydrocarbons and nitrogen-containing substructures were the most important features for PR antagonist activity. Finally, we implemented an online webserver using StackPR, which is freely accessible at http://pmlabstack.pythonanywhere.com/StackPR. StackPR is anticipated to be a powerful computational tool for the large-scale identification of unknown PR antagonist candidates for follow-up experimental validation.
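
The two performance figures quoted in the abstract, accuracy and the Matthews correlation coefficient (MCC), are both derived from the counts of a binary confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' evaluation code; the function names and the toy labels are hypothetical and chosen only for illustration (1 = antagonist, 0 = non-antagonist).

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 by convention if the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical toy labels, not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print("ACC:", accuracy(*counts))
print("MCC:", mcc(*counts))
```

Unlike accuracy, MCC uses all four confusion-matrix cells, which makes it the more informative of the two when the classes are imbalanced.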
Algorithms; Computational Biology; Molecular Docking Simulation; Nitrogen; Progesterone; Receptors, Progesterone; Support Vector Machine
01 Journal publication::01a Journal article
Files attached to this item
File: Schaduangrat_StackPR_2022.pdf
Access: open access
Note: https://www.nature.com/articles/s41598-022-20143-5.pdf
Type: Pre-print document (manuscript submitted to the publisher, prior to peer review)
License: Creative Commons
Size: 4.58 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1721272
Citations
  • PMC 6
  • Scopus 11
  • ISI (Web of Science) 8