PMScanR: An R Package for the Large-Scale Identification, Analysis, and Visualization of Protein Motifs / Jastrzebski, Jan P.; Gawronska, Monika; Babis, Wiktor; Quaranta, Miriana; Czopek, Damian. - In: JOURNAL OF COMPUTATIONAL BIOLOGY. - ISSN 1066-5277. - (2026). [10.1177/15578666261423966]

PMScanR: An R Package for the Large-Scale Identification, Analysis, and Visualization of Protein Motifs

Miriana Quaranta
2026

Abstract

Proteins play a crucial role in biological processes, and their functions are closely tied to their structure. Protein functions are often associated with specific motifs: short amino acid sequences that follow particular patterns. Most bioinformatics tools focus on identifying known motifs and cannot analyze the impact of single substitutions on entire domains or motifs. Existing tools also do not support comparative analysis of multiple motifs across multiple sequences. To address these limitations, we developed PMScanR (Protein Motif Scanner in R), an R package that automates the prediction and evaluation of the impact of single amino acid substitutions on protein motif occurrence in large datasets and supports comparative analysis of multiple motifs across multiple sequences. The package integrates several methods for motif identification, characterization, and visualization. It includes functions for running PS-Scan, a tool for scanning sequences against the PROSITE database. PMScanR also converts results to GFF format, which simplifies downstream steps such as graphical representation and database integration. The package offers multiple visualization tools, including occurrence plots, sequence logos, and pie charts, that help reveal motif distribution and conservation. Through its integration with PROSITE, PMScanR provides access to up-to-date motif data, making it useful for biological and biomedical research, particularly in protein function annotation and therapeutic target identification. PMScanR is freely available under the GPL license and is distributed through Bioconductor (https://bioconductor.org/packages/PMScanR) and GitHub (https://github.com/prodakt/PMScanR).
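The idea of evaluating how a single amino acid substitution changes motif occurrence can be illustrated with a small, self-contained sketch. It is written in plain Python and is not PMScanR's API or implementation (the package delegates scanning to PS-Scan); the function names `prosite_to_regex`, `find_motifs`, and `substitution_effect` are hypothetical, the toy sequence is invented, and only simple PROSITE pattern syntax is handled. The pattern used is the PROSITE N-glycosylation site, `N-{P}-[ST]-{P}`.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern into a Python regular expression.

    Handles x (any residue), [..] (allowed), {..} (forbidden),
    (n) / (n,m) repeats, and the < and > terminal anchors.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        anchor_start = element.startswith("<")
        anchor_end = element.endswith(">")
        element = element.strip("<>")
        # Split off an optional repetition such as x(2) or x(2,4).
        core, _, repeat = element.partition("(")
        if core == "x":
            core = "."
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"
        if repeat:
            core += "{" + repeat.rstrip(")") + "}"
        parts.append(("^" if anchor_start else "") + core + ("$" if anchor_end else ""))
    return "".join(parts)


def find_motifs(sequence: str, pattern: str) -> list:
    """Return 1-based (start, end) positions of all matches, overlaps included."""
    # A lookahead lets finditer report overlapping occurrences.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.start() + len(m.group(1)))
            for m in regex.finditer(sequence)]


def substitution_effect(sequence: str, position: int, new_residue: str,
                        pattern: str) -> dict:
    """Compare motif hits before and after one substitution (1-based position)."""
    mutated = sequence[:position - 1] + new_residue + sequence[position:]
    before = set(find_motifs(sequence, pattern))
    after = set(find_motifs(mutated, pattern))
    return {"lost": sorted(before - after), "gained": sorted(after - before)}


N_GLYCOSYLATION = "N-{P}-[ST]-{P}"  # PROSITE PS00001
toy_sequence = "MKNGSAQ"            # invented example sequence

print(find_motifs(toy_sequence, N_GLYCOSYLATION))
# [(3, 6)]
print(substitution_effect(toy_sequence, 5, "A", N_GLYCOSYLATION))
# {'lost': [(3, 6)], 'gained': []}
```

Replacing the serine at position 5 with alanine removes the only residue that satisfies `[ST]`, so the single motif hit is lost. Repeating the comparison over every position and every alternative residue would give a per-sequence map of motif gains and losses, which is the kind of question the abstract describes.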
PROSITE; R library; protein motifs; visualization
01 Journal publication::01a Journal article
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1761290