HPC-T-Annotator: an HPC tool for de novo transcriptome assembly annotation / Arcioni, L.; Arcieri, M.; Di Martino, J.; Liberati, F.; Bottoni, P.; Castrignano, T. - In: BMC BIOINFORMATICS. - ISSN 1471-2105. - 25:1(2024). [10.1186/s12859-024-05887-3]

HPC-T-Annotator: an HPC tool for de novo transcriptome assembly annotation

Arcioni L.; Arcieri M.; Bottoni P.
2024

Abstract

Background: The availability of transcriptomic data for species without a reference genome enables the construction of de novo transcriptome assemblies from RNA-Seq data as alternative reference resources. A transcriptome provides direct information about a species’ protein-coding genes under specific experimental conditions. The de novo assembly process produces a unigenes file in FASTA format, which is then the target of annotation. Homology-based annotation, which infers the function of sequences by estimating their similarity to sequences in a reference database, is computationally demanding.

Results: To mitigate this computational burden, we introduce HPC-T-Annotator, a tool for homology annotation of de novo transcriptomes on high-performance computing (HPC) infrastructures, designed for straightforward configuration via a Web interface. Once the configuration data are provided, the entire parallel computing software for annotation is generated automatically and can be launched on a supercomputer from a simple command line. The output data can then be viewed using post-processing utilities, provided as Python notebooks integrated in the proposed software.

Conclusions: HPC-T-Annotator expedites homology-based annotation of de novo transcriptome assemblies. Its efficient parallelization strategy on HPC infrastructures significantly reduces computational load and execution times, enabling large-scale transcriptome analysis and comparison projects, while its intuitive graphical interface extends accessibility to users without IT skills.
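The abstract does not describe how the annotation workload is divided, so the following is only a minimal sketch of one common data-parallel approach: parse the unigenes FASTA file and distribute its records over a fixed number of chunks of similar total sequence length, so that each chunk could be annotated by an independent job. The function names `parse_fasta` and `split_balanced`, the greedy longest-first balancing, and the chunk count are illustrative assumptions and are not taken from HPC-T-Annotator itself.

```python
from typing import Iterable, Iterator, List, Tuple

# A record is (header without the leading '>', concatenated sequence).
Record = Tuple[str, str]


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    parts: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header = line[1:]
            parts = []
        elif header is not None:
            parts.append(line)  # sequence line of the current record
    if header is not None:
        yield header, "".join(parts)


def split_balanced(records: Iterable[Record], n_chunks: int) -> List[List[Record]]:
    """Hypothetical helper: assign records to n_chunks, longest first,
    always adding to the chunk with the smallest total length so far."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    chunks: List[List[Record]] = [[] for _ in range(n_chunks)]
    loads = [0] * n_chunks
    for rec in sorted(records, key=lambda r: len(r[1]), reverse=True):
        target = loads.index(min(loads))
        chunks[target].append(rec)
        loads[target] += len(rec[1])
    return chunks


if __name__ == "__main__":
    example = [">u1 sample", "ACGT", "AC", ">u2", "GG", ">u3", "TTTTTTTT"]
    for i, chunk in enumerate(split_balanced(parse_fasta(example), 2)):
        print(i, [header for header, _ in chunk])
```

Balancing by total sequence length rather than by record count is a design choice made here on the assumption that alignment time grows with query length; a real deployment would write each chunk to its own FASTA file and submit one scheduler job per chunk.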
Bioinformatics; Data-parallelism algorithm; High performance computing; Transcript annotation
01 Journal publication::01a Journal article
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1745401
Citations
  • PubMed Central (PMC) 4
  • Scopus 4
  • Web of Science (ISI) 3