Enriching barcoding markers in environmental samples utilizing a phylogenetic probe design: insights from mock communities

Parducci L.
Last author
Conceptualization
2024

Abstract

Hybridization capture is an emerging method that uses short oligonucleotide baits to enrich DNA libraries for genomic fragments of specific organisms, thus enabling detection of their presence in environmental samples. Although it offers a primer-independent alternative to metabarcoding, little empirical work has been dedicated to characterizing the underlying biases and their implications for biological interpretation. Moreover, few published bioinformatic pipelines are available for designing polynucleotide capture baits from a reference sequence collection. We designed RNA baits specifically targeting two chloroplast barcoding genes, matK and rbcL, to reveal the plant taxonomic diversity present in a given environmental sample. Our approach leverages the sensitivity of hybridization capture and the capacity of high-throughput DNA sequencing instruments. It builds on a new and universal method based on ancestral sequence reconstruction, which limits the number of bait probes required and reduces experimental costs while still accessing high levels of taxonomic diversity. Our bait set selectively targets four main plant orders (Fagales, Pinales, Asterales, and Poales), representing ~18% of all described vascular plants. This is achieved with only 4084 baits, each 80 nucleotides in length (80-mer), capturing ~1.0–1.6 k nucleotide sequences from each taxon. Tests on mock communities revealed important factors influencing capture efficiency and relative abundance estimates, including GC content, the overall target length per taxon, bait density, and the mean number of mismatches to the bait sequence. Our results show that hybridization capture, like metabarcoding, requires caution when interpreting results quantitatively within (paleo)-ecological studies. The biases detected in this work could be mitigated with bait designs that avoid extreme base-composition biases and balance bait targets across taxa. However, we strongly recommend the use of mock communities and read simulations to quantify the accuracy of taxonomic representation when using new bait designs.
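
The per-bait quantities named in the abstract (GC content, bait density, and mismatches to the bait sequence) can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which relies on ancestral sequence reconstruction. The function names, the 40-nucleotide tiling step, the use of Hamming distance for mismatches, and the definition of density as bait bases per target base are all assumptions chosen for illustration; only the 80-nucleotide bait length comes from the abstract.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def tile_baits(target: str, bait_len: int = 80, step: int = 40) -> list[str]:
    """Cut a target into overlapping windows of bait_len, advancing by step.

    The step of 40 is an assumed value, not taken from the paper.
    """
    if len(target) < bait_len:
        return []
    last_start = len(target) - bait_len
    return [target[i:i + bait_len] for i in range(0, last_start + 1, step)]


def mismatches(bait: str, fragment: str) -> int:
    """Hamming distance between two equal-length sequences (assumed metric)."""
    if len(bait) != len(fragment):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(bait.upper(), fragment.upper()))


def bait_density(n_baits: int, target_len: int, bait_len: int = 80) -> float:
    """Bait bases per target base (one assumed definition of density)."""
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    return n_baits * bait_len / target_len


if __name__ == "__main__":
    # Hypothetical 200-nt target; real targets would be matK or rbcL references.
    target = "ACGT" * 50
    baits = tile_baits(target)
    for bait in baits:
        print(len(bait), round(gc_content(bait), 2))
    print("density:", bait_density(len(baits), len(target)))
```

Comparing these values across taxa in a mock community is one simple way to check whether a bait design is balanced before any sequencing is done.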
capture bias; DNA barcoding; hybridization capture; shotgun metagenomics; target capture; target enrichment
01 Journal publication::01a Journal article
Enriching barcoding markers in environmental samples utilizing a phylogenetic probe design: insights from mock communities / Nota, K.; Orlando, L.; Marchesini, A.; Girardi, M.; Bertilsson, S.; Vernesi, C.; Parducci, L.. - In: ENVIRONMENTAL DNA. - ISSN 2637-4943. - 6:4(2024). [10.1002/edn3.593]
Files attached to this item

File: Nota_Enriching-barcoding-markers_2024.pdf
Access: open access
Note: Journal article
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 2.17 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1720200