Estimating the species diversity of an ecological community relies on surveying species abundances, that is, counting the number of units of each species in a sample. Diversity estimators are particularly sensitive to rare species, that is, to low-abundance cases. In microbial studies, rare species, in particular singletons, often represent the vast majority of the specimens in a sample. Many studies hypothesize that these cases are spurious, and various methodological contributions focus on estimating and eliminating the spurious singletons to avoid a gross overestimation of the total diversity of a community. We present a different approach that treats the spurious singletons as the result of false negative errors in the clustering step of RNA sequencing. We demonstrate that the estimate of the total number of species under our scenario is equivalent to the one obtained by discarding the spurious cases. Conversely, diversity as measured by, for example, Shannon’s index can differ considerably. Computing such an index requires estimating all true abundance counts, which appears to be computationally challenging. We therefore propose a likelihood-free Bayesian approach to the problem.
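To make the effect of spurious singletons concrete, the following minimal Python sketch computes observed richness and Shannon's index before and after a toy linkage-error step. It is not the authors' model or their ABC procedure: the error mechanism in `split_off_singletons`, the parameter `miss_prob`, and the example abundance counts are all assumptions invented for illustration.

```python
import math
import random


def shannon_index(abundances):
    """Shannon's index H = -sum(p_i * ln p_i) over classes with positive counts."""
    total = sum(abundances)
    return -sum((n / total) * math.log(n / total) for n in abundances if n > 0)


def split_off_singletons(abundances, miss_prob, rng):
    """Hypothetical error model (an assumption, not taken from the paper):
    each unit independently fails to link to its own species cluster with
    probability miss_prob and is then recorded as a new, spurious singleton."""
    observed = []
    for n in abundances:
        missed = sum(rng.random() < miss_prob for _ in range(n))
        if n - missed > 0:
            observed.append(n - missed)   # units still linked to the true cluster
        observed.extend([1] * missed)     # each missed unit becomes a singleton
    return observed


rng = random.Random(0)                    # fixed seed for reproducibility
true_counts = [500, 300, 150, 40, 10]     # made-up true abundances
observed_counts = split_off_singletons(true_counts, miss_prob=0.05, rng=rng)

print("richness: true =", len(true_counts), "observed =", len(observed_counts))
print("Shannon:  true =", round(shannon_index(true_counts), 3),
      "observed =", round(shannon_index(observed_counts), 3))
```

In this toy setting the total number of units is unchanged, while every missed link adds one apparent species. Shannon's index also moves, because the spurious singletons alter the whole vector of relative abundances rather than only the species count.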

Modeling linkage errors in species diversity estimates: an ABC approach / Di Cecco, D.; Tancredi, A. - (2023). (Paper presented at the Graspa 2023 conference, held in Palermo).

Modeling linkage errors in species diversity estimates: an ABC approach

D. Di Cecco; A. Tancredi
2023

Conference: Graspa 2023
Keywords: microbial diversity; sequencing errors; linkage errors; approximate Bayesian computation
Type: 04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
File: DiCecco_Modeling-linkage_2023.pdf
Access: open access
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 796.88 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1685004