The study of rare variants in next generation sequencing (NGS) experiments enables the detection of causative mutations in the human genome. NGS is a relatively new approach in biomedical research, useful for the genetic diagnosis of extremely heterogeneous conditions. Nevertheless, only a few publications address rare variant calling in pooled experiments, and existing tools are often inaccurate. We describe how data generated by high-throughput NGS experiments are aligned and filtered, covering the de facto standard techniques and their organization in the pre-processing phase. We show how to detect rare single nucleotide polymorphisms by filtering and by constructing the features employed in the learning phase. We then focus on a supervised learning approach to obtain new knowledge about genomic variations in human diseases, and we compare different computational procedures to identify and classify these variants.
Supervised classification for rare variant calling in next generation sequencing pooled experiments / Guarracino, Mario Rosario; Ferraro, Maria Brigida. - (2013), pp. 65-65. (Paper presented at the 6th International Conference of the ERCIM (European Research Consortium for Informatics and Mathematics) Working Group on Computational and Methodological Statistics (ERCIM 2013), held in London, 14-16 December 2013).
Supervised classification for rare variant calling in next generation sequencing pooled experiments
FERRARO, MARIA BRIGIDA
2013