SVModeler: Simulation of human haplotypes containing synthetic structural variants
Other authors
Publication date
2024-09-19
Abstract
Motivation: Long-read sequencing overcomes the limits of short reads by providing longer sequences that can span
repetitive regions, improving the detection of structural variants (SVs) and accurately resolving their sequences. Nonetheless,
detection and annotation of structural variants remain a computational challenge, requiring active development and
benchmarking of available algorithms. The current availability of detailed sequence information for large collections of
SVs identified using long-read sequencing presents a valuable opportunity for developing and training realistic novel
simulation frameworks, which can be used for the evaluation of SV callers.
Results: SVModeler is a newly developed computational tool to simulate synthetic human haplotypes containing embedded
SVs. A unique feature of SVModeler is its capability to leverage SV catalogs to model the genome-wide distribution,
frequency and sequence features of various SV classes, including tandem duplications, mobile elements and variable
number tandem repeats. As a proof of principle, SVModeler has been trained with a large catalog of polymorphic SVs
identified in a dataset comprising 1,019 samples from the 1000 Genomes Project, which represents the largest collection
of diverse humans sequenced with long reads to date.
Code availability: https://github.com/ismaelveramu/SVModeler
Contact: ismael.vera@uvic.cat
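Illustrative sketch: the idea of embedding synthetic SVs into a haplotype, as described under Results, can be shown with a toy example. This is not SVModeler's code or interface. The function names (sample_svs, embed_svs), the three SV classes, the class weights, and the uniform placement of one SV per window are all assumptions made for illustration; the real tool models distribution, frequency and sequence features from an SV catalog.

```python
import random


def sample_svs(ref_len, n, class_weights, rng, max_len=50):
    """Draw n non-overlapping SVs, one per equal-sized window of the reference.

    class_weights is a hypothetical stand-in for class frequencies that a
    real framework would estimate from an SV catalog.
    """
    window = ref_len // n
    if window <= max_len:
        raise ValueError("reference too short for the requested number of SVs")
    kinds = list(class_weights)
    weights = [class_weights[k] for k in kinds]
    svs = []
    for i in range(n):
        # Keep the SV fully inside window i so that events never overlap.
        pos = i * window + rng.randrange(window - max_len)
        kind = rng.choices(kinds, weights=weights)[0]
        length = rng.randint(1, max_len)
        if kind == "INS":
            payload = "".join(rng.choice("ACGT") for _ in range(length))
        else:
            payload = length
        svs.append((pos, kind, payload))
    return svs


def embed_svs(reference, svs):
    """Apply non-overlapping SVs to a reference string and return the haplotype.

    Each SV is (pos, kind, payload):
      "DEL": payload is the number of bases removed starting at pos
      "INS": payload is the sequence inserted before pos
      "DUP": payload is the length of reference[pos:pos+payload] to duplicate in tandem
    """
    out = []
    cursor = 0
    for pos, kind, payload in sorted(svs, key=lambda sv: sv[0]):
        if pos < cursor:
            raise ValueError("overlapping SVs are not supported in this sketch")
        out.append(reference[cursor:pos])
        if kind == "DEL":
            cursor = pos + payload
        elif kind == "INS":
            out.append(payload)
            cursor = pos
        elif kind == "DUP":
            out.append(reference[pos:pos + payload] * 2)
            cursor = pos + payload
        else:
            raise ValueError(f"unknown SV class: {kind}")
    out.append(reference[cursor:])
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(0)
    reference = "".join(rng.choice("ACGT") for _ in range(1000))
    svs = sample_svs(len(reference), 5, {"DEL": 0.4, "INS": 0.4, "DUP": 0.2}, rng)
    haplotype = embed_svs(reference, svs)
    print(len(reference), len(haplotype), svs)
```

Because the list of applied SVs is returned alongside the haplotype, it can serve as a truth set when reads simulated from the haplotype are passed to an SV caller.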
Document type
Master's thesis
Language
English
Keywords
Genomics
Bioinformatics
Mutation (Biology)
Pages
10 p.
Note
Academic year 2023-2024
Rights
This document is subject to this Creative Commons license
Except where otherwise noted, the item's license is described as https://creativecommons.org/licenses/by-nc-nd/4.0/deed.ca