SVModeler: Simulation of human haplotypes containing synthetic structural variants
Publication date
2024-09-19
Abstract
Motivation: Long-read sequencing overcomes the limits of short reads by providing longer sequences that can span
repetitive regions, improving the detection of structural variants (SVs) and accurately resolving their sequences. Nonetheless,
detection and annotation of SVs remain a computational challenge, requiring active development and
benchmarking of available algorithms. The current availability of detailed sequence information for large collections of
SVs identified using long-read sequencing presents a valuable opportunity for developing and training novel, realistic
simulation frameworks, which can be used to evaluate SV callers.
Results: SVModeler is a newly developed computational tool to simulate synthetic human haplotypes containing embedded
SVs. A unique feature of SVModeler is its capability to leverage SV catalogs to model the genome-wide distribution,
frequency and sequence features of various SV classes, including tandem duplications, mobile elements and variable
number tandem repeats. As a proof of principle, SVModeler has been trained with a large catalog of polymorphic SVs
identified in a dataset comprising 1,019 samples from the 1000 Genomes Project, which represents the largest collection
of diverse humans sequenced with long reads to date.
Code availability: https://github.com/ismaelveramu/SVModeler
Contact: ismael.vera@uvic.cat
Document type
Master's thesis
Language
English
Keywords
Genomics
Bioinformatics
Mutation (Biology)
Pages
10 p.
Note
Academic year 2023-2024
Rights
This document is subject to this Creative Commons license
Unless otherwise indicated, the item's license is described as https://creativecommons.org/licenses/by-nc-nd/4.0/deed.ca