dc.contributor: Universitat de Vic - Universitat Central de Catalunya. Facultat de Ciències i Tecnologia
dc.contributor: Universitat de Vic - Universitat Central de Catalunya. Màster Universitari en Anàlisi de Dades Òmiques
dc.contributor.author: Lorenzo Salazar, José Miguel
dc.date.accessioned: 2016-03-03T10:44:29Z
dc.date.available: 2016-03-03T10:44:29Z
dc.date.created: 2015-09
dc.date.issued: 2015-09
dc.identifier.uri: http://hdl.handle.net/10854/4443
dc.description: Academic year 2014-2015 [ca_ES]
dc.description.abstract: A complete bioinformatics pipeline for Next Generation Sequencing (NGS) analysis has been developed and applied to study the association of called variants with susceptibility to Idiopathic Pulmonary Fibrosis (IPF). The pipeline integrates the Genome Analysis Toolkit (GATK) with state-of-the-art bioinformatics tools such as quality control reporters, aligners, alternative callers (i.e., Platypus), annotators, and auxiliary tools. It executes a sequence of SBash and Bash shell scripts by submitting the programmed jobs to a SLURM queue on a cluster server provided by La Laguna University (ULL). It can also be run on a local Linux machine. We tested the pipeline by calling single nucleotide polymorphisms (SNPs) in targeted NGS data from 192 individuals with IPF, where 16,253 variant sites were identified. The call concordance between the two callers used (GATK and Platypus) was estimated at 77.8% when we compared matching overlapping sites. With these data, an association study following an unmatched case-control design was performed using unrelated European individuals (n=501) from The 1000 Genomes Project as controls. Logistic regression models were applied to the phenotype trait using genotypes from the 10,245 SNPs with call rates >95%, adjusting for five principal components to account for population stratification. Despite the reduced sample size, we identified 38 variants reaching genome-wide significance (p < 5×10⁻⁸), including one previously identified in the promoter region of the MUC5B gene (rs35705950), and several other novel susceptibility variants. [ca_ES]
dc.format: application/pdf
dc.format.extent: 89 p. [ca_ES]
dc.language.iso: eng [ca_ES]
dc.rights: This document is subject to this Creative Commons license [ca_ES]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/ [ca_ES]
dc.subject.other: Pulmonary fibrosis [ca_ES]
dc.subject.other: Bioinformatics [ca_ES]
dc.title: Bioinformatics Pipeline for Next Generation Sequencing Analysis in Association Studies of Idiopathic Pulmonary Fibrosis [ca_ES]
dc.type: info:eu-repo/semantics/masterThesis [ca_ES]
dc.description.version: Supervisor: Carlos Flores Infante
dc.rights.accessRights: info:eu-repo/semantics/openAccess [ca_ES]
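
The abstract reports two summary quantities: the call concordance between two variant callers at overlapping sites, and the per-SNP call rate used as a filter (>95%). The following is only a minimal sketch of how such figures could be computed, not the thesis's actual code. The function names, the dictionary-based genotype representation, and the reading of "concordance" as matching genotypes at sites called by both callers are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical representation: a site is (chromosome, position); a genotype is
# a string such as "0/1", or None when the caller made no call at that site.
Site = Tuple[str, int]
Calls = Dict[Site, Optional[str]]


def concordance(calls_a: Calls, calls_b: Calls) -> float:
    """Fraction of sites called by BOTH callers whose genotypes agree."""
    shared = [
        site
        for site in calls_a.keys() & calls_b.keys()
        if calls_a[site] is not None and calls_b[site] is not None
    ]
    if not shared:
        raise ValueError("no overlapping called sites")
    matching = sum(1 for site in shared if calls_a[site] == calls_b[site])
    return matching / len(shared)


def call_rate(genotypes: Iterable[Optional[str]]) -> float:
    """Fraction of samples with a non-missing genotype at one SNP."""
    genotypes = list(genotypes)
    if not genotypes:
        raise ValueError("no samples")
    return sum(g is not None for g in genotypes) / len(genotypes)


def passes_call_rate(genotypes: Iterable[Optional[str]],
                     threshold: float = 0.95) -> bool:
    """True when the SNP's call rate is strictly above the threshold."""
    return call_rate(genotypes) > threshold
```

With toy inputs, two callers that agree on one of two shared sites give a concordance of 0.5, and a SNP genotyped in three of four samples has a call rate of 0.75 and would be dropped by the >95% filter.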


Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by-nc-nd/3.0/es/