dc.contributor | Universitat de Vic - Universitat Central de Catalunya. Màster Universitari en Anàlisi de Dades Òmiques | |
dc.contributor | Universitat de Vic - Universitat Central de Catalunya. Facultat de Ciències i Tecnologia | |
dc.contributor.author | González García, Jorge | |
dc.date.accessioned | 2024-01-31T12:44:03Z | |
dc.date.available | 2024-01-31T12:44:03Z | |
dc.date.created | 2023-09-10 | |
dc.date.issued | 2023-09-10 | |
dc.identifier.uri | http://hdl.handle.net/10854/7719 | |
dc.description | Academic year 2022-2023 | es |
dc.description.abstract | Over the past decade, increased computational capabilities have enabled us to address biological questions using data-driven methods, particularly where traditional techniques have been limiting. We hypothesize that these computer-based methods can be used to predict enzyme activation status. To test this hypothesis, we selected a benchmark protein, human HRAS, and sourced a comprehensive set of experimentally labelled structures from available databases. Seven physical, computationally inexpensive features were extracted from these structures at the amino acid alpha carbon level and aligned to the canonical sequence to convey their metrics locally. Subsequently, three-dimensional tensors were generated from the set of all possible combinations of the obtained features. A random forest model was then trained on t-SNE-preprocessed tensors to find the highest-performing combination of features. Our results strongly suggest that activation status in human HRAS, and probably in other proteins, is mainly encoded in the electrostatic and van der Waals forces, with solvation forces playing a lesser role. These forces, when processed through machine learning models, offer substantial predictive capability. In contrast, methods based on the physical three-dimensional position of residues, such as coordinate-based data and pairwise Root Mean Square Deviation (RMSD), were not independently effective in distinguishing activation states. | es |
dc.format | application/pdf | es |
dc.format.extent | 41 p. | es |
dc.language.iso | eng | es |
dc.rights | This document is subject to this Creative Commons license | es |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/deed.ca | es |
dc.subject.other | Proteins -- Research | es |
dc.title | HRAS physical feature analysis: Predicting protein activation through a random forest classifier | es |
dc.type | info:eu-repo/semantics/masterThesis | es |
dc.description.version | Academic tutor: Jordi Villà i Freixa. | |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |