# PhD Thesis - Lucía Prieto Santamaría (2023)

This repository brings together the resources, materials and code generated for and in relation to the doctoral thesis of [Lucía Prieto Santamaría](https://luciaprietosantamaria.es/).

The title of the thesis is: 

---

> Creation, integration, and analysis of disease networks towards a better disease understanding and drug repurposing

---

<br />

The thesis was supervised by [Alejandro Rodríguez González](https://www.alejandrorg.com/) and [Yuliana Pérez Gallardo](https://scholar.google.com/citations?user=1bF8dTwAAAAJ&hl=en). It was carried out within the doctoral program [PhD in Software, Systems and Computing](https://dssc.fi.upm.es/) at the Escuela Técnica Superior de Ingenieros Informáticos ([ETSIInf](https://www.etsiinf.upm.es/)) of the Universidad Politécnica de Madrid ([UPM](https://www.upm.es/)), Spain.

The thesis was developed within the scope of the [DISNET project](https://disnet.ctb.upm.es/), carried out at the Medical Data Analytics Laboratory ([MEDAL](https://medal.ctb.upm.es/)) of the Center for Biomedical Technology ([CTB](http://www.ctb.upm.es/)). It is the result of an industrial doctorate in conjunction with [Ezeris Networks Global Services, S.L.](http://www.ezeris.com/).

<br />

The thesis is presented as a compendium of 3 publications:

* [Publication I](https://medal.ctb.upm.es/internal/gitlab/lprieto/phd-thesis-luciaprietosantamaria/tree/master/Publication%20I): Classifying diseases by using biological features to identify potential nosological models.

* [Publication II](https://medal.ctb.upm.es/internal/gitlab/lprieto/phd-thesis-luciaprietosantamaria/tree/master/Publication%20II): A data-driven methodology towards evaluating the potential of drug repurposing hypotheses.

* [Publication III](https://medal.ctb.upm.es/internal/gitlab/lprieto/phd-thesis-luciaprietosantamaria/tree/master/Publication%20III): Integrating heterogeneous data to facilitate COVID-19 drug repurposing.

<br />

|Abstract (en)| 
|:-| 
|<br />Developing a drug for a specific condition is a remarkably costly task in terms of money, time, and risk. An alternative approach to this lengthy process is drug repurposing, which tries to identify other uses for drugs that already exist. It addresses the problem by using known drugs to treat diseases other than the ones they were developed for. This way, some phases of drug development can be skipped, making the process more efficient and reducing the investment. Although repurposing initially occurred by chance, nowadays it can be targeted. Some of the most promising strategies derive from data-driven methodologies. In this context, one of the emerging paradigms to structure enormous amounts of biomedical data comes with the so-called “network medicine”. This field abandons the individual study of each disease, integrating large-scale and heterogeneous data in the form of graphs to achieve a better understanding of how diseases work and how they are connected. In particular, a disease network is a complex network in which the nodes are the different diseases or disorders, while the edges represent the relationships among them. The first human disease network was based on disease-causing genes, but other networks have been designed around different factors such as metabolic pathways, drugs, or symptoms, among others.<br /><br />The DISNET project builds on the ideas and concepts of disease networks, with the ultimate goal of drug repurposing. The present doctoral thesis has been developed within the scope of this project, pursuing the general objective of obtaining and integrating biomedical knowledge from public sources to create disease networks that enable a better disease understanding and, ultimately, enhance drug repurposing. Within this thesis and the DISNET project, a large-scale multi-layered heterogeneous biomedical knowledge base around the concept of disease has been built. The data has been obtained from publicly accessible sources, both structured and unstructured. This information has been integrated and organized in three different levels: the phenotypic (with information regarding diseases and their associated symptoms), the biological (which stores molecular-level data related to diseases, including genes, proteins, metabolic pathways, genetic variants, non-coding RNAs and so on) and the pharmacological (containing information on drugs, their interactions, and their connections to diseases).<br /><br />The two main lines into which the present thesis has delved are disease understanding and drug repurposing. On the one hand, regarding disease understanding, a set of new arrangements of disease groups has been proposed via clustering techniques. These groups can be thought of as novel nosological models that integrate molecular information, in contrast with traditional taxonomies, which rely mostly on phenotypic data alone. On the other hand, with respect to drug repurposing, two complementary methodologies to repurpose drugs have been put forward. Differences between data related to known successful repurposing cases and non-repurposing data have been identified, and analyses within the genes, symptoms and categories have been performed to uncover patterns. Threshold values in the association scores between diseases and different features have been pinpointed in order to evaluate the potential of new repurposing hypotheses. Moreover, a straightforward methodology consisting of five information paths and based on gene and symptom relationships has been developed to suggest repurposing candidates to treat COVID-19. A list of 13 drugs was obtained.<br /><br />This doctoral thesis has followed the structure of a compendium of publications: it comprises three articles published in scientific journals with a high impact factor. The publications have contributed to accomplishing the different research objectives and have been developed under the same thematic unit.<br />|
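
<br />

The abstract describes a disease network as a graph whose nodes are diseases and whose edges represent relationships among them, such as shared disease-causing genes. The following is a minimal Python sketch of that idea only. It is not code from the thesis or from the DISNET project: the function name `build_disease_network` and all disease and gene identifiers are hypothetical placeholders, and the edge rule (at least one shared gene) is an assumption chosen for illustration.

```python
from itertools import combinations


def build_disease_network(disease_genes):
    """Link every pair of diseases that share at least one associated gene.

    disease_genes maps a disease identifier to a set of gene identifiers.
    Returns a dict mapping (disease_a, disease_b) to the set of shared genes.
    """
    edges = {}
    # Sort the identifiers so each undirected edge gets one canonical key.
    for a, b in combinations(sorted(disease_genes), 2):
        shared = disease_genes[a] & disease_genes[b]
        if shared:
            edges[(a, b)] = shared
    return edges


# Placeholder identifiers only; these are not real disease-gene associations.
toy_associations = {
    "disease_A": {"GENE1", "GENE2"},
    "disease_B": {"GENE2", "GENE3"},
    "disease_C": {"GENE4"},
}

network = build_disease_network(toy_associations)
print(network)
```

In this toy input, only `disease_A` and `disease_B` are connected, because they are the only pair with a gene in common. The same construction applies to other feature types mentioned in the abstract (symptoms, pathways, drugs) by swapping the sets being intersected.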